Solving Sudoku With Ant Colony Optimization: IEEE Transactions On Games September 2019
All content following this page was uploaded by Martyn Amos on 11 November 2020.
H. Lloyd is with the Department of Computing and Mathematics, Manchester Metropolitan University, Manchester, UK (email: [email protected]).
M. Amos is with the Department of Computer and Information Sciences, Northumbria University, UK (email: [email protected]).

I. INTRODUCTION

Sudoku is a well-known logic-based puzzle game that was first published in 1979 under the name of “Number Place”. It was popularised in Japan in 1984 by the puzzle company Nikoli, and later named “Sudoku”, which roughly translates to “single digits”. The puzzle gained attention in the West in 2004, after The Times published its first Sudoku grid at the instigation of Hong Kong-based judge Wayne Gould, who first encountered the puzzle in 1997, and developed a computer program to automatically generate instances. Sudoku is now a global phenomenon, and many newspapers now carry it alongside their existing crosswords (see [1] for a general history of the puzzle).

The simplest variant of Sudoku uses a 9×9 grid of cells divided into nine 3×3 subgrids (Figure 1 (left)). As we later demonstrate, the problem scales to larger grids, but, for the moment, we focus on the most familiar variant. The aim of the puzzle is to fill the grid with digits such that each row, each column, and each 3×3 subgrid contains all of the digits 1-9 (Figure 1 (right)). An instance of Sudoku provides, at the outset, a partially-completed grid, but the difficulty of any grid derives more from the range of techniques required to solve it than the number of cell values that are provided for the player.

Formally, a Sudoku problem of order n = 3 is made up of a grid of cells (or squares), arranged into 3×3 subgrids known as boxes. A unit is a row, column or box, each containing exactly nine cells. A problem is solved when each unit (that is, every row, column and box) contains a permutation of the digits 1 ... 9 [2].

Any given cell has exactly three units and 20 peers; the units are the row, column and box in which the cell resides, and the set of peers is made up of the other cells in those units (that is, 2×8=16 neighbours in the relevant row and column, plus 4 other cells occupying the same box; see Figure 2).

Sudoku is an NP-complete problem [3], as first shown in [4] via a reduction from the Latin Square Completion problem [5]. As such, the problem offers itself as a useful benchmark challenge, and a number of different types of algorithm have been proposed for its solution (see the next Section for a more detailed discussion of these). However, we also consider the argument that “We should develop AI methods that work with not just one game, but with any game (within a given range) that the method is applied to” [6]. That is, rather than developing a multitude of algorithms to play one specific game, we should seek methods that find broader applicability, across a range of games. Although the algorithm we present here is demonstrated in the context of Sudoku, we later show how its lack of reliance on any heuristic information (that is, game-specific “hints”) means that it may be applied to a number of different puzzle games.

While such puzzle games may, superficially, appear to lack “real world” relevance, they in fact offer a significant challenge for general-purpose AI methods; as argued in [6], “We need game playing benchmarks and competitions capable of expressing any kind of game, including puzzle games, 2D arcade games, text adventures, 3D action-adventures and so on; this is the best way to test general AI capacities and reasoning skills.” While our algorithm could not be described as “general purpose”, this does serve to underscore the importance of the puzzle game domain.

The rest of the paper is structured as follows: in Section II we briefly review closely-related recent work on the application of various algorithms to Sudoku. This motivates the description, in Section III of our own method, based on Ant Colony Optimization (ACO), which introduces a novel operator which we call Best Value Evaporation. In Section IV we present the results of experimental investigations, which confirm (1) that our algorithm out-performs existing methods, and (2) that BVE is a necessary addition to the basic ACO
2475-1502 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: Northumbria University Library. Downloaded on May 27,2020 at 10:42:37 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TG.2019.2942773, IEEE
Transactions on Games
IEEE TRANSACTIONS ON GAMES 2
algorithm for solving large Sudoku instances. We conclude in Section V with a discussion of our findings, and discuss possible future work in this area.

Fig. 2. Units and peers for a specific highlighted cell. The units (from right to left: column, row, and block) are highlighted in white. The union of the three units, that is, all the white cells, forms the set of peers.

II. RELATED WORK

We first consider a “traditional” backtracking approach to solving Sudoku. The Exact Cover Problem [7] is a type of constraint satisfaction problem which may be phrased as follows: given a binary matrix, find a subset of rows in which each column sums to 1 (that is, find a set of rows in which each column contains only a single 1). In [8], Knuth describes the “dancing links” implementation of his Algorithm X (called DLX), a “brute force” backtracking algorithm for Exact Cover. As any Sudoku puzzle may be transformed into an instance of Exact Cover [9], DLX naturally offers an effective solution method for Sudoku [10].

In [2], Peter Norvig presents an alternative approach, based on constraint propagation followed by a search process (we discuss this in more detail shortly). Other notable approaches to solving Sudoku include formal logic [11], an artificial bee colony algorithm [12], constraint programming [13], [14], evolutionary algorithms [15], [16], [17], [18], particle swarm optimisation [19], [20], simulated annealing [21], tabu search [22], and entropy minimization [23]. As this diverse set of solution methods demonstrates, Sudoku offers a challenging yet conceptually simple test-bed for the comparative analysis of algorithms for problems involving complex reasoning.

In this paper, we focus on the application of ACO to the solution of Sudoku. ACO is a population-based search method inspired by the foraging behaviour of ants [24], [25], and it has been successfully applied to a wide range of computational problems (see [26], [27] for overviews of both the algorithm and its applications). The basic ACO algorithm uses a population of “ants” (agents), which individually explore a given problem space and incrementally construct a solution, combined with a global “pheromone” data structure, which is used to inform decisions taken by the ants. Essentially, each ant moves individually on some problem representation (for example, a graph), gradually building a solution and probabilistically choosing its next move according to pheromone concentrations. Components with more pheromone are more likely to be selected. After a single population iteration, the best solution according to some objective function is selected from the population, and the components it contains (e.g., edges in the graph) are given additional pheromone. In this way, the population rapidly converges on high-quality solutions, although premature or sub-optimal convergence is discouraged through the continuous “evaporation” of pheromone concentrations. Some ACO variants include local pheromone operators, which allow individual ants to record information about their traversal during the solution construction process; for example, ants may reduce the global pheromone value associated with components as they are added to a solution, to discourage following ants from taking the same path.

The archetypal ACO algorithm was named “Ant system” [25], and this was applied to the well-known Travelling Salesman Problem as follows: each edge connecting two cities has a pheromone value, and the probability of an edge being selected by an ant is a function of both its pheromone concentration and its distance from the ant’s current location. This process thus combines the autocatalytic power of the global pheromone network with a greedy local search heuristic. Each ant also maintains a “tabu” list of cities that it has visited, and an ant may not re-visit any city on its list. Once it has visited all cities, an ant then deposits an amount of global pheromone which is inversely proportional to the length of its tour; that is, shorter tours deposit more pheromone. Once all ants have completed this process, the global pheromone matrix is evaporated, thus gradually removing the remnants of sub-optimal tours that persist over time. Dorigo et al. [25] demonstrate that positive feedback, combined with local search, can offer a heuristic that is robust, versatile, broadly applicable, and amenable to parallelization, because of its inherent population-based structure. Since the publication of the original paper, ACO has become a well-established method [28].

In [29], Mantere presents a hybrid ACO/genetic algorithm approach to Sudoku, which combines global (evolutionary) search with greedy local (ACO-based) search. Schiff [30] and Sabuncu [31] also present relatively recent work on applying ACO to Sudoku, but, in both cases, the performance of the algorithm is relatively poor. Another nature-inspired approach was used by [12], who used a variant of the artificial
bee colony algorithm to solve 9 × 9 Sudoku puzzles. The algorithm was able to solve some difficult instances (such as the AIEscargot instance [32]) but the runtime performance is relatively poor with an average solution time of over 6 minutes for difficult instances.

For the purposes of comparison, in this paper we focus mainly on the work of Musliu et al. [33], who present an iterated local search algorithm with constraint programming which represents the state-of-the-art in stochastic search algorithms for the Sudoku problem, plus the algorithms of Knuth [8] and Norvig [2].

[...]

1) Eliminate from a cell’s value set all values that are fixed in any of the cell’s peers.
2) If any values in a cell’s value set are in the only possible place in any of the cell’s units, then fix that value.

Note that since this can lead to other cells having their values fixed, the procedure is recursive, and terminates when no further changes are possible.

In Figure 3 we show the instance from Figure 1 after the initial pass of our CP algorithm, which occurs when the board is set up, and before any search is performed. For easy cases, the application of the CP algorithm is often sufficient to solve the board, and no further search is required (see Section IV for a discussion). However, in most cases, some search will be required, and we now describe our ACO-based method for this.
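As an illustration of the unit and peer definitions from Section I and of the two propagation rules listed above, the following Python sketch builds the units and peers of a 9 × 9 grid and applies both rules recursively. It is a minimal sketch under stated assumptions, not the authors' code: the names `build_units_and_peers`, `fix`, `eliminate` and `propagate` are hypothetical, the puzzle is assumed to be given as an 81-character string with `0` or `.` for blank cells, and the recursive eliminate-and-fix organisation is borrowed from the style of Norvig's solver [2].

```python
from itertools import product

N = 3          # order of the puzzle (boxes are N x N)
D = N * N      # side length of the grid and number of digits


def build_units_and_peers():
    """Return (units, peers) dictionaries keyed by (row, column)."""
    cells = list(product(range(D), range(D)))
    rows = [[(r, c) for c in range(D)] for r in range(D)]
    cols = [[(r, c) for r in range(D)] for c in range(D)]
    boxes = [[(br + r, bc + c) for r in range(N) for c in range(N)]
             for br in range(0, D, N) for bc in range(0, D, N)]
    units = {cell: [u for u in rows + cols + boxes if cell in u]
             for cell in cells}
    # Peers: every other cell sharing a row, column or box with the cell.
    peers = {cell: set().union(*units[cell]) - {cell} for cell in cells}
    return units, peers


UNITS, PEERS = build_units_and_peers()


def fix(values, cell, digit):
    """Fix `digit` in `cell` by eliminating every other candidate."""
    others = values[cell] - {digit}
    return all(eliminate(values, cell, d) for d in others)


def eliminate(values, cell, digit):
    """Remove `digit` from a value set; return False on a contradiction."""
    if digit not in values[cell]:
        return True                       # already removed
    values[cell] = values[cell] - {digit}
    if not values[cell]:
        return False                      # empty value set
    if len(values[cell]) == 1:
        # Rule 1: the cell is now fixed, so remove its value from all peers.
        (last,) = values[cell]
        if not all(eliminate(values, p, last) for p in PEERS[cell]):
            return False
    # Rule 2: if a unit has a single place left for `digit`, fix it there.
    for unit in UNITS[cell]:
        places = [c for c in unit if digit in values[c]]
        if not places:
            return False
        if len(places) == 1 and not fix(values, places[0], digit):
            return False
    return True


def propagate(puzzle):
    """Apply both rules to every given; return value sets or None."""
    values = {cell: set(range(1, D + 1)) for cell in PEERS}
    for cell, ch in zip(sorted(PEERS), puzzle):
        if ch.isdigit() and ch != "0" and not fix(values, cell, int(ch)):
            return None
    return values
```

A board is solved by propagation alone exactly when every value set returned by `propagate` has size one; otherwise the remaining open cells are left to the search.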
Algorithm 1: Our ACO algorithm for Sudoku
 1  read in puzzle;
 2  for all cells with fixed values do
 3      propagate constraints (according to Section III-A);
 4  end
 5  initialize global pheromone matrix;
 6  while puzzle is not solved do
 7      give each ant a local copy of puzzle;
 8      assign each ant to a different cell;
 9      for number of cells do
10          for each ant do
11              if current cell value not fixed then
12                  choose value from current cell’s value set;
13                  fix cell value;
14                  propagate constraints;
15                  update local pheromone;
16              end
17              move to next cell;
18          end
19      end
20      find best ant;
21      do global pheromone update;
22      do best value evaporation;
23  end

[...]

After the cell’s value is set, the standard ACS local pheromone operator is applied, which reduces the probability of that value being selected by the following ant, thus preventing early convergence.

Once all ants have covered every square of the board, we then perform the global pheromone update, which rewards only the best solution found so far (the global best, in line with ACS principles). We characterise the “best” solution, at each iteration, as the sequence of value selections that lead to the greatest number of cells having their values fixed; the best solution is effectively the one found by the ant that “guesses” correctly the highest number of times. However, at this point, we introduce a novel variation to the standard ACS algorithm, which we call best value evaporation (BVE). In what follows, “best value” refers to an amount of pheromone that is added to the global pheromone matrix whenever the best solution is identified within a generation, and this value is itself subject to evaporation, along with the component pheromone values.

In standard ACS, the global pheromone operator increases the pheromone concentrations of all components of the global best solution with an amount of pheromone that is directly proportional to the absolute quality of that solution. However, this can gradually lead to stagnation, where all ants end up selecting the same route. Instead, the amount of pheromone that is added globally, which we call the best value, is measured in terms of the proportionate quality of the best solution found so far (Equation 5). Importantly, the best value itself is subject to evaporation over time, which prevents “lock in”; taken together, these two components of BVE prevent premature stagnation, which is confirmed by our later experimental observations.

We give a pseudo-code description of our approach in Algorithm 1, components of which we now formally specify.

Line 5: For a Sudoku puzzle of dimension d we define a two-dimensional global pheromone matrix, τ, in which each element is denoted as τ_i^k, where i is the cell index (1 ≤ i ≤ d^2) and k is a possible value for the cell (k ∈ [1, d]). τ_i^k represents the pheromone level associated with value k in cell i. Each element of the matrix is initialised to some fixed value, τ_0 (we use a value of 1/c, where c = d^2 is the total number of cells on the board).

Line 12: Where an ant has a choice of a number of values in an “open” cell (i.e., one which does not yet have its value fixed), then we define the value set, v_i, of cell i as the set of all available values for that cell, from which we have to choose one. We have a choice of two methods to use when making a selection; we might make a greedy selection, in which case the member of v_i with the highest pheromone concentration is selected, or we might make a weighted (i.e., “roulette wheel”) selection, in which case the selection probabilities are proportional to the pheromone associated with the available choices. The relative probabilities of each type of selection are determined by the greediness parameter, q_0 ∈ [0, 1]. A value selection, s, is therefore made according to

    s = \begin{cases} \arg\max_{k \in v_i} \{\tau_i^k\} & \text{if } q < q_0 \\ R & \text{otherwise} \end{cases}    (1)

where q ∈ [0, 1] is a uniform random deviate, and R is a selection from v_i made according to the probability distribution

    p_i^k = \frac{\tau_i^k}{\sum_{j \in v_i} \tau_i^j}, \quad k \in v_i    (2)

where p_i^k is the probability of selecting choice k from v_i.

If a cell has a value set of size zero (that is, it cannot have its value fixed due to other cells being fixed and the constraints thus introduced), then we mark it as a “fail cell”; the number of fail cells is later subtracted from the number of cells to be fixed when we calculate the quality of a solution (see note below, for Line 20).

Line 15: The local pheromone update operator is used to make selected values less attractive in subsequent iterations, thus promoting exploration of the solution space. The local pheromone update is handled as follows; every time an ant selects a value, s, at cell i, its pheromone value in the matrix is updated as follows:

    \tau_i^s \leftarrow (1 - \xi)\,\tau_i^s + \xi\,\tau_0    (3)

with ξ = 0.1 (the standard setting for ACS).

Line 20: In order to perform the global pheromone update, we must first find the best-performing ant. At each iteration, each ant n of the m ants keeps track of the number of cells, f_n, n ∈ {1 ... m}, that it has managed to set to a specific
value. The value of f_n corresponding to the iteration-best ant is f_best, given by

    f_{best} = \max_{n \in \{1 \dots m\}} f_n .    (4)

We then calculate the amount of pheromone to add, Δτ, as follows:

    \Delta\tau = \frac{c}{c - f_{best}}    (5)

where c is the total number of cells on the board. If the value of Δτ exceeds the current “best pheromone to add” value, Δτ_best (a quantity initialized to 0 at the beginning of the run), then we set Δτ_best ← Δτ, and replace the current best solution with the solution found by the iteration-best ant.

Line 21: We then update all pheromone values corresponding to values in the current best solution, where ρ ∈ [0, 1] is the standard evaporation parameter:

    \tau_i^s \leftarrow (1 - \rho)\,\tau_i^s + \rho\,\Delta\tau_{best} .    (6)

Note that in ACS, there is no global evaporation of pheromone; the global pheromone update (Equation 6) is only applied to pheromone values corresponding to fixed values in the best solution; the evaporation parameter ρ represents the “volatility” of the deposited pheromone, and is used to tune the convergence rate of the algorithm.

Line 22: In order to prevent “lock in”, we then additionally apply evaporation to the current best pheromone value, Δτ_best:

    \Delta\tau_{best} \leftarrow \Delta\tau_{best} \times (1 - \rho_{BVE})    (7)

where ρ_BVE ∈ [0, 1] is a parameter which controls the rate of evaporation of the best pheromone value.

[...] conducted by [33] of their algorithm against a number of competitors, and gives a measure of the practical applicability of the algorithm in a time-constrained environment. In all cases, we measured the statistical significance of results using non-parametric tests, with a p value threshold for significance of 0.05. In cases where multiple algorithms are compared together, this significance threshold was modified using the Bonferroni correction. In comparing vectors of solution times, we use the Mann-Whitney U test in cases where the vectors have different lengths, which occurs when the success rates in an experiment differ. This test is appropriate for determining significance of differences in the means of differently-sized samples, when the distribution cannot be assumed to be normal. In cases where all algorithms solved all the instances, we use the Wilcoxon signed-rank test, which tests for significance of difference in the means of paired observations, again with no assumption on the distribution. The success rates are treated as frequencies of a nominal variable (success/fail) for which the Pearson χ^2 test is appropriate.

A. Experimental environment

All of the codes were compiled using the same compiler and optimisation setting (g++ v5.4.0 with -O3). Experiments were run on a machine with an Intel Xeon E5-2460v4 processor with a clock speed of 2.4GHz, running Ubuntu Linux. The parameter settings for the iterated local search solver (ILS) were taken from the recommendations given in [33]. For the ant colony code (ACS), we used the following settings: ρ = 0.9, q_0 = 0.9, ρ_BVE = 0.005, m = 10. Our code, and all the instance files used for the experiments, may be downloaded from https://github.com/huwlloyd-mmu/sudoku_acs.
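To make the role of these settings concrete, the pheromone operators of Equations 1 to 3 and 5 to 7 can be sketched in Python as below. This is an illustrative sketch, not the authors' C++ implementation: the function names, the list-of-lists layout of τ (with zero-based cell indices and values 1 to d), and the dictionary used for the best solution are all assumptions made for the example, while the numerical constants are the values quoted in the text for a 9 × 9 puzzle.

```python
import random

D = 9                 # puzzle dimension d
C = D * D             # total number of cells, c = d^2
TAU0 = 1.0 / C        # initial pheromone level tau_0 = 1/c
RHO = 0.9             # evaporation parameter rho
Q0 = 0.9              # greediness parameter q_0
RHO_BVE = 0.005       # best value evaporation rate rho_BVE
XI = 0.1              # local update parameter xi


def new_pheromone():
    """tau[i][k-1] holds the pheromone for value k in cell i."""
    return [[TAU0] * D for _ in range(C)]


def select_value(tau, i, value_set, rng=random):
    """Equations 1 and 2: greedy if q < q_0, otherwise roulette wheel."""
    choices = sorted(value_set)
    if rng.random() < Q0:
        return max(choices, key=lambda k: tau[i][k - 1])
    weights = [tau[i][k - 1] for k in choices]
    return rng.choices(choices, weights=weights, k=1)[0]


def local_update(tau, i, s):
    """Equation 3: pull the chosen entry back towards tau_0."""
    tau[i][s - 1] = (1 - XI) * tau[i][s - 1] + XI * TAU0


def pheromone_to_add(f_best):
    """Equation 5; only meaningful while the puzzle is unsolved (f_best < c)."""
    return C / (C - f_best)


def global_update(tau, best_solution, delta_tau_best):
    """Equation 6, applied only to (cell, value) pairs of the best solution."""
    for i, s in best_solution.items():
        tau[i][s - 1] = (1 - RHO) * tau[i][s - 1] + RHO * delta_tau_best


def evaporate_best_value(delta_tau_best):
    """Equation 7: best value evaporation."""
    return delta_tau_best * (1 - RHO_BVE)
```

With ρ_BVE = 0.005 the best value decays by 0.5% per iteration, so a stale best solution gradually loses its influence unless a later ant matches or improves on it; setting ρ_BVE = 0 in this sketch leaves Δτ_best unchanged between improvements.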
TABLE I
Solution times (mean and standard deviation over 100 runs) for the logic-solvable instances. Figures in bold indicate times which are significantly lower for one algorithm compared to the other three, based on a Wilcoxon signed rank test; the Bonferroni correction is applied, so that p values less than 0.05/3 are taken to be significant. Asterisks show cases in which the times for ILS or ACS are significantly lower than the other, using the Wilcoxon signed rank test with p < 0.05.

Solution time (s)
Instance      | ACS                      | ILS                 | DLX                  | BS
sabuncu1      | (4.8 ± 1.84) × 10^-5 *   | 0.00083 ± 0.00047   | 0.00105 ± 0.000362   | (1.58 ± 0.651) × 10^-6
sabuncu2      | (4.82 ± 1.73) × 10^-5 *  | 0.00414 ± 0.00143   | 0.000937 ± 0.000308  | (2.18 ± 6.45) × 10^-6
sabuncu3      | 0.000993 ± 0.000457 *    | 0.112 ± 0.0296      | 0.00104 ± 0.000366   | 0.000202 ± 0.0000775
sabuncu4      | 0.000625 ± 0.000708 *    | 0.00859 ± 0.00229   | 0.00112 ± 0.000346   | 0.0001 ± 0.0000378
sabuncu5      | (4.62 ± 1.5) × 10^-5 *   | 0.00097 ± 0.000556  | 0.00101 ± 0.000384   | (1.68 ± 0.733) × 10^-6
sabuncu6      | 0.0107 ± 0.00828 *       | 0.105 ± 0.027       | 0.00153 ± 0.00045    | 0.000775 ± 0.000273
sabuncu7      | 0.00106 ± 0.000986 *     | 0.0853 ± 0.0206     | 0.00102 ± 0.000318   | (9.67 ± 3.75) × 10^-5
sabuncu8      | 0.000728 ± 0.000343 *    | 0.007 ± 0.00206     | 0.00107 ± 0.000374   | (7.91 ± 2.74) × 10^-5
sabuncu9      | 0.00163 ± 0.0014 *       | 0.0153 ± 0.00437    | 0.00105 ± 0.000345   | 0.00016 ± 0.0000579
sabuncu10     | (4.73 ± 1.85) × 10^-5 *  | 0.00136 ± 0.000641  | 0.00104 ± 0.000363   | (1.6 ± 0.693) × 10^-6
aiescargot    | 0.0204 ± 0.0152 *        | 0.152 ± 0.0328      | 0.00208 ± 0.000648   | 0.000475 ± 0.000182
coly013       | 0.0488 ± 0.0518 *        | 0.702 ± 0.0685      | 0.007 ± 0.00146      | 0.0278 ± 0.00517
goldennugget  | 0.0374 ± 0.0293 *        | 0.442 ± 0.0918      | 0.00545 ± 0.00149    | 0.0152 ± 0.00304
platinumblond | 0.113 ± 0.0859 *         | 0.131 ± 0.0223      | 0.0059 ± 0.00152     | 0.00268 ± 0.000923
reddwarf      | 0.0404 ± 0.0354 *        | 0.299 ± 0.0768      | 0.00514 ± 0.00132    | 0.00993 ± 0.00212
tarx0134      | 0.0259 ± 0.0193 *        | 0.851 ± 0.0699      | 0.0185 ± 0.00303     | 0.038 ± 0.0074
The ten puzzles from [31] (sabuncu1–sabuncu10) are generally solved in less time by all the algorithms than the six harder puzzles. In four cases (sabuncu1, sabuncu2, sabuncu5 and sabuncu10) the puzzle is solved by a single application of our constraint propagation procedure, so that no searching is required for either the ACS or BS algorithms. The difference in runtimes between the two algorithms for these instances may be explained by the difference in set-up times; in the case of ACS, the overhead of creating the ant colony and initializing the pheromone matrix is clearly significant. On these four “trivial” instances, the BS algorithm is the fastest of all (running in times of the order of a microsecond). DLX requires at least of the order of a millisecond to solve all the puzzles; in all but the most difficult cases, this time is most likely dominated by the calculations to convert the instance to and from an instance of the exact cover problem.

Overall, we find that the deterministic solvers perform best on these instances. Either DLX or BS is significantly fastest for all of the instances. BS is the best performing overall, and is fastest in twelve of the sixteen instances, with DLX fastest in the other four. ACS is significantly faster than ILS in all cases, and faster than DLX in seven of the sixteen instances.

Finally, we note that the times reported by Sabuncu [31] for their ACO algorithm to solve ten of the instances used here are typically 1 to 3 seconds. This is several orders of magnitude slower than our times using ACS for the same instances, which are of the order of milliseconds or less; this is more than can be accounted for by differences in hardware or efficiency of implementation and, although we have not performed a direct comparison with their code, we can safely assume that our algorithm is the better performing of the two.

C. General instances

Following [14] and [33], we generated random instances for the 9 × 9, 16 × 16 and 25 × 25 Sudoku problem. In the latter two cases, subgrids are of size 4×4 and 5×5 respectively, and each row, column and subgrid must contain all of the digits 1 ... 16 and 1 ... 25 respectively.

These instances were generated by running the ACS code with an initially blank grid, to produce a set of Sudoku solutions. These are then converted into problem instances by randomly blanking a number of the cells. The instances generated in this way are not guaranteed to have a unique solution. For each of the sizes 9 × 9, 16 × 16 and 25 × 25, we generated 100 instances for fixed cell fractions in steps of 0.05 from 0 to 0.95, giving a total of 6000 individual instances. We ran the ACS, ILS, DLX and BS codes once on each instance, with timeouts set to 5 seconds for the 9 × 9 instances, 20 seconds for 16 × 16 and 120 seconds for 25 × 25. These timeouts are shorter than those used by [33]; however we ran our experiments on a faster processor, and with compiler optimisations enabled. Taken together, these two differences should amount to a factor of approximately 3 in time. We designed the experiment so that each instance is used for one run; this is preferable to carrying out multiple runs on each of a smaller number of instances [37].

Figures 4, 5 and 6 show the results for average execution time (for successful runs) and success rate for the four algorithms. Summary results are given in Table II and the raw data is given in Table III. In Table III, we indicate in bold quantities which are significantly best of all algorithms, and with asterisks significant differences between the stochastic algorithms ACS and ILS. Statistical significance is tested using the χ^2 contingency test for the success rates, and the Mann-Whitney U test for the solution times. We use the Mann-Whitney test here as the vectors of times will in general have differing lengths. In cases where we test all algorithms against each other, we apply the Bonferroni correction to modify the p-value threshold for significance.

As in [33] and [14], we observe a “phase transition” in the difficulty of the instances as a function of the fixed cell fraction; the difficulty is markedly greater at fixed cell fractions of around 40-50%. For low values of the fixed cell fraction,
the search space is large, but there also exist many possible solutions. As the grid becomes denser, the size of search space decreases as well as the number of possible solutions. At around 45%, the combination of rarity of solutions and the size of the search space leads to a sharp peak in difficulty.

Fig. 4. Plots of solution time (left) and success rate (right) against fixed cell percentage for runs of ACS, ILS, DLX and BS on the 9 × 9 general instances.

Fig. 5. Plots of solution time (left) and success rate (right) against fixed cell percentage for runs of ACS, ILS, DLX and BS on the 16 × 16 general instances.

Fig. 6. Plots of solution time (left) and success rate (right) against fixed cell percentage for runs of ACS, ILS, DLX and BS on the 25 × 25 general instances.

The most difficult puzzles are the 25 × 25 instances with a fixed cell fraction between 40% and 50%. For these fixed cell fractions of 40% and 45%, ACS outperforms the other three algorithms by a significant margin; ACS achieves success rates of 98% and 85% (compared to 69% and 10% for ILS, 76% and 49% for DLX, and 21% and 12% for BS). These are the only instances in all the experiments presented for which one algorithm achieved a significantly higher success rate than the other three. The mean times achieved by ACS on these instances are lower than the other three algorithms, but the difference is not statistically significant; this is most likely due to the small samples of times for the three algorithms which recorded low numbers of successes.

It is interesting to note the difference in performance between ACS and BS. These two codes use the same underlying problem representation and constraint propagation code; the only difference between them is the search strategy. This comparison is compelling evidence that ACS is very efficient at searching the solution space, giving markedly improved performance on the hardest instances over an exhaustive search strategy using the same underlying evaluation routines. For the easier instances, BS outperforms ACS, perhaps due to the simplicity of the algorithm which requires very little setup compared to ACS, or transformation to another problem representation, as in DLX.

ACS returns significantly lower runtimes than ILS, the other stochastic search algorithm, in 52 of the 60 instances, whereas ILS is significantly faster than ACS for only two instances. The performance of ACS on these general instances is significantly better than that of ILS both in terms of overall runtime, and success rate on the hardest instances.

D. Evaluation of Best Value Evaporation

In order to evaluate the effectiveness of BVE as an anti-stagnation mechanism, we ran experiments using the logic solvable instances (section IV-B) and general instances (section IV-C) using the ACS algorithm with best-value evaporation disabled by setting ρ_BVE = 0. We used all the logic-solvable instances, and the 25 × 25 general instances (since these are the most challenging). For the named 9 × 9 logic-solvable instances, we find that ACS without BVE performs very poorly on the harder instances (aiescargot, coly013, goldennugget, platinumblond, reddwarf, tarx0134), failing to solve these in most cases (see Table IV). Performance on the ten instances from [31] is similar to BVE, with the exception of sabuncu6, with a success rate of 95%. This suggests that these ten instances are not sufficiently difficult to provide a

TABLE II
Summary of results on the general instances (20 filled cell fractions for orders 3, 4 and 5). The table shows the number of instance categories for which the algorithm listed in the first column (Algorithm 0) produces a significantly higher success rate, or lower mean solution time, than the other algorithms (Algorithm 1, or all algorithms). Significance is taken at the 0.05 level for pairwise comparisons, or 0.05/3 for one-against-all comparisons. This data is summarized from Table III; details of the statistical tests are given in Section IV.

                        Algorithm 1
         Algorithm 0 | ACS | ILS | DLX | BS | All
Success  ACS         |  -  |  3  |  3  |  3 |  2
         ILS         |  0  |  -  |  0  |  0 |  0
         DLX         |  0  |  0  |  -  |  0 |  0
         BS          |  0  |  0  |  0  |  - |  0
Time     ACS         |  -  | 52  | 37  | 11 |  2
         ILS         |  3  |  -  | 21  |  3 |  0
         DLX         | 21  | 38  |  -  |  9 |  5
         BS          | 48  | 52  | 45  |  - | 40
TABLE III
Solution rates (solved instances out of 100) and times (mean and standard deviation of the time of successful runs) for the general instances. O is the order of the puzzle (3 for 9 × 9, 4 for 16 × 16, 5 for 25 × 25) and F is the percentage of given cells. Figures in bold denote quantities for which one algorithm is significantly superior to the other three. For the solution times, the vectors of times are compared using the Mann-Whitney U test. Success rates are compared using a χ2 contingency test. In all cases, the Bonferroni correction is applied, so that p values less than 0.05/3 are taken to be significant. Asterisks show quantities for which either ILS or ACS is significantly superior to the other, using the same tests with p < 0.05.
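The comparison of solution times described in this caption can be illustrated with a short sketch. The Python below is a minimal, standard-library-only version of a two-sided Mann-Whitney U test using the normal approximation (with no continuity correction and no tie correction in the variance), followed by the Bonferroni threshold of 0.05/3. The function name `mann_whitney_u` and the sample times are hypothetical and are not taken from the paper; the statistical software actually used by the authors is not stated here.

```python
import math

def mann_whitney_u(xs, ys):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U, p), where U is the smaller of the two U statistics.
    Ties receive average ranks; the variance is not tie-corrected.
    """
    n1, n2 = len(xs), len(ys)
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Values at positions i..j-1 share ranks i+1..j; use their mean.
        avg_rank = (i + 1 + j) / 2
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p

# Hypothetical solution times in seconds (illustrative only).
acs_times = [3.1, 2.8, 3.5, 2.9, 3.3, 3.0]
dlx_times = [9.4, 12.1, 8.7, 15.2, 10.3, 11.8]

ALPHA = 0.05 / 3  # Bonferroni correction for three one-against-all comparisons
u, p = mann_whitney_u(acs_times, dlx_times)
significant = p < ALPHA
print(u, p, significant)
```

For small samples an exact test is usually preferable to the normal approximation; the approximation is used here only to keep the sketch free of external dependencies.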
TABLE IV
Performance of ACS with and without BVE on the sixteen 9 × 9 logic-solvable instances and general 25 × 25 instances. Success % is the number of successful solutions found in 100 runs. Times are given in seconds, with mean and standard deviation over 100 runs. Numbers in bold indicate statistically significant differences between the algorithms, determined using the Mann-Whitney U test for the times, and the χ2 contingency test for the success rates.
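As a worked illustration of the χ2 contingency test named in this caption, the sketch below compares the two success rates quoted in the text for the 45% fixed-cell 25 × 25 instances (92 of 100 runs with BVE, 58 of 100 without). It is a minimal standard-library version of Pearson's test on a 2 × 2 table without a continuity correction; whether the authors applied such a correction is not stated, and the function name `chi2_success_test` is an assumption made for this example.

```python
import math

def chi2_success_test(successes_a, successes_b, runs=100):
    """Pearson chi-square test on a 2x2 success/failure table.

    Both algorithms are assumed to have been run `runs` times.
    Returns (chi2, p) with one degree of freedom.
    """
    table = [
        [successes_a, runs - successes_a],
        [successes_b, runs - successes_b],
    ]
    total = 2 * runs
    col_totals = [table[0][c] + table[1][c] for c in range(2)]
    chi2 = 0.0
    for row in table:
        for c in range(2):
            expected = runs * col_totals[c] / total
            if expected > 0:  # skip empty columns (e.g. no failures at all)
                chi2 += (row[c] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 d.o.f.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Success counts quoted in the text: with BVE vs. without BVE.
chi2, p = chi2_success_test(92, 58)
print(chi2, p)  # chi2 is about 30.83
print(p < 0.05)
```

Under these assumptions the difference is significant at the 0.05 level by a wide margin, which is consistent with the statement that the number of failures is significantly higher without BVE.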
good benchmark for solution algorithms: the search space after applying constraints is either too small or, as is the case for four of the instances, non-existent.

We also evaluated BVE using the general 25 × 25 instances. We see that the performance of ACS is significantly degraded without the BVE operator. Performance with respect to solution time is degraded to some extent, with significantly shorter times without BVE in three fixed cell fractions, compared to nine which are faster with BVE. The number of failures is significantly higher; for the 45% fixed cell instances, for example, the success rate is 58%, compared to 92% with BVE enabled. The average solution time for these instances is 9.1 s, well within the timeout of 120 s, suggesting that the failures are due to the search stagnating at a local minimum.

V. CONCLUSIONS

In this paper we presented a new algorithm for the Sudoku puzzle, based on Ant Colony Optimization. Our method includes a new operator, which we call Best Value Evaporation, and we show that this addition to the base algorithm is essential for the prevention of premature convergence or stagnation of solutions. Experiments show that our new algorithm significantly outperforms existing algorithms on the hardest, large instances of Sudoku, and we provide evidence that our method provides a much more efficient search of the solution space than traditional backtracking algorithms for these problems. For smaller or easier instances, we find that direct search algorithms such as Dancing Links or Backtracking Search outperform stochastic algorithms, but these deterministic algorithms perform poorly on the hardest instances. Finally, we find that our algorithm outperforms the state-of-the-art Iterated Local Search algorithm [33] both in terms of runtime and success rates on hard instances.

The growing body of work on the automated solution of pencil puzzles such as Sudoku and Nurikabe suggests that they offer a ready-made algorithmic test-bed. As such, they may provide an additional challenge for general-purpose algorithms (whether AI-based or not), and offer new insights into the solution of constraint satisfaction problems (by, for example, suggesting new ways in which to search the solution space). Importantly, solvers such as ours can outperform state-of-the-art methods without any requirement for problem-specific heuristics, which immediately offers two possibilities for future work in this area. The first is a “problem agnostic” general Japanese pencil puzzle solver, which can solve large instances of any problem in this class. By constructing this solver in a modular fashion, we should easily be able to incorporate any suitable pencil puzzle, which will minimize the amount of effort required in future research. Importantly, this will allow for the rapid (and experimentally consistent) solution of a wide range of pencil puzzles, which will (a) yield good solutions to these problems per se, (b) allow for easy comparison of the properties of those problems, and (c) provide a ready-made platform for the subsequent investigation of problem-specific heuristics.

REFERENCES

[1] J.-P. Delahaye, “The science behind Sudoku,” Scientific American, vol. 294, no. 6, pp. 80–87, 2006.
[2] P. Norvig, “Solving every Sudoku puzzle,” available at http://norvig.com/sudoku.html, accessed March 13, 2018.
[3] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. WH Freeman: New York, 1979.
[4] T. Yato and T. Seta, “Complexity and completeness of finding another solution and its application to puzzles,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. 86, no. 5, pp. 1052–1060, 2003.
[5] C. J. Colbourn, “The complexity of completing partial Latin squares,” Discrete Applied Mathematics, vol. 8, no. 1, pp. 25–30, 1984.
[6] G. N. Yannakakis and J. Togelius, Artificial Intelligence and Games. Springer, 2018.
[7] R. M. Karp, “Reducibility among combinatorial problems,” in Complexity of Computer Computations. Springer, 1972, pp. 85–103.
[8] D. E. Knuth, “Dancing links,” arXiv preprint cs/0011047, 2000.
[9] M. Hunt, C. Pong, and G. Tucker, “Difficulty-driven Sudoku puzzle generation,” UMAP Journal, vol. 29, no. 3, pp. 343–361, 2007.
[10] S. Fletcher, F. Johnson, and D. R. Morrison, “Taking the mystery out of Sudoku difficulty: an Oracular model,” UMAP Journal, vol. 29, no. 3, pp. 327–341, 2007.
[11] T. Weber, “A SAT-based Sudoku solver,” in The 12th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning (LPAR): Short Paper Proceedings, G. Sutcliffe and A. Voronkov, Eds., 2005, pp. 11–15.
[12] J. A. Pacurib, G. M. M. Seno, and J. P. T. Yusiong, “Solving Sudoku puzzles using improved artificial bee colony algorithm,” in Fourth International Conference on Innovative Computing, Information and Control (ICICIC). IEEE, 2009, pp. 885–888.
[13] B. Crawford, M. Aranda, C. Castro, and E. Monfroy, “Using constraint programming to solve Sudoku puzzles,” in Third International Conference on Convergence and Hybrid Information Technology (ICCIT), vol. 2. IEEE, 2008, pp. 926–931.
[14] R. Lewis, “Metaheuristics can solve Sudoku puzzles,” Journal of Heuristics, vol. 13, no. 4, pp. 387–401, 2007.
[15] X. Q. Deng and Y. Da Li, “A novel hybrid genetic algorithm for solving Sudoku puzzles,” Optimization Letters, vol. 7, no. 2, pp. 241–257, 2013.
[16] T. Mantere and J. Koljonen, “Solving, rating and generating Sudoku puzzles with GA,” in IEEE Congress on Evolutionary Computation (CEC). IEEE, 2007, pp. 1382–1389.
[17] C. Segura, S. I. V. Peña, S. B. Rionda, and A. H. Aguirre, “The importance of diversity in the application of evolutionary algorithms to the Sudoku problem,” in IEEE Congress on Evolutionary Computation (CEC). IEEE, 2016, pp. 919–926.
[18] Z. Wang, T. Yasuda, and K. Ohkura, “An evolutionary approach to Sudoku puzzles with filtered mutations,” in IEEE Congress on Evolutionary Computation (CEC). IEEE, 2015, pp. 1732–1737.
[19] J. M. Hereford and H. Gerlach, “Integer-valued particle swarm optimization applied to Sudoku puzzles,” in IEEE Swarm Intelligence Symposium (SIS). IEEE, 2008, pp. 1–7.
[20] A. Moraglio and J. Togelius, “Geometric particle swarm optimization for the Sudoku puzzle,” in Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation (GECCO). ACM, 2007, pp. 118–125.
[21] Z. Karimi-Dehkordi, K. Zamanifar, A. Baraani-Dastjerdi, and N. Ghasem-Aghaee, “Sudoku using parallel simulated annealing,” in International Conference in Swarm Intelligence (ICSI). Springer, 2010, pp. 461–467.
[22] R. Soto, B. Crawford, C. Galleguillos, E. Monfroy, and F. Paredes, “A hybrid ac3-tabu search algorithm for solving Sudoku puzzles,” Expert Systems with Applications, vol. 40, no. 15, pp. 5817–5821, 2013.
[23] J. Gunther and T. Moon, “Entropy minimization for solving Sudoku,” IEEE Transactions on Signal Processing, vol. 60, no. 1, pp. 508–513, 2012.
[24] M. Dorigo and G. Di Caro, “Ant colony optimization: a new meta-heuristic,” in Proceedings of the 1999 Congress on Evolutionary Computation (CEC), vol. 2. IEEE, 1999, pp. 1470–1477.
[25] M. Dorigo, V. Maniezzo, and A. Colorni, “Ant system: optimization by a colony of cooperating agents,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 26, no. 1, pp. 29–41, 1996.
[26] M. Dorigo and M. Birattari, “Ant colony optimization,” in Encyclopedia of Machine Learning. Springer, 2011, pp. 36–39.
[27] M. López-Ibáñez, T. Stützle, and M. Dorigo, “Ant colony optimization: A component-wise overview,” Handbook of Heuristics, pp. 1–37, 2016.
[28] M. Dorigo and T. Stützle, “Ant colony optimization: overview and recent advances,” in Handbook of Metaheuristics. Springer, 2019, pp. 311–351.
[29] T. Mantere, “Improved ant colony genetic algorithm hybrid for Sudoku solving,” in Third World Congress on Information and Communication Technologies (WICT). IEEE, 2013, pp. 274–279.
[30] K. Schiff, “An ant algorithm for the Sudoku problem,” Journal of Automation, Mobile Robotics and Intelligent Systems, vol. 9, 2015.
[31] I. Sabuncu, “Work-in-progress: solving Sudoku puzzles using hybrid ant colony optimization algorithm,” in 1st International Conference on Industrial Networks and Intelligent Systems (INISCom). IEEE, 2015, pp. 181–184.
[32] A. Inkala, AI Escargot - The Most Difficult Sudoku Puzzle. Lulu.com, Finland, 2007.
[33] N. Musliu and F. Winter, “A hybrid approach for the Sudoku problem: using constraint programming in iterated local search,” IEEE Intelligent Systems, vol. 32, no. 2, pp. 52–62, 2017.
[34] M. Dorigo and L. M. Gambardella, “Ant colony system: a cooperative learning approach to the Traveling Salesman Problem,” IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 53–66, 1997.
[35] J. Laire, “dlx-cpp,” available at https://github.com/jlaire/dlx-cpp, accessed April 23, 2018.
[36] M. Ercsey-Ravasz and Z. Toroczkai, “The chaos within Sudoku,” Sci. Rep., vol. 2, pp. 725–733, 2012.
[37] M. Birattari, “On the estimation of the expected performance of a metaheuristic on a class of instances. how many instances, how many runs?” IRIDIA, Université Libre de Bruxelles, Brussels, Belgium, Tech. Rep. TR/IRIDIA/2004-001, 2004.

Huw Lloyd is a Senior Lecturer at Manchester Metropolitan University. He was awarded his B.Sc. in Physics by Imperial College, London, and his Ph.D. in Astrophysics by the University of Manchester.

Martyn Amos is Professor of Computer and Information Sciences at Northumbria University. He was awarded his B.Sc. in Computer Science by Coventry University, and his Ph.D. in DNA computation by the University of Warwick. He is a Fellow of the British Computer Society.