An Evaluation Function For The Game of Amazons: Jens Lieberum
An Evaluation Function For The Game of Amazons: Jens Lieberum
Abstract
Amazons is a fascinating game that shares properties of chess and Go. Designing a computer program that plays amazons on
the level of human experts and beyond is a real challenge. This article emphasizes the secret of such a program, viz. its evaluation
function. We describe the function by using explicit formulas, we mention the ideas and goals behind these formulas, we discuss
possible refinements, and study in detail methods for special endgame problems. By analyzing a tournament game of AMAZONG
against the threefold computer world champion 8QP we illustrate how the new features of our evaluation function can lead to victory.
© 2005 Elsevier B.V. All rights reserved.
Keywords: Amazons
1. Introduction
Amazons is a many-faceted game. The game set typically used to play Amazons is a draughts board of size 10 × 10,
four white and four black chess queens (called amazons), and a supply of go pieces of one colour (called arrows). The
starting position and a first move of white are shown in Fig. 1. A move consists of two steps: (1) the player chooses
an amazon of his colour and moves it like a chess queen diagonally, vertically, or horizontally as far as he prefers,
provided that no obstacle (another amazon or an arrow) blocks the way; (2) the amazon played has to throw an arrow.
Arrows also move like chess queens. They stay at their destination square for the rest of the game and are represented
by black squares in the other figures of this article. The players move alternately until one player can no longer move.
This happens after at most 92 moves. The player who makes the last move wins the game. A challenging problem
concerning the rules of amazons is as follows: Should white’s advantage of making the first move be compensated by a
komi and if so, by how many points? We will discuss two possible komi rules in more detail in Section 8 of this article.
I first heard about amazons at a workshop on combinatorial game theory at MSRI in July 2000. I was fascinated by
the deepness and subtlety of ‘simple’ positions in amazons that have been analysed by Berlekamp [1], Snatzke [12,13],
Müller and Tegos [11]. Inspired by discussions with Müller about his computer program ARROW and my experiences
of playing amazons I started to write the computer program AMAZONG. AMAZONG has won the amazons tournaments
at the seventh Computer Olympiad in Maastricht in 2002 and at the eighth Computer Olympiad in Graz in 2003. The
reader is invited to play against the Java applet AMAZONG at http://jenslieberum.de/amazong/amazong.html.
0304-3975/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.tcs.2005.09.048
J. Lieberum / Theoretical Computer Science 349 (2005) 230 – 244 231
I described the general design of my program with a special focus on selective search in talks at the Universities of
Jena and Edmonton [6]. This article complements these talks and concentrates on AMAZONG’s evaluation function that
causes its characteristic style of play. This style clearly distinguishes it from other programs, and is probably its main
strength.
Like in many games, the game of amazons can be subdivided into different phases; they are opening, middle game,
and endgame. The opening in amazons is a grand challenge for computer programs owing to the absence of opening
theory, a branching factor of more than 1000, many positions with more than 20 reasonable moves, and the need for
calculating deep variations. Human play is still superior to computers in the opening. At the Computer Olympiad in
Graz, AMAZONG used a machine-generated opening book. Our algorithm for building this book combines ideas of
Lincke [8] with conspiracy numbers. However, the benefit of opening books is limited in amazons because of the huge
complexity of this game. Our evaluation function contains parts that are particularly useful in the opening. We will not
treat the automatic generation of opening books in this article.
It is difficult to say exactly when the endgame begins in amazons, but this phase should at least include the so-called
filling phase. The filling phase consists of those positions where each empty square on the board can be reached by
at most one player by some sequence of moves. In most games this happens after approximately 50 moves. We will
include problems that appear most frequently in or shortly before the filling phase in our investigation of the endgame.
The article is organized as follows. Sections 2–5 are dealing with the description of different parts of our evaluation
function, viz. parts that are most important in the opening and in the middle game. In Section 2 we describe the main
ingredients of the evaluation function that are based on king distances and queen distances and combine them into a
measure t for positional and territorial evaluation. Section 3 describes why t often does not evaluate positions correctly
where the mobility of a single amazon is low. We introduce a new ingredient m for our evaluation function that deals
with this problem. In Section 4 we analyse the behaviour of the separate parts of our evaluation function t + m during
a game against the computer program 8QP. We find some evidence for the assumption that 8QP lost because it did not
consider king distances and mobility. Section 5 contains a collection of ideas for further improvement of the evaluation
function.
Sections 6–9 are dedicated to the treatment of special endgame problems. Often, the outcome of the game is clear
when the filling phase starts. However, sometimes the positions can involve difficult filling puzzles. They are treated
theoretically and on a heuristic level in Section 6. A good solution of the filling puzzle is the basis of a successful
treatment of more complex endgame problems. In Section 7 we address a problem that typically appears earlier in the
endgame: we explain why amazons that guard a separate region of squares require special treatment in the context of
our evaluation function t + m. Sections 8 and 9 are dealing with zugzwang positions that would require impracticably
deep minimax search to be evaluated correctly by t + m. In Section 10 we conclude by reviewing briefly the major
problems and challenges of designing an evaluation function for the game of amazons.
232 J. Lieberum / Theoretical Computer Science 349 (2005) 230 – 244
The goal of the game amazons is to have access to more empty squares in the filling phase than the other player.
When player j (j ∈ {1, 2}, player 1 is white) has exclusive access to a region of n squares, we count these squares as
n secure points of the territory of player j. When both players can reach a square by some sequence of moves, it is
more complicated to guess which player will eventually shoot at that square. For this purpose AMAZONG uses heuristics
based on the following ways to measure distances on an amazons board.
Define the distance d1 (a, b) of two squares a and b as the minimal number of chess queen moves needed to go from
a to b. When there is no path, let d1 (a, b) = ∞. Similarly, define the distance d2 (a, b) as the minimal number of chess
king moves needed to go from a to b. Obviously, we have d1 (a, b) d2 (a, b). The distances of player j from square a
are then given by
j
Di (a) = min{di (a, b) | the square b is occupied by an amazon of player j }.
j j
On the left (respectively, right) side of Fig. 2 you find an example of D1 (a) (respectively, D2 (a)). In Fig. 2 the upper
left corners of empty squares contain the values Di1 (a) and the lower right corners contain the values Di2 (a).
j
All amazons programs seem to use D1 in one or another way (for example, see [5]). The idea behind the definition
j
of D1 is that D11 (a) < D12 (a) indicates that player one has better access to the square a than player two. One heuristic
for estimating the territory of player 1 is to assume that he will eventually shoot to all squares a with D11 (a) < D12 (a).
j
This heuristic works very well shortly before and in the filling phase. A problem of D1 at the beginning of the game
j
is that a single amazon of player j in the centre can cause low values of D1 on the whole board, but player j cannot
j j
move the amazon into all directions at once. Here D2 comes in. One advantage of D2 is its locality: often amazons
have to fulfill a certain task at their position like guarding the territory in their neighbourhood. Then a large value of
j
D2 (a) indicates that player j cannot move towards the square a without causing positional damage, despite a possibly
j j j
low value of D1 (a). Another advantage of D2 over D1 is that it is more stable when the other player moves and
j
shoots, especially when there are just a few arrows. This makes D2 useful for long term estimates and will stabilize
the evaluation function in the beginning of the game.
j
We use Di to assign local evaluations between −1 and 1 to each empty square. Positive values indicate an advantage
of player 1. Then we sum these numbers over all empty squares in order to transform the local evaluations into global
ones. One possible formula for global evaluations t1 , t2 is given by
ti = (Di1 (a), Di2 (a)),
empty squares a
j
Fig. 2. The minimal distances Di (a).
J. Lieberum / Theoretical Computer Science 349 (2005) 230 – 244 233
where
⎧
⎪
⎪ 0 if n = m = ∞,
⎨
if n = m < ∞,
(n, m) =
⎪
⎪ 1 if n < m,
⎩
−1 if n > m,
and −1 < < 1 is a constant with (−1)j 0 when it is player j’s turn. The number || estimates the advantage of
moving first when the distances of both players to an accessible square agree. We had good experience with || 15 , but
some fine-tuning is necessary after each modification of the evaluation function. We optimized the choice of in order
to obtain a low volatility of the evaluations during iterative deepening. This should help to avoid odd–even effects and
supports aspiration search with narrow – windows (see [10]).
A program that uses the territorial evaluation t1 as its evaluation function already plays quite reasonably, especially
shortly before the filling phase. In contrast to that we observed that the value t2 is useful in the beginning of the game
but becomes less significant as the game goes on. The evaluations ti do not take into account that large positive values
of Di2 (a) − Di1 (a) indicate a larger advantage for player 1 than small positive values only. This drawback of t1 and t2 is
more severe at the beginning than near the end of an amazons game. Therefore, besides , other local evaluations seem
to be important as well. The generic approach to find more good local evaluations is to use some array of parameters
instead of (n, m) and then to optimize these parameters. We had good experience with the choices
2−D1 (a) − 2−D1 (a) ,
1 2
c1 = 2
empty squares a
c2 = min(1, max(−1, (D22 (a) − D21 (a))/6)).
empty squares a
Notice that in c1 the local advantage (D11 (a), D12 (a)) = (1, 2) is rewarded by 0.5 points for player 1, (2, 3) by 0.25
points, (1, 3) by 0.75 points, and squares a with (D11 (a), D12 (a)) = (n, n) contribute 0 points. Other tuples are of minor
practical importance for c1 . In contrast to c1 , c2 depends only on D22 (a) − D21 (a) and only large differences indicate a
clear advantage of one player.
Now we have to combine the values ti and ci into one evaluation function. A weighted sum with static weights does
not seem to be appropriate for this because the importance of the values ti and ci varies during the game. Therefore,
we define
w = 2−|D1 (a)−D1 (a)| ,
1 2
where we sum over all empty squares a with D11 (a) < ∞ and D12 (a) < ∞. Obviously, we have w = 0 if and only if
the position belongs to the filling phase and typically w decreases with the number of moves played. These properties
of w are useful for our purposes. In particular, when the game approaches the filling phase it will be much better to use
w instead of other parameters such as the number of arrows on the board. Of course, the definition of w is by no means
canonic and leaves ample room for optimizations and new ideas. Continuing along these lines of thought we define an
evaluation t as
t = f1 (w)t1 + f2 (w)c1 + f3 (w)c2 + f4 (w)t2 ,
where (fi )i is a partition of 1 (meaning 0 fi (w) and i fi (w) = 1). The exact form of the functions fi is a problem
of parameter optimization. Our choice of f1 has been guided by the observation that t1 becomes increasingly important
during the game and gives quite good estimates of the expected territory shortly before the filling phase. Therefore, f1
is monotonously decreasing and satisfies f1 (0) = 1. The counterpart of t1 is t2 . It rewards balanced distributions of the
amazons of one player and helps to hinder the other player from reaching such a distribution. This is most important
at the beginning of the game. The values c1 and c2 allow detection of finer properties of the position than t1 and t2
alone, because they depend on the quality of local advantages. They support good positional play in the opening and a
smooth transition between the beginning and later phases of the game. This is most evident for t1 and c1 : while at the
end of the game only t1 counts, c1 rewards moves in earlier phases of the game that replace clear local disadvantages
by small disadvantages and small advantages by clear advantages.
234 J. Lieberum / Theoretical Computer Science 349 (2005) 230 – 244
AMAZONG tries to enclose amazons of the other player inside of small regions at the beginning of the game. Compared
to other computer programs, this is AMAZONG’s main strength. In this section we will present a modification of the
evaluation function t (see Section 2) that is responsible for this behaviour.
Enclosing amazons typically does not cause an appropriate change of t (and especially of t1 ) in the beginning of
the game. This can be explained as follows: when a single amazon A of player 1 is enclosed in some small region
of n points, then the amazons board is divided into two parts: the inside and the outside of that region. Player 1 has
exclusive access to the territory on the inside. This contributes n points to t. On the outside, some active amazons of
player 1 might overshadow the missing influence of A in D11 . In addition, some amazons of player 2 that have helped
to enclose A might not be in optimal positions but often have a large potential to improve their positions. The problem
that A cannot reach the outside for the rest of the game is not reflected in the computation of t. The disadvantage of the
enclosed amazon often starts to affect t several moves later. Then it is too late. Therefore, a correction term m is needed
to take into account the mobility of individual amazons. Since active amazons can overshadow bad positions of passive
amazons in the evaluation function t it seems more important to punish passive and enclosed amazons than to support
active amazons in this correction term. To compute m quickly, consider first the number N (a) of empty squares that
can be reached from a by a single move of a chess King. The numbers N (a) can be updated incrementally during the
search inside of functions doMove and undoMove. For an amazon A of player j on the square a, let
A = 2−d2 (a,b) N (b),
b
3−j
where we sum over all squares b with d1 (a, b) 1 and D1 (b) < ∞. When A = 0 we say that the amazon A is
enclosed. Examples of the values N (a), A , and of enclosed amazons are shown in Fig. 3. For example, for the white
amazon A in the upper left corner of Fig. 3 (left), we compute A = 7+6+5+3+3+(5+4 +7+4)/2 +5/4 = 35.25.
The two white amazons in the lower right corner of this figure are enclosed.
We have learned in discussions with experienced amazons players that at the beginning of a game on a board of size
10 × 10 enclosed amazons should be punished by a malus of at least 10 points. In general, we use w from the last
section to define
m= f (w, B ) − f (w, A )
amazons B amazons A
of player 2 of player 1
for a suitable function f 0. The exact choice of f is the hardest optimization problem in our evaluation function t + m,
so we restrict our description to the properties of f that did not change during our experiments: f satisfies f (0, y) = 0
In this section we compare our evaluation function t + m with t1 by using the game AMAZONG vs. 8QP played at
the 7th Computer Olympiad in Maastricht. The position after 26 moves in this game is shown in Fig. 2. AMAZONG
won the game by 8 points, mainly due to the enclosed black amazon in the upper left corner. Fig. 4 shows how t1
and the different components of t + m varied during the game. The values ti are computed using || = 0.1. The lines
corresponding to t1 , m and t + m are clearly visible. On move 13 white enclosed the black amazon which causes the
maximum of the dashed line corresponding to m. Notice that at this point the evaluation t + m predicts the outcome
of the game very well and differs from t1 by more than 18 points. After move 13 t1 and t + m become increasingly
related and finally coincide when the filling phase is reached.
As expected, the values c2 and t2 are more stable than c1 and t1 . In addition, c2 and t2 are positive in almost all
positions of the game. This indicates that the evaluation function of 8QP does not consider king move distances.
Therefore, 8QP puts up no resistance against AMAZONG maximizing these components of t + m.
5. Refinements
Below we consider positions with regions that are (almost) separated by arrows. An intriguing question is: how
much is it worth when one player has a majority of amazons inside such a region? Instead of looking for a general
236 J. Lieberum / Theoretical Computer Science 349 (2005) 230 – 244
answer to this difficult question, we simply observe that the territorial evaluation t has the tendency to underestimate the
advantage of the majority. Below we deal with five possible refinements or improvements. The first idea is that a possible
correction term of t could take into account the distances between each empty square and each amazon. However, the
j
computation of these values would take almost four times longer than the computation of Di (a). Therefore, it seems
j
more appropriate to compute only the numbers of amazons A of player j on squares b that satisfy di (a, b ) = Di (a).
j
These numbers can be computed efficiently together with Di (a). They are useful as additional inputs for refined
definitions of ci and ti . In addition to these corrections, the disadvantage of having a majority of amazons in a small
region early in the game should be reflected by m. This situation is not treated correctly by m because when amazons
of both players are inside one region the involved amazons are not considered as being enclosed.
A second refinement is as follows. In some experiments, we weighted squares in the computation of ci and ti . The
weights depended on w and the distance of the square from the centre of the board. It is difficult to assess the importance
of this second refinement.
A third idea for improvements is to repeat the constructions of Section 2 for other distance functions such as d1 + d2
or 2d1 +d2 (or estimates of these distances that can be computed more efficiently). One has to decide very carefully how
many different distance functions one should use, because each additional distance function slows down the evaluations
considerably.
A fourth refinement concerns the distribution of amazons on the board. In the opening it is desirable (especially for
black) to reach a position with exactly one amazon in each corner of the board. The distances from such a distribution
can be used to improve the evaluation function in the opening phase (see [7]).
Partially, the fourth refinement helped to fix the biggest weakness of our evaluation function, namely the under-
estimation of large territorial frameworks at the beginning of the game. In a fifth refinement, we deal with the same
problem by adding a bonus for huge areas of potential territory (see [9]).
Some positions in amazons endgames require a very deep search. Other endgame positions can be evaluated correctly
by humans without deep search but would require a 10-ply or even a 40-ply deep search by a program with the evaluation
function described so far. We will address the latter endgame problems in this article.
The simplest endgame positions arise when the board is completely decomposed, meaning that amazons of different
colours are separated by arrows. The game then consists of two puzzles: filling of white and black territory. Buro [3]
showed that the filling puzzle is NP-hard. In contrast, most positions that arise in actual games on a board of size
10 × 10 can be filled completely. Below we will describe a simple heuristic that often succeeds in filling the maximal
possible number of squares. For the cases where our heuristic fails or where complete filling is not possible, more
sophisticated methods are required. For these cases we will propose some ideas for a better heuristic solution of the
filling puzzle. Such a solution would be helpful in Sections 7 and 9. Moreover, it would help to find the right moment
to stop playing in a game of amazons without relying on human help and continuing with wrong assumptions. This
would make amazons programs and amazons game servers more user friendly.
Typically, we assume that a connected region of n empty squares containing amazons of one player p provides a
resource of n moves for the player p. If the connected region only provides a resource of m < n moves, then we call
this region defective territory and call the number n − m the defect of the region. A simple way that helps to avoid the
creation of defective territory is to add to the usual evaluation function a term s that measures the shape of the territory.
Starting to avoid the creation of defects in the filling phase might be too late. Therefore, one starts to compute s when w
(see Section 2) drops below some threshold value. In order to avoid interference of s with more important ingredients
of our evaluation function, we choose s to be very small. The definition of s is based on two heuristics. First, we assume
that it is typically simpler to fill a region with more than one amazon than with a single amazon. This observation
leads us to the following definitions: we define M1 (resp. M2 ) as the set of all squares a such that the queen distance
of player one from a is smaller (respectively, larger) than the queen distance of player two from a, or the square a is
occupied by a white (respectively, black) amazon. Second, let s1 be the number of connected components of M2 minus
the number of connected components of M1 , where we consider squares of king distance one as connected. Using the
J. Lieberum / Theoretical Computer Science 349 (2005) 230 – 244 237
The heuristic value s = s2 /10 000 alone also leads to good results in many cases. The idea behind the definition of
s2 is that high local connectivity helps to avoid defects. The following theorem gives some justification of this second
heuristic behind the definition of s.
Theorem 1. If a region R can be built up step by step by starting with a single 2 × 2-block and by adding 2 × 2-blocks
that overlap with the preceding region, then R can completely be filled by a single amazon starting on some square
of R.
Sketch of Proof. We will prove the stronger statement that the amazon can traverse each square of the region once
and then end up on her starting square by making king moves. This will imply that for any starting square the amazon
can follow a circular path and always shoot to the square where she came from, thereby filling the whole region. This
will prove the theorem. In order to simplify the proof of the stronger statement we also require that the circular path
can be drawn on the board without self-intersections.
The stronger statement is now proved by induction on the number k of 2 × 2-blocks. The case k = 1 is trivial. We
illustrate the induction step by examples: the left side of Fig. 5 shows a path that goes through two squares of a new
2 × 2-block. The directions of the lines in the picture inform us about the neighbouring squares on the outside of the
new block that the path traverses before entering and after leaving the block. The right side of Fig. 5 shows how the
path can be extended to the whole block without modifying the path on the outside.
In some cases the old path cannot simply be extended to the new 2 × 2-block. In these cases, the old path also has
to be modified on the outside of the new block. An example is shown in Fig. 6: on the left side of Fig. 6, the squares
where the path enters and leaves on the lower left side are neighbours. Therefore, it is possible to find a shortcut on the
outside as shown in the middle of Fig. 6 2 . Then the path can be extended to the new block as before.
It is an easy (but long) exercise to extend Figs. 5 and 6 to a complete list of cases.
2 Here our assumption that the path has no self-intersections is important to ensure that this is possible in all remaining cases.
238 J. Lieberum / Theoretical Computer Science 349 (2005) 230 – 244
An example of a region that can be filled completely according to Theorem 1 is shown in Fig. 7.
Regions with large defects are rare, but they exist. The following theorem implies the existence of regions with a
defect that is larger than 99.999% of the size of the region.
√ numbers kn such that the largest defect for connected regions with n squares and one amazon is
Theorem 2. Define
equal to n − kn n. Then there exist a, b 21 such that kn ∈ [a, b] for all n.
Proof. For a region containing k × k vertices in the planar grid we construct a trivalent graph Gk that has the shape of
a comb as shown in Fig. 8 and that goes through all vertices of the region. The longest path Lk inside Gk has length
3k − 3.
Then we transform the graph Gk into a region on an amazons board by following a procedure of Buro 3 : in the
planar grid we replace each vertex and its adjacent half-edges by a certain region of squares on an amazons board of
size 9 × 9. For the 9 vertices of the graph G3 we show the corresponding regions in Fig. 9 (compare Figs. 9 and 10 of
[3]). By sticking together the regions of size 9 × 9, Gk is transformed into a region of squares Rk on an amazons board
of size 9k × 9k consisting of
rk = 5k + 10(k − 2)k + 14(k − 2) + 16 = 10k 2 − k + 12
squares (k > 1). We define R1 to be a single square.
3 More precisely, our description is equivalent to Buro’s transformation with corridor length 2.
J. Lieberum / Theoretical Computer Science 349 (2005) 230 – 244 239
The regions from Fig. 9 have the property that an amazon that traverses a region has to block a distinguished central
square with the consequence that each region can be traversed at most once. Therefore, it is easy to see in our case that
the image Fk of the longest path Lk under Buro’s transformation is the maximal region of squares of Rk that can be
filled by a single amazon inside of Rk . 4 For k > 1 the path Fk consists of fk = 10 + 30(k − 2) + 16 = 30k − 34
squares.
For all n with rk n < rk+1 we simply add squares to Rk (say at the lower right corner) in a way that we obtain
regions Qn ⊃ Rk of size n such that (Qn \Rk )∪Fk is the largest region of Qn that can √
be filled by a single amazon. This
√
largest region contains at most pn = fk + rk+1 −√ rk − 1 squares. We have pn = ( n) so, in particular, pn < b n
for some constant b and for all n 1. Since n − b n is a lower bound for the largest possible defect, the number b is
an upper bound for all kn defined in the theorem.
In addition, for every square of a connected
√ region of size n on an amazons board we can find a path that starts at
the given square and contains at least 21 n squares. This implies the existence of the lower bound a for the sequence
(kn )n .
The smallest regions with defects are shown in Fig. 10. In all three cases, white has to throw an arrow to the square
in the center in his next move, so white can only fill one square.
A simple solution of the filling problem uses s from above together with some knowledge about small defective
territories. This solution is easy to implement and of high practical value. For an implementation of an endgame
database including more knowledge about defective territories than shown in Fig. 10, a data structure called line
segment graph is useful (see [11]).
For larger regions endgame databases can only contribute shortcuts near the end of the search and local properties
are not sufficient to obtain good heuristics for filling. For example, consider a connected region R = R1 ∪ R2 with
disjoint connected parts R1 and R2 . If region R2 has the property that once we enter R2 we can no longer come back
to R1 , it makes sense to fill as much of R1 as possible before going to R2 . But if we can come back from R2 to R1 a
completely different plan might be appropriate. Therefore, we believe that besides local connectivity heuristics a global
plan that guides the heuristic search is important. A ‘global plan’ might mean to specify for each square how urgent
4 Buro used long corridors to enforce this property for arbitrary trivalent graphs in the planar grid. This is not necessary in our case.
240 J. Lieberum / Theoretical Computer Science 349 (2005) 230 – 244
it is to fill it during heuristic search. One way to assign these urgency values uses the results of the search so far: one
first determines the ‘difficulty of a square’ depending on the number of times the square under consideration could
be filled during preceding searches and then determines (for example) randomly if certain regions of difficult squares
should be filled with high or with low urgency. In this context, Monte-Carlo simulations seem to be a good choice for
the heuristic search, because the expense of this method can nicely be scaled. This is helpful for trying out different
global plans and for implementing a smooth transition between minimax search and single agent search.
The method above finds a lower bound for the number of squares that can be filled. Often this lower bound is equal
to the size of the region and we are done. General methods to find good upper bounds for the filling puzzle would be
very useful. The investigation of so-called dead ends in [11] has been a first step in this direction.
7. Guards
In this section we propose an improvement of our evaluation function in positions where amazons guard certain
pieces of territory. Such amazons can already appear very early in a game, but it is almost certain that they will appear
in the endgame some moves before the filling phase starts.
Consider amazon A on the square c6 in Fig. 11. She has exclusive access to 10 points of territory in the upper left
corner. In addition, she has direct outside influence on the squares c5, c4, c3, c2, c1, d5, e4, d6, e6, f6, g6, h6, and i6.
The problem with the position in Fig. 11 is that amazon A has to give up the upper left corner if she wants to make
use of her outside influence by moving to one of these squares. Therefore, the part t1 of our evaluation function (see
Section 2) tends to overestimate the outside influence of amazon A. In addition, it would not be appropriate if one would
completely neglect the outside influence of A. This influence restricts the mobility of the white amazons on d7 and d3
in Fig. 11. The amazon on d7 threatens to move to c6 when amazon A leaves. The amazon on d3 has to pay attention
to the threat that A moves to the lower left corner and shoots to c6.
The preceding observations lead us to the following definition: we say that an amazon is a guard (or n-guard), if
she has exclusive access to n > 0 squares of territory. For an n-guard A of colour c we count the number nA of empty
squares a such that A is the only amazon of colour c in queen distance one of a and the opponent is in queen distance
one of a, too. We define gA = min(n, nA /2) and correct our evaluation function by subtracting gA for all white guards
A and by adding gA for all black guards A. For example, in Fig. 11 we have gA = 3.5 for the guard A on c6. In this
example, the seven squares c2, c3, c4, d5, e4, d6, e6 contribute to nA . We note that the square i6 does not contribute to
nA because the black amazon on h7 is also in queen distance one of i6.
We do not require that a guard cannot enlarge its territory by some move. Partially, our evaluation of guards takes
care of this problem. In addition, problems of this kind are easily resolved by search. Also notice that two amazons of
player p that have access to the same secure territory are not guards in our sense because we require exclusive access
of guards to some territory. We insist on exclusive access because a position where player p can choose which amazon
should leave a territory typically does not cause a big loss of outside influence at all.
The next section will be concerned with a treatment of zugzwang positions. The concept of the outside influence of
guards and a good solution of the filling puzzle are important in a heuristic treatment of these positions. The evaluation
of the outside influence of guards that are involved in zugzwang positions will slightly differ from this section.
J. Lieberum / Theoretical Computer Science 349 (2005) 230 – 244 241
Below we discuss the following interesting question. Who wins in the position shown in Fig. 12 when it is black’s
turn and the game is played with a komi of two points?
The answer to that question depends on subtle details of the komi rule. Let us consider two versions of giving n komi
points to black.
1. Black may pass up to n times instead of making a move.
2. The game is played until one player cannot move. White wins in the case where black cannot move and white can
make at least n more moves. Otherwise black wins.
Under most circumstances, rules 1 and 2 lead to the same outcome of a game. Positions involving zugzwang are an
exception. Using rule 1 in Fig. 12, black passes and wins. Using rule 2, black has to make the first move and loses.
242 J. Lieberum / Theoretical Computer Science 349 (2005) 230 – 244
An implementation of komi rule 2 requires a solution of the filling puzzle. Rule 1 is simpler, looks more natural,
and is easy to implement. An implementation of komi rule 1 does not require a modification of the evaluation function.
Since a komi of n points in the sense of rule 1 is equivalent to an additional black amazon inside of an isolated piece
of territory of n points, we can think of a komi in this sense as a part of the position.
The position in Fig. 12 will be evaluated as a small advantage for the player to move by our evaluation function. This
is wrong, but it is only slightly wrong. We are looking for zugzwang positions where our actual evaluation function
goes wrong completely because in these cases a larger improvement of our evaluation function is possible. In addition,
we want to restrict attention to positions that often show up in games and where a heuristic evaluation does not involve
too much search. A representative example of such a position is shown on the left side of Fig. 13. Our evaluation
function assigns an advantage of 4 points for white to this position. The reader should verify that the player to move in
the position loses. Therefore, 0 would be a better evaluation of the position.
Now we will define generic zugzwang positions by using the picture on the right side of Fig. 13. We think of R1
(resp. R2 , R3 , R4 ) as regions of empty squares, where regions R1 , R2 are adjacent to a white amazon and regions
R3 , R4 are adjacent to a black amazon. It is not necessary that the two amazons are diagonally adjacent to the regions
Ri and some regions may be empty. Denote the square with the white (respectively, black) amazon by W (respectively,
B). In difference to the picture, we require in our definition only that the squares W and B are in king distance 1. Let a
(respectively, b) be the number of moves provided by the region R1 ∪ W ∪ R2 (resp. R3 ∪ B ∪ R4 ). We may assume
that a b. Then we say that the position is a generic zugzwang position if the following conditions are satisfied:
• the white (respectively, black) amazon has exclusive access to the separate regions R1 and R2 (respectively, R3 and
R4 ),
• R1 = ∅ and R2 = ∅,
• for all best white (respectively, black) moves in the position the arrow is thrown to W (respectively, B).
We now consider the examples of zugzwang positions in Fig. 14 by going from the left side to the right side in this
figure. The first two positions are generic zugzwang positions. For the third and fourth position, the last condition from
our list does not hold true. In both cases, the black amazon can go one step downwards and throw an arrow to the lowest
square of the picture. Then the white amazon must not move to the square B because by throwing an arrow she would
block her way back. Therefore, the third and fourth position in Fig. 14 are not generic zugzwang positions.
Consider a generic zugzwang position with a b as above. Define ci (i = 1, 2) as the maximal number of moves
provided by the region Ri with a single amazon on some square of Ri in queen distance 1 of W. Then a good evaluation
of the generic zugzwang position is given by
v = max{c1 + 1 − b, c2 + 1 − b, 0}.
For example, we have v = 2 for the first position in Fig. 14 and v = 1 for the second position in this figure. The
evaluation v coincides with the value of the position in combinatorial game theory [4,2]. When it is important to win by
a certain margin, or equivalently, when komi rule 2 of the previous section is used, then the position is more complicated
and an analysis of sums of generic zugzwang positions is equivalent to the knapsack problem (see [11]).
It is interesting to see how good the evaluation v is in the case of the two positions in Fig. 14 that are not generic
zugzwang positions: for the third position in this figure, v = 1 gives the correct value of combinatorial game theory,
but this position requires a treatment different from the generic zugzwang positions when komi rule 2 is used. In the
fourth position, we have v = 0, but −1 is the correct value.
In practice, we are already satisfied when the evaluation v is typically better than an evaluation that does not consider
zugzwang. This allows to loosen some conditions or to replace them by heuristics. For example, we may assume that
no defective territory is involved. In other words, let ai be the number of squares in Ri . Then we assume a = a1 + a2 ,
b = a3 + a4 , c1 = a1 − 1, and c2 = a2 − 1. We did not successfully use weaker versions of the conditions for R1 and
R2 in the definition of a generic zugzwang position. We had good experience with including positions where amazons
have outside influence. For example, if the white amazon has outside influence, then we regard the white amazon as
a max(a1 , a2 )-guard and evaluate the outside influence according to Section 7. The black amazon may be inside of
some large region together with other black amazons in this example. We treat this case as R3 = ∅ = R4 , make no
adjustments to the evaluation of the influence of the black amazon, but subtract the value min(a1 , a2 ) from our usual
evaluation because it is likely that the situation ends up in zugzwang.
10. Conclusion
We have described all components of the evaluation function of our program AMAZONG. An important idea in our
evaluation function is the smooth transition between many different strategic goals at the beginning of the game and
a purely territorial evaluation in the filling phase. The most difficult problems are to find (1) a good balance between
mobility and territory and (2) a realistic evaluation of the advantage of local majorities of amazons. In the endgame,
a variety of new problems appear that require special treatment. Partially, these problems can be solved by endgame
databases. For the remaining positions that involve zugzwang or difficult filling puzzles, global minimax search is
not very helpful, but these problems often can be solved by some restricted usage of search inside of the evaluation
function.
References
[1] E.R. Berlekamp, Sums of 2 × N amazons, in: F.T. Bruss, L. le Cam (Eds.), Game Theory, Optimal Stopping, Probability and Statistics: Papers
in honor of Thomas S. Ferguson, Institute of Mathematical Statistics, Lecture Notes—Monograph Series, Vol. 35, 2000, pp. 1–34.
[2] E.R. Berlekamp, J. Conway, R. Guy, Winning Ways, Academic Press, London, 1982.
[3] M. Buro, Simple amazons endgames and their connection to Hamilton circuits in cubic subgrid graphs, in: T. Marshland, I. Frank (Eds.). Proc.
Second Internat. Conf. Computers and Games, CG00, Vol. 2063, Lecture Notes in Computer Science, 2001, pp. 250–261.
[4] J. Conway, On Numbers and Games, Academic Press, New York, 1976.
[5] T. Hashimoto, Y. Kajihara, N. Sasaki, H. Iida, J. Yoshimura, An evaluation function for amazons, in: H.J. van den Herik, B. Monien (Eds.),
Advances in Computer Games, Vol. 9, Universiteit Maastricht, The Netherlands.
244 J. Lieberum / Theoretical Computer Science 349 (2005) 230 – 244