An Analysis of Alpha-Beta Pruning
Recommended by U. Montanari
ABSTRACT
The alpha-beta technique for searching game trees is analyzed, in an attempt to provide some
insight into its behavior. The first portion of this paper is an expository presentation of the
method together with a proof of its correctness and a historical discussion. The alpha-beta
procedure is shown to be optimal in a certain sense, and bounds are obtained for its running
time with various kinds of random data.
Computer programs for playing games like chess typically choose their
moves by searching a large tree of potential continuations. A technique
called "alpha-beta pruning" is generally used to speed up such search
processes without loss of information. The purpose of this paper is to
analyze the alpha-beta procedure in order to obtain some quantitative
estimates of its performance characteristics.
1 This research was supported in part by the National Science Foundation under grant
number GJ 36473X and by the Office of Naval Research under contract NR 044-402.
Artificial Intelligence 6 (1975), 293-326
Copyright 1975 by North-Holland Publishing Company
Section 1 defines the basic concepts associated with game trees. Section 2
presents the alpha-beta method together with a related technique which is
similar, but not as powerful, because it fails to make "deep cutoffs". The
correctness of both methods is demonstrated, and Section 3 gives examples
and further development of the algorithms. Several suggestions for applying
the method in practice appear in Section 4, and the history of alpha-beta
pruning is discussed in Section 5.
Section 6 begins the quantitative analysis, by deriving lower bounds on
the amount of searching needed by alpha-beta and by any algorithm which
solves the same general problem. Section 7 derives upper bounds, primarily by
considering the case of random trees when no deep cutoffs are made. It is
shown that the procedure is reasonably efficient even under these weak
assumptions. Section 8 shows how to introduce some of the deep cutoffs into
the analysis; and Section 9 shows that the efficiency improves when there are
dependencies between successive moves. This paper is essentially self-contained, except for a few mathematical results quoted in the later sections.
1. Games and Position Values
The two-person games we are dealing with can be characterized by a set of
"positions", and by a set of rules for moving from one position to another,
the players moving alternately. We assume that no infinite sequence of
positions is allowed by the rules, 2 and that there are only finitely many legal
moves from every position. It follows from the "infinity lemma" (see [11,
Section 2.3.4.3]) that for every position p there is a number N(p) such that no
game starting at p lasts longer than N(p) moves.
If p is a position from which there are no legal moves, there is an integer-valued function f(p) which represents the value of this position to the player
whose turn it is to play from p; the value to the other player is assumed to be
$-f(p)$.
If p is a position from which there are d legal moves $p_1, \ldots, p_d$, where
d > 1, the problem is to choose the "best" move. We assume that the best
move is one which achieves the greatest possible value when the game ends,
if the opponent also chooses moves which are best for him. Let F(p) be the
greatest possible value achievable from position p against the optimal
defensive strategy, from the standpoint of the player who is moving from that
2 Strictly speaking, chess does not satisfy this condition, since its rules for repeated
positions only give the players the option to request a draw in certain circumstances;
if neither player actually does ask for a draw, the game can go on forever. But this technicality is of no practical importance, since computer chess programs only look finitely many
moves ahead. It is possible to deal with infinite games by assigning appropriate values to
repeated positions, but such questions are beyond the scope of this paper.
position. Since the value (to this player) after moving to position $p_i$ will be
$-F(p_i)$, we have
$$F(p) = \begin{cases} f(p) & \text{if } d = 0, \\ \max(-F(p_1), \ldots, -F(p_d)) & \text{if } d > 0. \end{cases} \tag{1}$$
This formula serves to define F(p) for all positions p, by induction on the
length of the longest game playable from p.
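Definition (1) translates almost directly into code. The following is a minimal Python sketch, assuming (purely for illustration, this representation is not from the paper) that a position is either an `int`, standing for a terminal position with that value f(p), or a non-empty `list` of successor positions.

```python
def F(p):
    """Value of position p to the player about to move, as in equation (1).

    Assumed representation (hypothetical): an int is a terminal position
    whose value is f(p); a non-empty list holds the successors p_1..p_d.
    """
    if isinstance(p, int):          # d = 0: terminal position
        return p
    return max(-F(q) for q in p)    # d > 0: best of the negated successor values


# A small hand-made tree: two moves at the root, the second one
# leading to a position whose second move is itself non-terminal.
example = [[3, 1], [4, [1, 5]]]
print(F(example))
```

The recursion terminates for the same reason the induction in the text works: every successor has a strictly shorter longest game.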
In most discussions of game-playing, a slightly different formalism is used;
the two players are named Max and Min, where all values are given from
Max's viewpoint. Thus, if p is a terminal position with Max to move, its
value is f(p) as before, but if p is a terminal position with Min to move its
value is
$$g(p) = -f(p). \tag{2}$$
Max will try to maximize the final value, and Min will try to minimize it.
There are now two functions corresponding to (1), namely
$$F(p) = \begin{cases} f(p) & \text{if } d = 0, \\ \max(G(p_1), \ldots, G(p_d)) & \text{if } d > 0, \end{cases} \tag{3}$$
which is the best value Max can guarantee starting at position p, and
$$G(p) = \begin{cases} g(p) & \text{if } d = 0, \\ \min(F(p_1), \ldots, F(p_d)) & \text{if } d > 0, \end{cases} \tag{4}$$
which is the best that Min can be sure of achieving. As before, we assume
that $p_1, \ldots, p_d$ are the legal moves from position p. It is easy to prove by
induction that the two definitions of F in (1) and (3) are identical, and that
$$G(p) = -F(p) \tag{5}$$
rather subtle winning move. We may be better off gambling on the move to
P2, which is our only chance to win, unless we are convinced of our opponent's
competence. Indeed, humans seem to beat chess-playing programs by adopting
such a strategy.
2. Development of the Algorithm
:= t;
end;
F := m;
end;
end.
Here $\infty$ denotes a value that is greater than or equal to $|f(p)|$ for all terminal
positions of the game, hence $-\infty$ is less than or equal to $\pm F(p)$ for all p.
This algorithm is a "brute force" search through all possible continuations;
the infinity lemma assures us that the algorithm will terminate in finitely
many steps.
It is possible to improve on the brute-force search by using a "branch-and-bound" technique [14], ignoring moves which are incapable of being better
than moves which are already known. For example, if $F(p_1) = -10$, then
$F(p) \ge 10$, and we don't have to know the exact value of $F(p_2)$ if we can
deduce that $F(p_2) \ge -10$ (i.e., that $-F(p_2) \le 10$). Thus if $p_{21}$ is a legal
move from $p_2$ such that $F(p_{21}) \le 10$, we need not bother to explore any
other moves from $p_2$. In game-playing terminology, a move to $p_2$ can be
"refuted" (relative to the alternative move $p_1$) if the opposing player can make
a reply to $p_2$ that is at least as good as his best reply to $p_1$. Once a move has
been refuted, we need not search for the best possible refutation.
This line of reasoning leads to a computational technique that avoids much
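of the computation done by F. We shall define F1 as a procedure on two
parameters p and bound, and our goal is to achieve the following conditions:
$$\begin{aligned} \mathit{F1}(p, \mathit{bound}) &= F(p) && \text{if } F(p) < \mathit{bound}, \\ \mathit{F1}(p, \mathit{bound}) &\ge \mathit{bound} && \text{if } F(p) \ge \mathit{bound}. \end{aligned} \tag{6}$$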
These relations do not fully define F1, but they are sufficiently powerful to
calculate F(p) for any starting position p because they imply that
$$\mathit{F1}(p, \infty) = F(p). \tag{7}$$
The following algorithm corresponds to this branch-and-bound idea.
integer procedure F1 (position p, integer bound):
  begin integer m, i, t, d;
  determine the successor positions p_1, ..., p_d;
  if d = 0 then F1 := f(p) else
    begin m := -∞;
    for i := 1 step 1 until d do
      begin t := -F1(p_i, -m);
      if t > m then m := t;
      if m ≥ bound then go to done;
      end;
    done: F1 := m;
    end;
  end.
We can prove that this procedure satisfies (6) by arguing as follows: At the
beginning of the ith iteration of the for loop, we have the "invariant"
condition
$$m = \max(-F(p_1), \ldots, -F(p_{i-1})) \tag{8}$$
just as in procedure F. (The max operation over an empty set is conventionally
defined to be $-\infty$.) For if $-F(p_i)$ is $> m$, then $\mathit{F1}(p_i, -m) = F(p_i)$, by
condition (6) and induction on the length of the game following p; therefore
(8) will hold on the next iteration. And if $\max(-F(p_1), \ldots, -F(p_i)) \ge \mathit{bound}$
for any i, then $F(p) \ge \mathit{bound}$. It follows that condition (6) holds for all p.
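The one-sided branch-and-bound idea can be sketched in Python as follows. This is an illustrative transcription under stated assumptions, not the authors' program: a position is taken to be an `int` (terminal value f(p)) or a non-empty `list` of successors, and `math.inf` plays the role of ∞. The brute-force `F` is included only so that condition (6) can be checked against it.

```python
import math


def F(p):
    """Brute-force value of p, used here only as a reference."""
    return p if isinstance(p, int) else max(-F(q) for q in p)


def F1(p, bound):
    """One-sided search: returns F(p) if F(p) < bound, else some value >= bound."""
    if isinstance(p, int):              # terminal position
        return p
    m = -math.inf
    for q in p:
        t = -F1(q, -m)                  # the successor only matters if it beats m
        if t > m:
            m = t
        if m >= bound:                  # p is already refuted; stop searching
            break
    return m


def satisfies_condition_6(p, bound):
    """Check the two relations of condition (6) for one position and bound."""
    exact, got = F(p), F1(p, bound)
    return got == exact if exact < bound else got >= bound
```

Calling `F1(p, math.inf)` disables the cutoff at the root and therefore returns `F(p)`, which is relation (7).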
The procedure can be improved further if we introduce both lower and
upper bounds; this idea, which is called alpha-beta pruning, is a significant
extension to the one-sided branch-and-bound method. (Unfortunately it
doesn't apply to all branch-and-bound algorithms; it works only when a
game tree is being explored.) We define a procedure F2 of three parameters p,
alpha, and beta, for alpha < beta, satisfying the following conditions
analogous to (6):
$$\begin{aligned} \mathit{F2}(p, \mathit{alpha}, \mathit{beta}) &\le \mathit{alpha} && \text{if } F(p) \le \mathit{alpha}, \\ \mathit{F2}(p, \mathit{alpha}, \mathit{beta}) &= F(p) && \text{if } \mathit{alpha} < F(p) < \mathit{beta}, \\ \mathit{F2}(p, \mathit{alpha}, \mathit{beta}) &\ge \mathit{beta} && \text{if } F(p) \ge \mathit{beta}. \end{aligned} \tag{9}$$
Again, these conditions do not fully specify F2, but they imply that
$$\mathit{F2}(p, -\infty, \infty) = F(p). \tag{10}$$
It turns out that this improved algorithm looks only a little different from the
others, when it is expressed in a programming language:
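integer procedure F2 (position p, integer alpha, integer beta):
  begin integer m, i, t, d;
  determine the successor positions p_1, ..., p_d;
  if d = 0 then F2 := f(p) else
    begin m := alpha;
    for i := 1 step 1 until d do
      begin t := -F2(p_i, -beta, -m);
      if t > m then m := t;
      if m ≥ beta then go to done;
      end;
    done: F2 := m;
    end;
  end.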
To prove the validity of F2, we proceed as we did with F1. The invariant
relation analogous to (8) is now
$$m = \max(\mathit{alpha}, -F(p_1), \ldots, -F(p_{i-1})) \tag{11}$$
and $m < \mathit{beta}$. If $-F(p_i) \ge \mathit{beta}$, then $-\mathit{F2}(p_i, -\mathit{beta}, -m)$ will also be
$\ge \mathit{beta}$, and if $m < -F(p_i) < \mathit{beta}$, then $-\mathit{F2}(p_i, -\mathit{beta}, -m) = -F(p_i)$; so
the proof goes through as before, establishing (9) by induction.
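A Python sketch of the two-sided procedure may make the three cases of condition (9) easier to experiment with. As before, the tree representation (an `int` for a terminal value, a non-empty `list` for successors) and the helper names are assumptions made for this illustration only.

```python
import math


def F(p):
    """Brute-force reference value of p."""
    return p if isinstance(p, int) else max(-F(q) for q in p)


def F2(p, alpha, beta):
    """Alpha-beta search of position p with the window (alpha, beta)."""
    if isinstance(p, int):              # terminal position
        return p
    m = alpha
    for q in p:
        t = -F2(q, -beta, -m)           # the window is negated and swapped for the opponent
        if t > m:
            m = t
        if m >= beta:                   # cutoff: the value is already too good to matter
            break
    return m


def satisfies_condition_9(p, alpha, beta):
    """Check the three relations of condition (9) for one call."""
    exact, got = F(p), F2(p, alpha, beta)
    if exact <= alpha:
        return got <= alpha
    if exact >= beta:
        return got >= beta
    return got == exact
```

With the full window, `F2(p, -math.inf, math.inf)` falls into the middle case of (9) and so equals `F(p)`, which is relation (10).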
Now that we have found two improvements of the minimax procedure,
it is natural to ask whether still further improvement is possible. Is there an
"alpha-beta-gamma" procedure F3, which makes use, say, of the second-largest value found so far, or some other gimmick? Section 6 below shows
that the answer is no, or at least that there is a reasonable sense in which
procedure F2 is optimum.
3. Examples and Refinements
As an example of these procedures, consider the tree in Fig. 1, which represents a position that has three successors, each of which has three successors,
etc., until we get to $3^4 = 81$ positions possible after four moves; and these
81 positions have been assigned "random" f values according to the first 81
digits of $\pi$. Fig. 1 shows the F values computed from the f's; thus, the root
node at the top of the tree has an effective value of 2 after best play by both
sides.
Fig. 2 shows the same situation as it is evaluated by procedure $\mathit{F1}(p, \infty)$.
Note that only 36 of the 81 terminal positions are examined, and that one
of the nodes at level 2 now has the "approximate" value 3 instead of its true
value 7; but this approximation does not of course affect the value at the top.
Fig. 3 shows the same situation as it is evaluated by the full alpha-beta
pruning procedure. $\mathit{F2}(p, -\infty, +\infty)$ will always examine the same nodes as
$\mathit{F1}(p, \infty)$ until the fourth level of lookahead is reached, in any game tree;
[Fig. 1: game tree diagram not recoverable.]
FIG. 2. The game tree of Fig. 1 evaluated with procedure F1 (branch-and-bound strategy).
FIG. 3. The game tree of Fig. 1 evaluated with procedure F2 (alpha-beta strategy).
  if q = Λ then F2 := f(p) else
    begin m := alpha;
    while q ≠ Λ and m < beta do
      begin t := -F2(q, -beta, -m);
      if t > m then m := t;
      q := next(q);
      end;
    F2 := m;
    end;
  end.
It is interesting to convert this recursive procedure to an iterative (nonrecursive) form by a sequence of mechanical transformations, and to apply
simple optimizations which preserve program correctness (see [13]). The
resulting procedure is surprisingly simple, but not as easy to prove correct as
the recursive form:
integer procedure alphabeta (ref (position) p);
  begin integer l; comment level of recursion;
  integer array a[-2:L]; comment stack for recursion, where
    a[l - 2], a[l - 1], a[l], a[l + 1] denote respectively
    alpha, -beta, m, -t in procedure F2;
  ref (position) array r[0:L + 1]; comment another stack for
    recursion, where r[l] and r[l + 1] denote respectively
    p and q in F2;
  l := 0; a[-2] := a[-1] := -∞; r[0] := p;
F2: generate (r[l]);
  r[l + 1] := first(r[l]);
  if r[l + 1] = Λ then a[l] := f(r[l]) else
    begin a[l] := a[l - 2];
loop: l := l + 1; go to F2;
resume: if -a[l + 1] > a[l] then
      begin a[l] := -a[l + 1];
      if a[l + 1] ≤ a[l - 1] then go to done;
      end;
    r[l + 1] := next(r[l + 1]);
    if r[l + 1] ≠ Λ then go to loop;
    end;
done: l := l - 1; if l ≥ 0 then go to resume;
  alphabeta := a[0];
  end.
This procedure alphabeta(p) will compute the same value as F2(p, - oo, + oo);
we must choose L large enough so that the level of recursion never exceeds L.
4. Applications
When a computer is playing a complex game, it will rarely be able to search all
possibilities until truly terminal positions are reached; even the alpha-beta
technique won't be fast enough to solve the game of chess! But we can still
use the above procedures, if the routine that generates all moves is modified
so that sufficiently deep positions are considered to be terminal. For example,
if we wish to look six moves ahead (three for each player), we can pretend
that the positions reached at level 6 have no successors. To compute f at
such artificially-terminal positions, we must of course use our best guess
about the value, hoping that a sufficiently deep search will ameliorate the
inaccuracy of our guess. (Most of the time will be spent in evaluating these
guessed values for f , unless the determination of legal moves is especially
difficult, so some quickly-computed estimate is needed.)
Instead of searching to a fixed depth, it is also possible to carry some lines
further, e.g., to play out all sequences of captures. An interesting approach
was suggested by Floyd in 1965 [6], although it has apparently not yet been
tried in large-scale experiments. Each move in Floyd's scheme is assigned a
"likelihood" according to the following general plan: A forced move has
"likelihood" of 1, while very implausible moves (like queen sacrifices in
chess) get 0.01 or so. In chess a "recapture" has "likelihood" greater than 1/2;
and the best strategic choice out of 20 or 30 possibilities gets a "likelihood"
of about 0.1, while the worst choices get say 0.02. When the product of all
"likelihoods" leading to a position becomes less than a given threshold
(say $10^{-8}$), we consider that position to be terminal and estimate its value
without further searching. Under this scheme, the "most likely" branches of
the tree are given the most attention.
Whatever method is used to produce a tree of reasonable size, the alpha-beta procedure can be somewhat improved if we have an idea what the value
of the initial position will be. Instead of calling $\mathit{F2}(p, -\infty, +\infty)$, we can
try F2(p, a, b) where we expect the value to be greater than a and less than b.
For example, if F2(p, 0, 4) is used instead of F2(p, -10, +10) in Fig. 3, the
rightmost "-4" on level 3, and the "4" below it, do not need to be considered. If our expectation is fulfilled, we may have pruned off more of the
tree; on the other hand if the value turns out to be low, say F2(p, a, b) = v,
where $v \le a$, we can use $\mathit{F2}(p, -\infty, v)$ to deduce the correct value. This idea
has been used in some versions of Greenblatt's chess program [8].
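The narrowed-window idea can be sketched as a thin wrapper around F2. The sketch below is only an illustration under stated assumptions: positions are `int` terminal values or non-empty `list`s of successors, the name `search_with_window` is hypothetical, and the re-search for a value that turns out to be high is a symmetric extension that the text above does not spell out.

```python
import math


def F2(p, alpha, beta):
    """Alpha-beta search with window (alpha, beta), written as in procedure F2."""
    if isinstance(p, int):
        return p
    m = alpha
    for q in p:
        t = -F2(q, -beta, -m)
        if t > m:
            m = t
        if m >= beta:
            break
    return m


def search_with_window(p, a, b):
    """Search with the guessed window (a, b); re-search only if the guess fails."""
    v = F2(p, a, b)
    if v <= a:                          # value is low: the case described in the text
        return F2(p, -math.inf, v)
    if v >= b:                          # value is high: symmetric case (an assumption)
        return F2(p, v, math.inf)
    return v                            # a < v < b, so v is exact by condition (9)
```

When the guess is right, only the first call is made, and its narrower window can only produce more cutoffs than the full window would.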
5. History
some people have confused procedure F1 with the stronger procedure F2;
therefore the following account is based on the best information now available to the authors.
McCarthy [15] thought of the method during the Dartmouth Summer
Research Conference on Artificial Intelligence in 1956, when Bernstein
described an early chess program [3] which didn't use any sort of alpha-beta.
McCarthy "criticized it on the spot for this [reason], but Bernstein was not
convinced. No formal specification of the algorithm was given at that time."
It is plausible that McCarthy's remarks at that conference led to the use of
alpha-beta pruning in game-playing programs of the late 1950s. Samuel has
stated that the idea was present in his checker-playing programs, but he did
not allude to it in his classic article [21] because he felt that the other aspects
of his program were more significant.
The first published discussion of a method for game tree pruning appeared
in Newell, Shaw and Simon's description [16] of their early chess program.
However, they illustrate only the "one-sided" technique used in procedure
F1 above, so it is not clear whether they made use of "deep cutoffs".
McCarthy coined the name "alpha-beta" when he first wrote a LISP
program embodying the technique. His original approach was somewhat
more elaborate than the method described above, since he assumed the
existence of two functions "optimistic value(p)" and "pessimistic value(p)"
which were to be upper and lower bounds on the value of a position.
McCarthy's form of alpha-beta searching was equivalent to replacing the
above body of procedure F2 by
  if optimistic value(p) ≤ alpha then F2 := alpha
  else if pessimistic value(p) ≥ beta then F2 := beta
  else begin ⟨the above body of procedure F2⟩ end.
Because of this elaboration, he thought of alpha-beta as a (possibly inaccurate) heuristic device, not realizing that it would also produce the same
value as full minimaxing in the special case that optimistic value(p) = + oo
and pessimistic value(p) = - c o for all p. He credits the latter discovery to
Hart and Edwards, who wrote a memorandum [10] on the subject in 1961.
Their unpublished memorandum gives examples of the general method,
including deep cutoffs; but (as usual in 1961) no attempt was made to
indicate why the method worked, much less to demonstrate its validity.
The first published account of alpha-beta pruning actually appeared in
Russia, quite independently of the American work. Brudno, who was one of the
developers of an early Russian chess-playing program, described an algorithm
identical to alpha-beta pruning, together with a rather complicated proof, in
1963 (see [4]).
The full alpha-beta pruning technique finally appeared in "Western"
one of the authors of the present paper (D.E.K.) did some of the research
This corollary was first derived by Levin in 1961, but no proof was
apparently ever written down at the time. In fact, the informal memo [10]
by Hart and Edwards justifies the result by saying: "For a convincing
personal proof using the new heuristic hand waving technique, see the
author of this theorem." A proof was later published by Slagle and Dixon
[25]. However, none of these authors pointed out that the value of the root
position must not equal $\pm\infty$. Although this is a rare occurrence in nontrivial
games, since it means that the root position is a forced win or loss, it is a
necessary hypothesis for both the theorem and the corollary, since the
number of positions examined on level l will be $d^{\lfloor l/2 \rfloor}$ when the root value is
$+\infty$, and it will be $d^{\lceil l/2 \rceil}$ when the root value is $-\infty$. Roughly speaking,
we gain a factor of 2 when the root value is $\pm\infty$.
The characterization of perfect alpha-beta pruning in terms of critical
positions allows us to extend Corollary 1 to a much more general class of
game trees, having any desired probability distribution of legal moves on
each level
COROLLARY 2. Let a random game tree be generated in such a way that each
= dj. The number of successor positions for p is chosen at random from this
distribution, independently of the number of successors of other positions on
level j.)
Thus the truly optimum order of game tree traversal isn't obvious. On the
other hand it is possible to show that there always exists an order for processing the tree so that alpha-beta examines as few of the terminal positions
as possible; no algorithm can do better. This can be demonstrated by
strengthening the technique used to prove Theorem 1, as we shall see.
(which are not critical positions) are not examined by the alpha-beta method,
nor are their descendants.
(3) A type 3 position p has $F_u(p) < \infty$, and it is examined during the
alpha-beta procedure by calling $\mathit{F2}(p, \mathit{alpha}, +\infty)$, where $F_u(p) \le \mathit{alpha} < \infty$.
If p is terminal, it must be examined by the given algorithm. Otherwise all its
successor positions $p_i$ are of type 2, and they are all examined by calling
$\mathit{F2}(p_i, -\infty, -\mathit{alpha})$. (There is no need to permute them; the ordering makes
absolutely no difference here.)
A similar argument can be given when the root value is $+\infty$ (treating it as
a type 2 position) or $-\infty$ (type 3).
A surprising corollary of this proof is that the ordering of successors to
type 3 positions in an optimally-ordered tree has absolutely no effect on the
behavior of alpha-beta pruning. Type 1 positions constitute the so-called
"principal variation", corresponding to the best strategy by both players.
The alternative responses to moves on the principal variation are of type 2.
Type 3 positions occur when the best move is made from a type 2 position,
and the successors of type 3 positions are again of type 2. Hence about half
of the critical positions of a perfectly ordered game tree are of type 3, and
current game-playing algorithms are probably wasting nearly half of the
time they now spend trying to put successor moves in order.
Let us say that a game tree is uniform of degree d and height h if every
position on levels 0, 1, ..., h - 1 has exactly d successors, and if every
position on level h is terminal. For example, Fig. 1 is a uniform tree of
height 4 and degree 3, but the trees of Fig. 4 are not uniform. Since all
permutations of a uniform tree are uniform, Theorem 2 implies the following
generalization of Corollary 1.
COROLLARY 3. Any algorithm which evaluates a uniform game tree of height
h and degree d must evaluate at least
$$d^{\lceil h/2 \rceil} + d^{\lfloor h/2 \rfloor} - 1 \tag{17}$$
terminal positions. The alpha-beta procedure achieves this lower bound, if the
best move is considered first at each position of types 1 and 2.
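The bound (17) can be checked empirically on small trees. The Python sketch below builds a uniform tree with distinct terminal values, reorders every position so that its best successor comes first, and counts the terminal positions that alpha-beta examines. The tree representation (nested lists with `int` leaves) and all function names are assumptions made for this illustration; sorting all successors is simply a convenient way to put the best one first.

```python
import math
import random


def uniform_tree(d, h, values):
    """Build a uniform tree of degree d and height h from an iterator of leaf values."""
    if h == 0:
        return next(values)
    return [uniform_tree(d, h - 1, values) for _ in range(d)]


def negamax(p):
    return p if isinstance(p, int) else max(-negamax(q) for q in p)


def best_first(p):
    """Reorder successors so the one with the smallest F (best for the mover) is first."""
    if isinstance(p, int):
        return p
    return sorted((best_first(q) for q in p), key=negamax)


def leaves_examined(p, alpha=-math.inf, beta=math.inf):
    """Run alpha-beta on p and return (value, number of terminal positions examined)."""
    if isinstance(p, int):
        return p, 1
    m, count = alpha, 0
    for q in p:
        t, c = leaves_examined(q, -beta, -m)
        count += c
        m = max(m, -t)
        if m >= beta:
            break
    return m, count


def lower_bound(d, h):
    """The quantity in (17)."""
    return d ** math.ceil(h / 2) + d ** (h // 2) - 1


rng = random.Random(0)
leaf_values = iter(rng.sample(range(10**6), 3**4))      # distinct terminal values
tree = best_first(uniform_tree(3, 4, leaf_values))
print(leaves_examined(tree)[1], lower_bound(3, 4))
```

For degree 3 and height 4 the bound is 9 + 9 - 1 = 17, out of 81 terminal positions.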
7. Uniform Trees Without Deep Cutoffs
Now that we have determined the best case of alpha-beta pruning, let's be
more pessimistic and try to look at the worst that can happen. Given any
finite tree, it is possible to find a sequence of values for the terminal positions
so that the alpha-beta procedure will examine every node of the tree, without
making any cutoffs unless the tree branches are permuted. (To see this,
arrange the values so that whenever F2(p, alpha, beta) is called, the condition
$-\mathit{alpha} > F(p_1) > F(p_2) > \cdots > F(p_d) > -\mathit{beta}$ is satisfied.) On the other
D. E. KNUTH AND R. W. MOORE
hand, there are game trees with distinct terminal values for which the alpha-beta procedure will always find some cutoffs no matter how the branches
are permuted, as shown in Fig. 5. (Procedure F1 does not enjoy this property.)
FIG. 5. If $\max(a_1, \ldots, a_8) < \min(b_1, \ldots, b_8)$, the alpha-beta procedure will always find at
least two cutoffs, no matter how we permute the branches of this game tree.
[Fig. 6 (diagram not recoverable): branch probabilities 1, 3/4, 3/5 and 1, 9/14, 9/20; terminal labels $y_{22}, y_{23}, y_{31}, y_{32}, y_{33}$.]
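$$\begin{aligned} A_0 &= B_0 = C_0 = 1, \\ A_{n+1} &= A_n + B_n + C_n, \\ B_{n+1} &= A_n + \tfrac{3}{4} B_n + \tfrac{3}{5} C_n, \\ C_{n+1} &= A_n + \tfrac{9}{14} B_n + \tfrac{9}{20} C_n. \end{aligned} \tag{18}$$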
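$$A(z) = \sum_{n \ge 0} A_n z^n, \qquad B(z) = \sum_{n \ge 0} B_n z^n, \qquad C(z) = \sum_{n \ge 0} C_n z^n,$$
to
$$\begin{aligned} A(z) - 1 &= zA(z) + zB(z) + zC(z), \\ B(z) - 1 &= zA(z) + \tfrac{3}{4} zB(z) + \tfrac{3}{5} zC(z), \\ C(z) - 1 &= zA(z) + \tfrac{9}{14} zB(z) + \tfrac{9}{20} zC(z). \end{aligned} \tag{19}$$
$A(z) = U(z)/V(z)$, where
$$U(z) = \det\begin{pmatrix} -1 & z & z \\ -1 & \tfrac{3}{4}z - 1 & \tfrac{3}{5}z \\ -1 & \tfrac{9}{14}z & \tfrac{9}{20}z - 1 \end{pmatrix}, \qquad V(z) = \det\begin{pmatrix} z - 1 & z & z \\ z & \tfrac{3}{4}z - 1 & \tfrac{3}{5}z \\ z & \tfrac{9}{14}z & \tfrac{9}{20}z - 1 \end{pmatrix} \tag{20}$$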
are polynomials in z. If the equation $z^3 V(1/z) = 0$ has distinct roots $r_1, r_2, r_3$,
there will be a partial fraction expansion of the form
$$A(z) = \frac{c_1}{1 - r_1 z} + \frac{c_2}{1 - r_2 z} + \frac{c_3}{1 - r_3 z} \tag{21}$$
where
$$c_i = -r_i\, U(1/r_i) / V'(1/r_i). \tag{22}$$
Consequently $A(z) = \sum_{n \ge 0} \bigl(c_1 (r_1 z)^n + c_2 (r_2 z)^n + c_3 (r_3 z)^n\bigr)$, and we have
$$A_n = c_1 r_1^n + c_2 r_2^n + c_3 r_3^n$$
by equating coefficients of $z^n$. If we number the roots so that $|r_1| > |r_2| \ge |r_3|$
(and the theorem of Perron [17] assures us that this can be done), we have
asymptotically
$$A_n \sim c_1 r_1^n. \tag{23}$$
Numerical calculation gives $r_1 = 2.533911$, $c_1 = 1.162125$; thus, the alpha-beta procedure without deep cutoffs in a random ternary tree will examine
about as many nodes as in a tree of the same height with average degree
2.534 instead of 3. (It is worthwhile to note that (23) predicts about 48
positions to be examined on the fourth level, while only 35 occurred in
Fig. 2; the reason for this discrepancy is chiefly that the one-digit values in
Fig. 2 are nonrandom because of frequent equalities.)
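These numbers can be reproduced by iterating the recurrence directly. The Python sketch below assumes that recurrence (18) has the coefficient rows (1, 1, 1), (1, 3/4, 3/5) and (1, 9/14, 9/20), with $A_0 = B_0 = C_0 = 1$; the function name `level_counts` is made up for this illustration.

```python
from fractions import Fraction as Fr

# Coefficient rows assumed for recurrence (18): A, B and C in terms of A, B, C.
COEFFS = [
    [Fr(1), Fr(1), Fr(1)],
    [Fr(1), Fr(3, 4), Fr(3, 5)],
    [Fr(1), Fr(9, 14), Fr(9, 20)],
]


def level_counts(n):
    """Return [A_0, A_1, ..., A_n], computed exactly with rational arithmetic."""
    state = [Fr(1), Fr(1), Fr(1)]               # A_0 = B_0 = C_0 = 1
    a_values = [state[0]]
    for _ in range(n):
        state = [sum(c * x for c, x in zip(row, state)) for row in COEFFS]
        a_values.append(state[0])
    return a_values


A = level_counts(30)
print(float(A[4]))              # the level-4 prediction quoted in the text
print(float(A[30] / A[29]))     # successive ratios approach the growth rate r_1
```

The ratio of successive terms settles quickly because the two remaining roots are much smaller in absolute value than $r_1$.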
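Elementary manipulation of determinants shows that the equation $z^3 V(1/z)$
= 0 is the same as
$$\det\begin{pmatrix} 1 - z & 1 & 1 \\ 1 & \tfrac{3}{4} - z & \tfrac{3}{5} \\ 1 & \tfrac{9}{14} & \tfrac{9}{20} - z \end{pmatrix} = 0;$$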
We might have deduced this directly from eq. (18), if we had known enough
matrix theory to calculate the constant $c_1$ by matrix-theoretic means instead
of function-theoretic means.
This solves the case d = 3. For general d we find similarly that the expected
number of terminal positions examined by the alpha-beta procedure without
deep cutoffs, in a random uniform game tree of degree d and height h, is
asymptotically
$$T(d, h) \sim c_0(d)\, r_0(d)^h \tag{24}$$
for fixed d as $h \to \infty$, where $r_0(d)$ is the largest eigenvalue of a certain $d \times d$
matrix
$$M_d = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1d} \\ p_{21} & p_{22} & \cdots & p_{2d} \\ \vdots & \vdots & & \vdots \\ p_{d1} & p_{d2} & \cdots & p_{dd} \end{pmatrix} \tag{25}$$
and where $c_0(d)$ is an appropriate constant. The general matrix element $p_{ij}$
in (25) is the probability that
$$\max_{1 \le k < i} \bigl(\min(Y_{k1}, \ldots, Y_{kd})\bigr) < \min_{1 \le k < j} Y_{ik} \tag{26}$$
$$p_{ij} = 1 \bigg/ \binom{i - 1 + (j-1)/d}{i - 1} = \frac{(i-1)d}{(i-1)d + j - 1} \cdot \frac{(i-2)d}{(i-2)d + j - 1} \cdots \frac{d}{d + j - 1}. \tag{27}$$
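The largest eigenvalue of $M_d$ is easy to compute numerically. The following Python sketch assumes the product form $p_{ij} = \prod_{k=1}^{i-1} kd/(kd + j - 1)$ for the matrix elements, which reproduces the values 3/4, 3/5, 9/14 and 9/20 of the case d = 3, and uses plain power iteration; the function names are illustrative only.

```python
def p(i, j, d):
    """Matrix element p_ij of M_d, using the assumed product form."""
    prob = 1.0
    for k in range(1, i):
        prob *= k * d / (k * d + j - 1)
    return prob


def r0(d, iterations=200):
    """Largest eigenvalue of M_d, by power iteration on a positive matrix."""
    matrix = [[p(i, j, d) for j in range(1, d + 1)] for i in range(1, d + 1)]
    v = [1.0] * d
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(d)) for row in matrix]
        eigenvalue = max(w)                 # v is scaled so that its largest entry is 1
        v = [x / eigenvalue for x in w]
    return eigenvalue


for d in (2, 3, 4):
    print(d, round(r0(d), 3))
```

Power iteration is adequate here because every entry of $M_d$ is positive, so the dominant eigenvalue is real, simple and strictly largest in absolute value.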
The form of (27) isn't very convenient for asymptotic calculations; there is
a much simpler expression which yields an excellent approximation:
LEMMA 1. When $0 \le x \le 1$ and k is a positive integer,
$$k^x \le \binom{k - 1 + x}{k - 1} \le \frac{k^x}{\Gamma(1 + x)}. \tag{28}$$
(Note that $0.885603 < \Gamma(1 + x) \le 1$ for $0 \le x \le 1$, with the minimum
value occurring at x = 0.461632; hence the simple formula $k^x$ is always
within about 11% of the exact value of the binomial coefficient.)
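A quick numerical check of the lemma is straightforward. The Python sketch below assumes the inequality has the form $k^x \le \binom{k-1+x}{k-1} \le k^x / \Gamma(1+x)$ and evaluates the generalized binomial coefficient as the product $\prod_{m=1}^{k-1} (m + x)/m$; the helper names are hypothetical.

```python
import math


def binom_real(k, x):
    """Generalized binomial coefficient C(k - 1 + x, k - 1) as a finite product."""
    result = 1.0
    for m in range(1, k):
        result *= (m + x) / m
    return result


def lemma_holds(k, x, eps=1e-12):
    """Check k^x <= C(k-1+x, k-1) <= k^x / Gamma(1+x), allowing for rounding error."""
    lower = k ** x
    upper = k ** x / math.gamma(1 + x)
    value = binom_real(k, x)
    return lower * (1 - eps) <= value <= upper * (1 + eps)


# Smallest value of Gamma(1 + x) on a fine grid over [0, 1].
gamma_min = min(math.gamma(1 + i / 10000) for i in range(10001))
print(gamma_min)
```

Both bounds are attained: the lower one at x = 0 and x = 1 for every k, and the upper one at k = 1.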
+x)
~
(m+x)
"''
(m + 1)" = 1-(1 + x)
k - 1
For trees of height 2, deep cutoffs are impossible, and procedures F1 and
F2 have an identical effect. How many of the $d^2$ positions at level 2 are
examined? Our analysis gives an exact answer for this case, and Lemma 1
can be used to give a good approximate result which we may state as a
theorem.
THEOREM 3. The expected number of terminal positions examined by the
alpha-beta procedure on level 2 of a random uniform game tree of degree d is
$$T(d, 2) = \sum_{1 \le i, j \le d} p_{ij} \tag{30}$$
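For very small d the sum in Theorem 3 can be confirmed by brute force, averaging the number of level-2 positions examined over every ordering of distinct terminal values. The Python sketch below assumes the product form $p_{ij} = \prod_{k=1}^{i-1} kd/(kd + j - 1)$ and a nested-list tree representation; all names are illustrative.

```python
import math
from fractions import Fraction
from itertools import permutations


def p(i, j, d):
    """Assumed product form of the probability p_ij."""
    prob = Fraction(1)
    for k in range(1, i):
        prob *= Fraction(k * d, k * d + j - 1)
    return prob


def t2_formula(d):
    """Sum of p_ij over 1 <= i, j <= d."""
    return sum(p(i, j, d) for i in range(1, d + 1) for j in range(1, d + 1))


def leaves_examined(pos, alpha=-math.inf, beta=math.inf):
    """Alpha-beta on nested lists; returns (value, terminal positions examined)."""
    if isinstance(pos, int):
        return pos, 1
    m, count = alpha, 0
    for q in pos:
        t, c = leaves_examined(q, -beta, -m)
        count += c
        m = max(m, -t)
        if m >= beta:
            break
    return m, count


def t2_exhaustive(d):
    """Average count over all orderings of d*d distinct terminal values."""
    total = 0
    for perm in permutations(range(d * d)):
        tree = [list(perm[r * d:(r + 1) * d]) for r in range(d)]
        total += leaves_examined(tree)[1]
    return Fraction(total, math.factorial(d * d))


print(t2_formula(2), t2_exhaustive(2))
```

The same comparison for d = 3 runs over 9! orderings, which is still feasible but takes noticeably longer.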
(31)
F(l + x) and
l e~t,j~d
= d +
d--I
When the values of $r_0(d)$ for $d \le 30$ are plotted on log log paper, they seem
to be approaching a straight line, suggesting that $r_0(d)$ is approximately of
order $d^{.75}$. In fact, a least-squares fit for $10 \le d \le 30$ yielded $d^{.76}$ as an
approximate order of growth; this can be compared to the lower bound
$2d^{.5}$ of an optimum alpha-beta search, or to the upper bound d of a full
minimax search, or to the estimate $d^{.72}$ obtained by Fuller et al. [7] for random
alpha-beta pruning when deep cutoffs are included. However, we shall see
that the true order of growth of $r_0(d)$ as $d \to \infty$ is really d/log d.
There is a moral to this story: If we didn't know the theoretical asymptotic
growth, we would be quite content to think of it as $d^{.76}$ when d is in a
practical range. The formula d/log d seems much worse than $d^{.75}$, until we
realize the magnitude of log d in the range of interest. (A similar phenomenon
occurs with respect to Shell's sorting method; see [12, pp. 93-95].) On the
basis of this theory we may well regard the approximation $d^{.72}$ in [7] with
some suspicion.
But as mentioned above, there is a much more significant moral to this
story. Formula (24) is incorrect because the proof overlooked what appears to
be a rather subtle question of conditional probabilities. Did the reader spot
a fallacy? The authors found it only by comparing their results to those of
[7] in the case h = 3, d = 2, since procedures F1 and F2 are equivalent for
heights $\le 3$. According to the analysis above, the alpha-beta procedure will
examine an average of $6\frac{7}{9}$ nodes on level 3 of a random binary game tree, but
according to [7] the number is $6\frac{89}{105}$.
After the authors of [7] were politely
informed that they must have erred, since we had proved that $6\frac{7}{9}$ was correct,
they politely replied that simulation results (including a test on all 8!
permutations) had confirmed that the correct answer is $6\frac{89}{105}$.
A careful scrutiny of the situation explains what is going on. Theorem 3 is
correct, since it deals only with level 2, but trouble occurs at level 3. Our
theory predicts a cutoff on the right subtree of every B node with probability
1/3, so that the terminal values $(f_1, \ldots, f_8)$ in Fig. 7 will be examined with
respective probabilities (1, 1, 1, 2/3, 1, 1, 2/3, 4/9). Actually $f_8$ is examined with
probability 18/35 instead of 4/9; for $f_8$ is examined if and only if
$$f_7 > \min(f_5, f_6), \qquad \min(f_5, f_6) < \max(\min(f_1, f_2), \min(f_3, f_4)). \tag{32}$$
Each of these two events has probability 2/3, but they are not independent.
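The discrepancy is small enough to settle by exhaustive enumeration. The Python sketch below runs alpha-beta on a binary tree of height 3 for all 8! orderings of distinct terminal values and compares the average number of level-3 positions examined with the figure that the independence assumption yields (the recurrence $A_{n+1} = A_n + B_n$, $B_{n+1} = A_n + \frac{2}{3} B_n$ with $A_0 = B_0 = 1$). The tree representation and function names are assumptions of this sketch.

```python
import math
from fractions import Fraction
from itertools import permutations


def leaves_examined(pos, alpha=-math.inf, beta=math.inf):
    """Alpha-beta on nested lists; returns (value, terminal positions examined)."""
    if isinstance(pos, int):
        return pos, 1
    m, count = alpha, 0
    for q in pos:
        t, c = leaves_examined(q, -beta, -m)
        count += c
        m = max(m, -t)
        if m >= beta:
            break
    return m, count


def exhaustive_average():
    """Average number of level-3 positions examined over all 8! value orderings."""
    total = 0
    for f in permutations(range(8)):
        tree = [[[f[0], f[1]], [f[2], f[3]]], [[f[4], f[5]], [f[6], f[7]]]]
        total += leaves_examined(tree)[1]
    return Fraction(total, math.factorial(8))


def independence_prediction(height=3):
    """A_height from the recurrence that treats successive cutoffs as independent."""
    a, b = Fraction(1), Fraction(1)
    for _ in range(height):
        a, b = a + b, a + Fraction(2, 3) * b
    return a


print(independence_prediction(), exhaustive_average())
```

The two results differ only through the probability that $f_8$ is examined: 4/9 under the independence assumption and 18/35 in fact.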
[Fig. 7: binary game tree of height 3 with nodes of types A and B and terminal values $f_1, \ldots, f_8$; diagram not recoverable.]
When the fallacy is stated in these terms, the error is quite plain, but the
dependence was much harder to see in the diagrams we had been drawing for
ourselves. For example, when we argued using Fig. 6 that the second successor
of a B position is examined with probability 3/4, we neglected to consider that,
when p is itself of type B or C, the B node in Fig. 6 is entered only when
$\min(y_{11}, y_{12}, y_{13})$ is less than the bound at p; so $\min(y_{11}, y_{12}, y_{13})$ is somewhat smaller than a random value would be. What we should have computed
is the probability that $y_{21} > \min(y_{11}, y_{12}, y_{13})$ given that position B is not
cut off. And unfortunately this can depend in a very complicated way on the
ancestors of p.
To make matters worse, our error is in the wrong direction; it doesn't even
provide an upper bound for alpha-beta searching; it yields only a lower
bound on an upper bound (i.e., nothing). In order to get information relevant
to the behavior of procedure F2 on random data, we need at least an upper
bound on the behavior of procedure F1.
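A correct analysis of the binary case (d = 2) involves the solution of
recurrences
$$A_0 = B_0^{(k)} = 1, \qquad A_{n+1} = A_n + B_n^{(0)}, \qquad B_{n+1}^{(k)} = A_n + p_k B_n^{(k+1)} \quad \text{for } k \ge 0, \tag{33}$$
$$\begin{aligned} & f_{15} > f_{13} \wedge f_{14}, \\ & f_{13} \wedge f_{14} < (f_9 \wedge f_{10}) \vee (f_{11} \wedge f_{12}), \\ & (f_9 \wedge f_{10}) \vee (f_{11} \wedge f_{12}) > \bigl((f_1 \wedge f_2) \vee (f_3 \wedge f_4)\bigr) \wedge \bigl((f_5 \wedge f_6) \vee (f_7 \wedge f_8)\bigr), \end{aligned} \tag{34}$$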
writing $\vee$ for max and $\wedge$ for min. These probabilities can be computed
exactly by evaluating appropriate integrals, but the formulas are complicated
and it is easier to look for upper bounds. We can at least show easily that the
probability in (34) is $\le \frac{4}{9}$, since the first and third conditions are independent,
and they each hold with probability 2/3. Thus we obtain an upper bound if we
set $p_0 = p_2 = p_4 = \cdots = \frac{2}{3}$ and $p_1 = p_3 = \cdots = 1$; this is equivalent to
the recurrence
$$A_0 = B_0 = 1, \qquad A_{n+1} = A_n + B_n, \qquad B_{n+1} = A_n + \tfrac{2}{3} A_n. \tag{35}$$
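Similarly in the case of degree 3, we obtain an upper bound on the average
number of nodes examined without deep cutoffs by solving the recurrence
$$\begin{aligned} A_0 &= B_0 = C_0 = 1, \\ A_{n+1} &= A_n + B_n + C_n, \\ B_{n+1} &= A_n + \tfrac{3}{4} A_n + \tfrac{3}{5} A_n, \\ C_{n+1} &= A_n + \tfrac{9}{14} A_n + \tfrac{9}{20} A_n, \end{aligned} \tag{36}$$
in place of (18). This is equivalent to
$$A_{n+1} = A_n + \bigl(1 + \tfrac{3}{4} + \tfrac{3}{5} + 1 + \tfrac{9}{14} + \tfrac{9}{20}\bigr) A_{n-1}$$
and for general degree d we get the recurrence
$$A_{n+1} = A_n + S_d A_{n-1}, \tag{37}$$
where $A_0 = 1$, $A_1 = d$, and
$$S_d = \sum_{2 \le i \le d} \; \sum_{1 \le j \le d} p_{ij}. \tag{38}$$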
This gives a valid upper bound on the behavior of procedure F1, because it is
equivalent to setting bound ← +∞ at certain positions (and this operation
never decreases the number of positions examined). Furthermore we can
solve (37) explicitly, to obtain an asymptotic upper bound on T(d, h) of the
form $c_1(d)\, r_1(d)^h$, where the growth ratio is
\[
r_1(d) = \sqrt{S_d + \tfrac{1}{4}} + \tfrac{1}{2}.
\tag{39}
\]
Unfortunately it turns out that $S_d$ is of order $d^2/\log d$, by Theorem 3; so
(39) is of order $d/\sqrt{\log d}$, while an upper bound of order $d/\log d$ is desired.
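The growth ratio (39) is easy to evaluate numerically. The Python sketch below computes $S_d$ as the sum of $p_{ij}$ over $2 \le i \le d$, $1 \le j \le d$, and then $r_1(d) = \sqrt{S_d + 1/4} + 1/2$. It assumes the closed form $p_{ij} = 1/\binom{i-1+(j-1)/d}{i-1}$ for the probabilities of Section 7; that formula is not restated in this part of the paper and should be checked against Section 7. The function names are illustrative only.

```python
from math import sqrt


def p(i, j, d):
    """Assumed Section 7 probability: 1 / binom(i-1 + (j-1)/d, i-1)."""
    k = i - 1
    x = k + (j - 1) / d
    binom = 1.0
    for t in range(k):
        # Generalized binomial coefficient x(x-1)...(x-k+1) / k!
        binom *= (x - t) / (t + 1)
    return 1.0 / binom


def S(d):
    """Sum of p(i, j, d) over 2 <= i <= d and 1 <= j <= d."""
    return sum(p(i, j, d) for i in range(2, d + 1) for j in range(1, d + 1))


def r1(d):
    """Growth ratio of A_{n+1} = A_n + S_d * A_{n-1}."""
    return sqrt(S(d) + 0.25) + 0.5


if __name__ == "__main__":
    for d in (2, 3, 4, 8, 16):
        print(d, round(S(d), 4), round(r1(d), 3))
```

Under that assumption the values for d = 2, 3, 4 agree to three decimals with the first $r_1(d)$ entries of Table 1 (1.884, 2.666, 3.397).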
Another way to get an upper bound relies on a more detailed analysis of
the structural behavior of procedure F1, as in the following theorem.
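\[
\begin{pmatrix}
\sqrt{p_{11}} & \sqrt{p_{12}} & \cdots & \sqrt{p_{1d}} \\
\sqrt{p_{21}} & \sqrt{p_{22}} & \cdots & \sqrt{p_{2d}} \\
\vdots & \vdots & & \vdots \\
\sqrt{p_{d1}} & \sqrt{p_{d2}} & \cdots & \sqrt{p_{dd}}
\end{pmatrix}
\tag{40}
\]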
a~ has
i<~k<ar_
(42)
X < p , ~ , ~a4as"
Pah',oh;
.Poh_,
hence
and the theorem follows by choosing c*(d) large enough.
Now we are ready to establish the correct asymptotic growth rate of the
branching factor for procedure F1.
(43)
which satisfies
d~
..
(44)
h-* eo
"
r e ( d ) ~ C rain
( Z i'u
l--z
', .Ii.~
C min(
2~,.~,kl1
d -t
- Cl.d.~/,z
1'~'~
-')
d-1
> C -J'~d-'
where $C = 0.885603 = \inf_{0 \le x \le 1} \Gamma(1 + x)$, since $d^{-1/d} = \exp(-\ln d/d) > 1 - \ln d/d$.
To get the upper bound in (44), we shall prove that $r^*(d) < C_4\, d/\log d$,
using a rather curious matrix norm. If s and t are positive real numbers with
\[
\frac{1}{s} + \frac{1}{t} = 1,
\tag{46}
\]
-
. . . .
.....
pt(E
x.t
I.-,.~,~
r
"
I
'
a*
*It
~1
t#t
- -
lX~il)t/s
( ~ ( ' ~ la~jl)s/')I/*/~
v/d-'('i-')/2d)s") '/'"
(48)
d -f/td)
as $d \to \infty$.
TABLE 1

  d    r0(d)    r1(d)    r*(d)      d    r0(d)    r1(d)    r*(d)
  2    1.847    1.884    1.912     17    8.976   11.378   11.470
  3    2.534    2.666    2.722     18    9.358   11.938   12.021
  4    3.142    3.397    3.473     19    9.734   12.494   12.567
  5    3.701    4.095    4.186     20   10.106   13.045   13.108
  6    4.226    4.767    4.871     21   10.473   13.593   13.644
  7    4.724    5.421    5.532     22   10.836   14.137   14.176
  8    5.203    6.059    6.176     23   11.194   14.678   14.704
  9    5.664    6.684    6.805     24   11.550   15.215   15.228
 10    6.112    7.298    7.420     25   11.901   15.750   15.748
 11    6.547    7.902    8.024     26   12.250   16.282   16.265
 12    6.972    8.498    8.618     27   12.595   16.811   16.778
 13    7.388    9.086    9.203     28   12.937   17.337   17.288
 14    7.795    9.668    9.781     29   13.277   17.861   17.796
 15    8.195   10.243   10.350     30   13.614   18.383   18.300
 16    8.589   10.813   10.913     31   13.948   18.903   18.802
Table 1 shows the various bounds we have obtained on r(d), namely the
lower bound $r_0(d)$ and the upper bounds $r_1(d)$ and $r^*(d)$. We have proved
that $r_0(d)$ and $r^*(d)$ grow as $d/\log d$, and that $r_1(d)$ grows as $d/\sqrt{\log d}$; but
the table shows that $r_1(d)$ is actually a better bound for $d \le 24$.
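\[
C_{n+1} = A_n,
\tag{49}
\]
hence $A_{n+1} = A_n + 2A_{n-1}$. For general d the corresponding recurrence is
\[
A_0 = 1, \qquad A_1 = d, \qquad A_{n+2} = A_{n+1} + (d - 1) A_n.
\tag{50}
\]
The solution to this recurrence is
\[
A_n = \frac{1}{\sqrt{4d - 3}} \left( \left(\sqrt{d - \tfrac{3}{4}} + \tfrac{1}{2}\right)^{n+2} - \left(-\sqrt{d - \tfrac{3}{4}} + \tfrac{1}{2}\right)^{n+2} \right).
\tag{51}
\]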
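As a numerical sanity check on the recurrence $A_0 = 1$, $A_1 = d$, $A_{n+2} = A_{n+1} + (d-1)A_n$, the following Python sketch iterates it directly and compares the result with the closed form $(r_+^{n+2} - r_-^{n+2})/\sqrt{4d-3}$, where $r_\pm = (1 \pm \sqrt{4d-3})/2$ are the roots of $x^2 = x + (d-1)$. This is one standard way of writing the solution; the function names are illustrative and not part of the paper.

```python
from math import sqrt


def a_recurrence(d, n):
    """A_n from A_0 = 1, A_1 = d, A_{n+2} = A_{n+1} + (d - 1) * A_n."""
    a, b = 1, d  # A_0, A_1
    for _ in range(n):
        a, b = b, b + (d - 1) * a
    return a


def a_closed_form(d, n):
    """Closed form built from the two roots of x^2 = x + (d - 1)."""
    root = sqrt(4 * d - 3)
    r_plus = (1 + root) / 2
    r_minus = (1 - root) / 2
    return (r_plus ** (n + 2) - r_minus ** (n + 2)) / root


if __name__ == "__main__":
    for d in (2, 3, 10):
        print(d, [a_recurrence(d, n) for n in range(6)],
              round(a_closed_form(d, 5), 6))
```

The dominant root is $\sqrt{d - 3/4} + 1/2$, so the growth ratio under this recurrence is close to $\sqrt{d}$ for large d.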
is based on the piece count in a chess game, all the positions following a
blunder will tend to have low scores for the player who loses his men.
In this section we shall try to account for such dependencies by considering
a total dependency model, which has the following property for all nonterminal positions p: For each i and j, all of the terminal successors of $p_i$
either have greater value than all terminal successors of $p_j$, or they all have
lesser value. This model is equivalent to assigning a permutation of {0, 1, ...,
d - 1} to the moves at every position, and then using the concatenation of all
move numbers leading to a terminal position as that position's value,
considered as a radix-d number. For example, Fig. 8 shows a uniform
ternary game tree of height 3 constructed in this way.
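The construction just described can be sketched in a few lines of Python. The sketch below draws a random permutation of {0, ..., d - 1} at every nonterminal position, reads each terminal value as the radix-d number formed by the digits along its path, and then checks the defining property of the model. The representation (a flat left-to-right list of terminal values) and the function names are choices made for this illustration only.

```python
import random


def totally_dependent_leaves(d, h, rng):
    """Terminal values, left to right, of a uniform tree of degree d, height h."""
    def build(prefix, depth):
        if depth == h:
            return [prefix]
        perm = list(range(d))
        rng.shuffle(perm)  # the permutation assigned to this position
        leaves = []
        for digit in perm:
            # Appending a digit extends the radix-d number along this path.
            leaves.extend(build(prefix * d + digit, depth + 1))
        return leaves
    return build(0, 0)


def is_totally_dependent(leaves, d):
    """True if sibling subtrees occupy disjoint value ranges at every level."""
    if len(leaves) == 1:
        return True
    size = len(leaves) // d
    blocks = [leaves[k * size:(k + 1) * size] for k in range(d)]
    ranges = sorted((min(b), max(b)) for b in blocks)
    disjoint = all(ranges[k][1] < ranges[k + 1][0] for k in range(d - 1))
    return disjoint and all(is_totally_dependent(b, d) for b in blocks)


if __name__ == "__main__":
    leaves = totally_dependent_leaves(3, 3, random.Random(1975))
    print(leaves)
    print(is_totally_dependent(leaves, 3))
```

Every such tree uses each of the $d^h$ radix-d numbers exactly once as a terminal value.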
[Fig. 8. A uniform ternary game tree of height 3 with totally dependent terminal values, each written as a three-digit radix-3 number.]
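\[
\frac{d - H_d}{d - H_d^2} \left( d^{\lceil h/2 \rceil} + H_d\, d^{\lfloor h/2 \rfloor} - H_d^{h+1} - H_d^{h} \right) + H_d^{h},
\tag{52}
\]
where $H_d = 1 + \frac{1}{2} + \cdots + \frac{1}{d}$.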
Proof. As in our other proofs, we divide the positions of the tree into a
finite number of classes or types for which recurrence relations can be given.
In this case we use three types, somewhat as in our proof of Theorems 1 and 2.
A type 1 position p is examined by calling F2(p, alpha, beta) where all
terminal descendants q of p have alpha < ±f(q) < beta; here the + or - sign is used according as p is an even or an odd number of levels from the
bottom of the tree. If p is nonterminal, its successors are assigned a definite
ranking; let us say that $p_i$ is relevant if $F(p_i) < F(p_j)$ for all $1 \le j < i$. Then
all of the relevant successors of p are examined by calling F2($p_i$, -beta, -m)
where $F(p_i)$ lies between -beta and -m, hence the relevant $p_i$ are again of
type 1. The irrelevant $p_i$ are examined by calling F2($p_i$, -beta, -m) where
$F(p_i) > -m$, and we shall call them type 2.
A type 2 position p is examined by calling F2(p, alpha, beta) where all
terminal descendants q of p have ±f(q) > beta. If p is nonterminal, its first
successor $p_1$ is classified as type 3, and it is examined by calling F2($p_1$, -beta,
-alpha). This procedure call eventually returns a value ≤ -beta, causing
an immediate cutoff.
A type 3 position p is examined by calling F2(p, alpha, beta) where all
terminal descendants q of p have ±f(q) < alpha. If p is nonterminal, all its
successors are classified type 2, and they are examined by calling F2($p_i$,
-beta, -alpha); they all return values ≥ -alpha.
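The classification into types can be checked by brute force on small trees. The Python sketch below enumerates every assignment of permutations to the nonterminal positions of a uniform tree, runs a negamax alpha-beta search that is assumed to match procedure F2 of Section 2 (m starts at alpha, and the search stops as soon as m ≥ beta), and averages the number of terminal positions examined. The tree representation and all names are choices made for this illustration.

```python
from fractions import Fraction
from itertools import permutations, product
import math


def build(perm_iter, d, h, prefix=0):
    """Nested-list tree; terminal values are radix-d numbers (total dependence)."""
    if h == 0:
        return prefix
    perm = next(perm_iter)  # permutations are consumed in preorder
    return [build(perm_iter, d, h - 1, prefix * d + digit) for digit in perm]


def f2(node, alpha, beta, counter):
    """Negamax alpha-beta; counter[0] counts terminal positions examined."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    m = alpha
    for child in node:
        t = -f2(child, -beta, -m, counter)
        if t > m:
            m = t
        if m >= beta:
            break  # cutoff
    return m


def average_leaves(d, h):
    """Exact average over all totally dependent trees of degree d, height h."""
    n_internal = sum(d ** k for k in range(h))
    perms = list(permutations(range(d)))
    total = trees = 0
    for assignment in product(perms, repeat=n_internal):
        counter = [0]
        f2(build(iter(assignment), d, h), -math.inf, math.inf, counter)
        total += counter[0]
        trees += 1
    return Fraction(total, trees)


if __name__ == "__main__":
    print(average_leaves(2, 3), average_leaves(3, 2))
```

For d = 2, h = 3 and for d = 3, h = 2 the averages are 25/4 and 20/3, which are the values obtained by counting type 1, 2, and 3 positions as described above. The enumeration grows very quickly and is only practical for tiny trees.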
Let $A_n$, $B_n$, $C_n$ be the expected number of terminal positions examined in a
random totally dependent uniform tree of degree d and height n, when the
root is of type 1, 2, or 3 respectively. The above argument shows that the
following recurrence relations hold:
\[
\begin{aligned}
A_0 &= B_0 = C_0 = 1, \\
A_{n+1} &= A_n + \left(\tfrac{1}{2} A_n + \tfrac{1}{2} B_n\right) + \left(\tfrac{1}{3} A_n + \tfrac{2}{3} B_n\right) + \cdots + \left(\tfrac{1}{d} A_n + \tfrac{d-1}{d} B_n\right) \\
&= H_d A_n + (d - H_d) B_n, \\
B_{n+1} &= C_n, \\
C_{n+1} &= d B_n.
\end{aligned}
\tag{53}
\]
Consequently $B_n = d^{\lfloor n/2 \rfloor}$, and $A_h$ has the value stated in (52).
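The step from the recurrences to the closed form can be verified with exact rational arithmetic. The Python sketch below iterates $A_{n+1} = H_d A_n + (d - H_d) B_n$, $B_{n+1} = C_n$, $C_{n+1} = d B_n$ from $A_0 = B_0 = C_0 = 1$ and compares the result with $\frac{d - H_d}{d - H_d^2}\,(d^{\lceil h/2 \rceil} + H_d d^{\lfloor h/2 \rfloor} - H_d^{h+1} - H_d^h) + H_d^h$. The function names are illustrative; d = 1 is excluded because the denominator $d - H_d^2$ vanishes there.

```python
from fractions import Fraction


def harmonic(d):
    """H_d = 1 + 1/2 + ... + 1/d as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, d + 1))


def expected_by_recurrence(d, h):
    """A_h obtained by iterating the three-type recurrences h times."""
    H = harmonic(d)
    a = b = c = Fraction(1)
    for _ in range(h):
        a, b, c = H * a + (d - H) * b, c, d * b
    return a


def expected_closed_form(d, h):
    """Closed-form value of A_h, valid for d >= 2."""
    H = harmonic(d)
    up, down = (h + 1) // 2, h // 2  # ceil(h/2), floor(h/2)
    bracket = d ** up + H * d ** down - H ** (h + 1) - H ** h
    return (d - H) / (d - H * H) * bracket + H ** h


if __name__ == "__main__":
    for d in (2, 3, 8):
        print(d, [str(expected_by_recurrence(d, h)) for h in range(5)])
```

Because both functions use `Fraction`, the comparison is exact rather than approximate.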
COROLLARY 4. When d ≥ 3, the average number of positions examined by
alpha-beta search under the assumption of totally dependent terminal values is
bounded by a constant⁵ times the optimum number of positions specified in
Corollary 3.
Proof. The growth of (52) as $h \to \infty$ is order $d^{h/2}$. The stated constant is
approximately
\[
(d - H_d)(1 + H_d) / 2(d - H_d^2).
\]
(When d = 2 the growth rate of (52) is order $(\frac{3}{2})^h$ instead of $\sqrt{2}^{\,h}$.)
Incidentally, we can also analyze procedure F1 under the same assumptions;
the restriction of deep cutoffs leads to the recurrence
\[
A_0 = 1, \qquad A_1 = d, \qquad A_{n+2} = H_d A_{n+1} + (d - H_d) A_n,
\tag{54}
\]
and the corresponding growth rate is of order $\left(\sqrt{d - H_d + \tfrac{1}{4} H_d^2} + \tfrac{1}{2} H_d\right)^h$.
So again the branching factor is approximately $\sqrt{d}$ for large d.
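The growth rate quoted for procedure F1 is the larger root of $x^2 = H_d x + (d - H_d)$. The Python sketch below iterates the recurrence $A_{n+2} = H_d A_{n+1} + (d - H_d) A_n$ and compares the ratio of successive terms with $\sqrt{d - H_d + H_d^2/4} + H_d/2$. It takes $A_0 = 1$ and $A_1 = d$ as starting values, on the assumption that all d terminals of a tree of height 1 are examined; the names are illustrative.

```python
from math import sqrt


def harmonic(d):
    """H_d = 1 + 1/2 + ... + 1/d in floating point."""
    return sum(1.0 / k for k in range(1, d + 1))


def f1_counts(d, n_max):
    """A_0 .. A_{n_max} for A_{n+2} = H_d * A_{n+1} + (d - H_d) * A_n."""
    H = harmonic(d)
    counts = [1.0, float(d)]  # assumed initial values A_0 = 1, A_1 = d
    while len(counts) <= n_max:
        counts.append(H * counts[-1] + (d - H) * counts[-2])
    return counts[: n_max + 1]


def f1_growth_ratio(d):
    """Larger root of x^2 = H_d * x + (d - H_d)."""
    H = harmonic(d)
    return sqrt(d - H + H * H / 4) + H / 2


if __name__ == "__main__":
    for d in (2, 3, 10, 100):
        c = f1_counts(d, 40)
        print(d, c[40] / c[39], f1_growth_ratio(d), sqrt(d))
```

The limiting ratio does not depend on the starting values, so the comparison remains meaningful even if a different $A_1$ is intended.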
⁵ This "constant" depends on the degree d, but not on the height h.
The authors of [7] have suggested another model to account for dependencies between positions: Each branch (i.e., each arc) of the uniform game
tree is assigned a random number between 0 and 1, and the values of terminal
positions are taken to be the sums of all values on the branches above. If we
apply the naive approach of Section 7 to the analysis of this model without
deep cutoffs, the probability needed in place of eq. (26) is the probability that
\[
\max_{1 \le k < i} \bigl( X_k + \min(Y_{k1}, \ldots, Y_{kd}) \bigr) < X_i + \min_{1 \le k < j} Y_{ik},
\tag{55}
\]
where as before the Y's are independent and identically distributed random
variables, and where $X_1, \ldots, X_i$ are independent uniform random variables
in [0, 1]. Balkema [2] has shown that (55) never occurs with greater probability
than the value $p_{ij}$ derived in Section 7, regardless of the distribution of the
Y's (as long as it is continuous). Therefore we have good grounds to believe
that dependencies between position values tend to make alpha-beta pruning
more efficient than it would be if all terminal positions had independent
values.
ACKNOWLEDGMENTS
We wish to thank J. R. Slagle, whose lecture at Caltech in 1967 originally stimulated some
of the research reported here, and we also wish to thank Forest Baskett, Robert W. Floyd,
John Gaschnig, James Gillogly, John McCarthy and James H. Wilkinson for discussions
which contributed significantly to our work. Computer time for our numerical experiments
was supported in part by ARPA and in part by IBM Corporation.
REFERENCES
1. Adelson-Velskiy, G. M., Arlazarov, V. L., Bitman, A. R., Zhivotovskii, A. A. and Uskov, A. V. Programming a computer to play chess. Uspehi Mat. Nauk 25 (2) (March-April 1970), 221-260 (in Russian).
2. Balkema, G. Personal communication, July 19, 1974.
3. Bernstein, A., Roberts, M. De V., Arbuckle, T. and Belsky, M. A. A chess-playing program for the IBM 704 computer. Proc. Western Joint Computer Conference 13 (1958), 157-159.
4. Brudno, A. L. Bounds and valuations for shortening the scanning of variations. Problemy Kibernet. 10 (1963), 141-150 (in Russian).
5. Dahl, O.-J. and Belsnes, D. Algoritmer og Datastrukturer. Studentlitteratur, Lund (1973).
6. Floyd, R. W. Personal communication, January 18, 1965.
7. Fuller, S. H., Gaschnig, J. G. and Gillogly, J. J. Analysis of the alpha-beta pruning algorithm. Dept. of Computer Science, Carnegie-Mellon University, Pittsburgh, Pa. (July 1973), 51 pp.
8. Greenblatt, R. D., Eastlake, D. E. and Crocker, S. D. The Greenblatt chess program. Proc. AFIPS Fall Joint Computer Conference 31 (1967), 801-810.
9. Hardy, G. H., Littlewood, J. E. and Pólya, G. Inequalities. Cambridge Univ. Press, London (1934).
10. Hart, T. P. and Edwards, D. J. The tree prune (TP) algorithm. M.I.T. Artificial Intelligence Project Memo # 30, R.L.E. and Computation Center, Massachusetts Institute of Technology, Cambridge, Mass. (December 4, 1961), 6 pp; revised form: The α-β heuristic, D. J. Edwards and T. P. Hart (October 28, 1963), 4 pp.
11. Knuth, D. E. Fundamental Algorithms: The Art of Computer Programming 1. Addison-Wesley, Reading, Mass. (1968/1973).
12. Knuth, D. E. Sorting and Searching: The Art of Computer Programming 3. Addison-Wesley, Reading, Mass. (1973).
13. Knuth, D. E. Structured programming with go to statements. Comput. Surveys 6 (1974), 261-301.
14. Lawler, E. L. and Wood, D. E. Branch-and-bound methods: A survey. Operations Res. 14 (1966), 699-719.
15. McCarthy, J. Personal communication, December 1, 1973.
16. Newell, A., Shaw, J. C. and Simon, H. A. Chess-playing programs and the problem of complexity. IBM J. Res. and Develop. 2 (1958), 320-355; reprinted with minor corrections in Computers and Thought, E. A. Feigenbaum and J. Feldman, eds., McGraw-Hill, New York (1963), 109-133.
17. Nievergelt, J., Farrar, J. C. and Reingold, E. M. Computer Approaches to Mathematical Problems. Prentice-Hall, Englewood Cliffs, N.J. (1974).
18. Nilsson, N. J. Problem-Solving Methods in Artificial Intelligence. McGraw-Hill, New York (1971).
19. Perron, O. Zur Theorie der Matrizen. Math. Ann. 64 (1907), 248-263.
20. Pólya, G. and Szegö, G. Aufgaben und Lehrsätze aus der Analysis 1. Springer, Berlin (1925).
21. Samuel, A. L. Some studies in machine learning using the game of checkers. IBM J. Res. and Develop. 3 (1959), 211-229; reprinted with minor additions and corrections in Computers and Thought, E. A. Feigenbaum and J. Feldman, eds., McGraw-Hill, New York (1963), 71-105.
22. Samuel, A. L. Some studies in machine learning using the game of checkers. II. Recent progress. IBM J. Res. and Develop. 11 (1967), 601-617.
23. Slagle, J. R. Artificial Intelligence: The Heuristic Programming Approach. McGraw-Hill, New York (1971).
24. Slagle, J. R. and Bursky, P. Experiments with a multipurpose, theorem-proving heuristic program. J.ACM 15 (1968), 85-99.
25. Slagle, J. R. and Dixon, J. K. Experiments with some programs that search game trees. J.ACM 16 (1969), 189-207.
26. Varga, R. S. Matrix Iterative Analysis. Prentice-Hall, Englewood Cliffs, N.J. (1962).
27. Wells, M. B. Elements of Combinatorial Computing. Pergamon, Oxford (1971).