A Multipurpose Backtracking Algorithm
Abstract
A backtracking algorithm with element order selection is presented, and its efficiency discussed in relation both to standard examples and to examples concerning relation-preserving maps which the algorithm was derived to solve.
1 Introduction
Backtracking has long been used as a strategy for solving combinatorial problems and has been
extensively studied ( Gerhart & Yelowitz (1976), Roever (1978), Walker (1960), Wells (1971)).
In worst case situations it may be highly inefficient, and a systematic analysis of efficiency is very
difficult. Thus backtracking has sometimes been regarded as a method of last resort. Nevertheless,
backtracking algorithms are widely used, especially on NP-complete problems. In order to make
these algorithms computationally feasible on a range of large problems, they are usually tailored
to particular applications (see, for example, Butler and Lam’s approach to isomorphism-testing
in Butler & Lam (1985) and Knuth and Szwarcfiter’s approach to topological sorting (that is,
extending partial orders to linear orders) Knuth & Szwarcfiter (1974)).
Our approach to backtracking is based on Ward’s work on program transformations Ward
(1989), Ward (1992), Ward (1994), Ward (1993). We derive (and simultaneously prove correct)
a ‘universal’ simple backtracking algorithm. Even in this rudimentary form our algorithm proved
remarkably effective for the type of problem for which it was devised. These problems can all be cast
as problems requiring the counting, listing, or otherwise processing, of the relation-preserving maps
from a finite relational structure to another relational structure of the same type. In particular
isomorphism-testing would come under this umbrella. Priestley was concerned specifically with
problems arising in connection with Stone type dualities for varieties of algebras whose members
were distributive lattices with additional structure. It turned out that, in these applications, the
running time of the algorithm depended critically on the order in which the data elements were
listed (by a factor of several thousand). The same techniques that yielded the simple backtracking
algorithm were then employed to derive a version of the algorithm which incorporates a mechanism
for permuting elements. By exploiting this in various ways enormous improvements in efficiency
were obtained which enabled us to complete various calculations which would otherwise have been
totally impractical. Tables 3 and 4 in Section 6.2 strikingly illustrate the effect of judicious element
order selection in one particular case.
Our paper is aimed at two groups of readers with (probably) small intersection. The first group
consists of those interested in backtracking per se. The second group contains mathematicians
who need to solve problems in, for example, algebra or graph theory, to which our methods can be
applied. For the benefit of such readers we have included some discussion of aspects of programming
folklore which would not have been required in a paper directed solely at computer scientists.
The paper is organised as follows. Section 2, which uses the well-known eight queens puzzle
as an illustration, serves two purposes. It provides a brief introduction to backtracking for those
unfamiliar with it, and also allows us to draw attention to the factors affecting efficiency which we
address later. Section 3 presents the fragment of Ward’s Wide Spectrum Language (WSL) which
we use, and Section 4 contains the theory from Ward’s work on program transformations on which
our algorithm derivations are based. The simple backtracking algorithm is given in Section 5.
The next section discusses the applications of this algorithm out of which the paper has arisen.
It provides the mathematical background to a range of examples on which we have tested our
methods. While Section 6 is reasonably self-contained it is aimed primarily at mathematicians
with appropriate interests. Section 7 discusses various heuristics for element order selection, with
illustrations. We conclude with some brief comments relevant to further developments: we discuss
the state of the art concerning complete automation of the process of algorithm development, from
abstract specification to implementation in a suitable programming language.
We stress that an understanding of the machinery in Sections 4–5 is not needed by users of
the end product. The theory guarantees that the algorithm meets its specification. Because of
its universal character, the algorithm (with or without element order selection) can very easily
be adapted to a variety of situations without further recourse to the theory. Implementation is
straightforward. We discuss implementation issues in a special case in Section 6. We also include
an Appendix which gives a C implementation of the simple backtracking algorithm. The source
code for all the algorithms and sample data files can be obtained from the authors.
2 The Eight Queens Puzzle

The brute force solution for any combinatorial problem is to enumerate all the possible solutions,
testing each in turn and rejecting those which fail to meet the required conditions. In this case, the
“most brutish” method tests every possible arrangement of eight queens on a chessboard. There are
64 places for the first queen, for each of these there are 63 places for the second queen, and so on, for
a total of 64 × 63 × · · · × 57 = 178,462,987,637,760 cases. This number can be reduced substantially by the observation that any valid solution must contain exactly one queen in each column. So we only need to consider the 8^8 = 16,777,216 ways of placing eight queens, one per column, into eight columns. Any such arrangement can be represented as a sequence of eight numbers from 1 to 8; for example, the situation in Figure 1 is represented as ⟨3, 6, 4, 1, 8, 5, 7, 2⟩.
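As a concrete illustration, a minimal C sketch of the validity test on such a sequence follows (our own naming, not taken from the paper's appendix): q[i] holds the row of the queen in column i+1.

#include <stdbool.h>
#include <stdlib.h>

/* q[0..n-1]: q[i] is the row (1..8) of the queen in column i+1.
   Columns are distinct by construction, so two queens attack
   iff they share a row or a diagonal. */
bool valid(const int q[], int n)
{
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (q[i] == q[j] || abs(q[i] - q[j]) == j - i)
                return false;
    return true;
}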
2.1 Backtracking
A simple way to reduce the number of cases still further now suggests itself. Consider the situation
where the first two queens have been placed in positions 1 and 1, or 1 and 2 in the first two columns.
Since these two are attacking each other, we need not consider any of the 8^6 = 262,144 ways of
placing the remaining six queens. Similarly, after placing the first four queens in Figure 1, there
are only two valid positions for the fifth queen. Such a sequential placement can be represented as
a tree structure, as shown in Figure 2 for the four queens puzzle.

[Figure 2: the search tree for the four queens puzzle]

The four nodes below the root (top) node of the tree represent the four positions for the first queen. Below each valid node are
further nodes representing the positions for the next queen to be placed. Note that branches 3 and
4 of the tree are mirror images of branches 2 and 1 respectively, and are omitted for brevity. The
solutions are ⟨2, 4, 1, 3⟩ and its mirror image ⟨3, 1, 4, 2⟩.
This procedure cuts the number of cases examined (for the eight queens puzzle) to a total of
15,720. To enumerate systematically all these cases we start at the root and move down the tree,
taking a leftmost branch at each junction, but if it is impossible to move down we “backtrack” by
considering the next junction at the previous level. This may lead to further backtracking if all the
junctions at the previous level have now been covered. The first computerised formulation of this
method was by Walker in 1958 ( Walker (1960)).
Assuming we have a predicate valid(p) which tests if the sequence of integers p is a valid arrangement of queens with no queen attacking any other, then the following recursive procedure will solve the problem. (The notation p ++ ⟨t⟩ denotes the sequence p with the singleton sequence ⟨t⟩ appended. See Section 3 for a description of the other notation.)
begin
count := 0;
Queens(⟨⟩)
where
proc Queens(p) ≡
if ℓ(p) = 8 then count := count + 1
else for t := 1 to 8 do
if valid(p ++ ⟨t⟩) then Queens(p ++ ⟨t⟩) fi od.
end
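A direct C transcription of this recursive procedure might look as follows (a sketch under our own naming; the paper's appendix implements the general algorithm rather than this special case). For efficiency it checks only the newly placed queen at each step:

#include <stdio.h>
#include <stdlib.h>

static int count = 0;
static int q[8];                /* q[i] = row of the queen in column i+1 */

/* Check the queen just placed in column n-1 against columns 0..n-2. */
static int ok(int n)
{
    for (int i = 0; i < n - 1; i++)
        if (q[i] == q[n-1] || abs(q[i] - q[n-1]) == (n-1) - i)
            return 0;
    return 1;
}

static void queens(int n)       /* n = number of queens already placed */
{
    if (n == 8) { count++; return; }
    for (int t = 1; t <= 8; t++) {
        q[n] = t;
        if (ok(n + 1)) queens(n + 1);
    }
}

int main(void)
{
    queens(0);
    printf("%d solutions\n", count);   /* the eight queens puzzle has 92 */
    return 0;
}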
This is a special case of the algorithm we will derive in Section 5. Also in Section 5 we transform
this recursive algorithm into an equivalent iterative algorithm:
count := 0;
var p := ⟨⟩, t := 1 :
while p ≠ ⟨⟩ ∨ t ≤ 8 do
if t > 8 then t ←last− p; t := t + 1
elsif valid(p ++ ⟨t⟩) ∧ ℓ(p) = 7 then count := count + 1; t := t + 1
elsif valid(p ++ ⟨t⟩) ∧ ℓ(p) < 7 then p := p ++ ⟨t⟩; t := 1
else t := t + 1 fi od end
The recursive program emphasises downward movement in the tree. Backtracking (upward move-
ment) occurs as a matter of course when each position in the column has been considered. The
iterative program emphasises backtracking by explicitly searching up and down the tree, working
from left to right. The four cases in the loop deal with:
1. Moving up, i.e. backtracking;
2. Moving right when a solution has been found;
3. Moving down to the leftmost branch of the current node, and;
4. Moving right when the current arrangement is invalid.
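The same four cases transcribe directly into C (again a sketch with our own naming; valid_next(q, n, t) tests whether row t is safe for column n+1):

#include <stdio.h>
#include <stdlib.h>

static int valid_next(const int q[], int n, int t)
{
    for (int i = 0; i < n; i++)          /* row and diagonal tests */
        if (q[i] == t || abs(q[i] - t) == n - i)
            return 0;
    return 1;
}

int main(void)
{
    int q[8];
    int n = 0, t = 1, count = 0;   /* n queens placed; t = candidate row */

    while (n > 0 || t <= 8) {
        if (t > 8)                           /* case 1: backtrack        */
            t = q[--n] + 1;
        else if (valid_next(q, n, t)) {
            if (n == 7) { count++; t++; }        /* case 2: solution     */
            else        { q[n++] = t; t = 1; }   /* case 3: descend      */
        } else
            t++;                     /* case 4: invalid, move right */
    }
    printf("%d solutions\n", count);
    return 0;
}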
Suppose that, after the first few queens have been placed, there is only one valid place left in column 6. Placing the next queen in column 6 (rather than column 4) will reduce
the total number of cases to be considered, without affecting the result. This is the basis for the
various “element order selection” heuristics discussed below.
It should be pointed out that for this particular problem (and the more general N queens
problem), the heuristics do not provide all that much benefit. This is because:
1. The size of the total search tree only increases by a factor of around 2 or 3 when a random
element order is chosen rather than the optimal order (so a large reduction in the number of
trials is not possible);
2. Using the method in Wirth (1971), a trial solution can be tested very efficiently (so there is
not much to be gained from a small reduction in the number of trials);
3. The “naïve” solution of placing the queens in left-to-right order turns out to be the optimal
order (if the element order is fixed throughout the calculation).
For the problems we were interested in solving, a suitable element order is critical in producing
a result within a feasible amount of time. Even with this problem, though, by starting with a random permutation each time we were able to produce some improvements by using the heuristics discussed in Section 7. See Table 1 for some sample results, each of which is averaged over ten
different random initial permutations. The “pre-analysis” method analyses the search tree to select
an initial permutation. The “bush pruning” method dynamically updates the permutation as the
search proceeds. The “hybrid” method is a combination of a small amount of pre-analysis, followed
by bush pruning. It is the most efficient in terms of the number of trials, but imposes a higher
overhead than simple pre-analysis—which is the most efficient in terms of CPU time.
3 The Wide Spectrum Language

Sequences: s = ⟨a1, a2, . . . , an⟩ is a sequence; the ith element ai is denoted s[i], and s[i . . j] is the subsequence ⟨s[i], s[i + 1], . . . , s[j]⟩, where s[i . . j] = ⟨⟩ (the empty sequence) if i > j. The length of sequence s is denoted ℓ(s), so s[ℓ(s)] is the last element of s.

Sequence concatenation: s1 ++ s2 = ⟨s1[1], . . . , s1[ℓ(s1)], s2[1], . . . , s2[ℓ(s2)]⟩.
Stacks: Sequences are also used to implement stacks; for this purpose we have the following notation. For a sequence s and variable x, the statement x ←pop− s means x := s[1]; s := s[2 . . ℓ(s)], which pops an element off the stack into variable x. To push the value of the expression e onto stack s we use s ←push− e, which represents s := ⟨e⟩ ++ s.

Queues: The statement x ←last− s removes the last element of s and stores its value in the variable x. It is equivalent to x := s[ℓ(s)]; s := s[1 . . ℓ(s) − 1].
Sets: We have the usual set operations ∪ (union), ∩ (intersection) and ∖ (set difference), ⊆ (subset), ∈ (element), ℘ (powerset). { x ∈ A | P(x) } is the set of all elements in A which satisfy predicate P. For the sequence s, set(s) is the set of elements of the sequence, i.e. set(s) = { s[i] | 1 ≤ i ≤ ℓ(s) }. The expression #A denotes the size of the set A.
Nondeterministic choice: if B1 → S1 ⊓ . . . ⊓ Bn → Sn fi Each of the “guards” B1, B2, . . . , Bn is evaluated, one of the true ones is selected and the corresponding statement executed. If no guard is true then the statement aborts. If several guards are true, then one of the corresponding statements is chosen nondeterministically.
Deterministic iteration: while B do S od The condition B is tested and S is executed re-
peatedly until B becomes false.
Uninitialised local variables: var x : S end Here x is a local variable which only exists within
the statement S. It must be initialised in S before it is first accessed.
Initialised local variables: var x := t : S end This is an abbreviation for var x : x := t; S end.
The local variable is initialised to the value t. We can combine initialised and uninitialised
variables in one block, for example: var x := t, y : S end where x is initialised and y is
uninitialised.
Counted iteration: for i := b to f step s do S od is equivalent to:
var i := b :
while i ≤ f do
S; i := i + s od end
Unbounded loops and exits: Statements of the form do S od, where S is a statement, are
“infinite” or “unbounded” loops which can only be terminated by the execution of a statement
of the form exit(n) (where n is an integer, not a variable or expression) which causes the
program to exit the n enclosing loops. To simplify the language we disallow exits which
leave a block or a loop other than an unbounded loop. This type of structure is described in
Knuth (1974) and more recently in Taylor (1984).
terminate by calling the Z action (which causes immediate termination). Such action systems are
called regular.
For example, a statement S following an if statement can be absorbed into both branches:
if B then S1 else S2 fi; S ≈ if B then S1 ; S else S2 ; S fi
4.3 Split Block
If the statement S2 assigns a new value to x before it accesses it then the block: var x : S1 ; S2 end
can be split into two blocks: var x : S1 end; var x : S2 end
do S1 ; S2 od
Lemma 4.2 Selective unrolling of while loops: For any condition Q we have:
while B do S od ≈ while B do S; if B ∧ Q then S fi od
For each of these transformations there is a generalisation in which, instead of inserting the “un-
rolled” part after S, it is copied into an arbitrary selection of the terminal positions in S.
The converse transformations are, naturally, called loop rolling and entire loop rolling.
The expression t need not be an integer: any set Γ which has a well-founded order ⪯ is suitable. To prove that the value of t is reduced it is sufficient to prove that if t ⪯ t0 initially, then the assertion {t ≺ t0} can be inserted before each occurrence of S0 in S[S0/F]. The theorem combines these two requirements into a single condition:

Theorem 4.4 If ⪯ is a well-founded partial order on some set Γ, t is an expression giving values in Γ, and t0 is a variable which does not occur in S, then if for some premiss P
P ∧ (t ⪯ t0) ⇒ (S0 ≤ S[{t ≺ t0}; S0 /F])
then
P ⇒ (S0 ≤ proc F ≡ S.)
It is frequently possible to derive a suitable procedure body S from the statement S0 by applying transformations to S0, splitting it into cases etc., until we get the statement S[S0/F] which is still defined in terms of S0. If we can find a suitable variant function for S[S0/F] then we can apply the theorem and refine S[S0/F] to proc F ≡ S., which is no longer defined in terms of S0.
As an example we will consider the familiar factorial function. Let S0 = r := n!. We can transform this (by appealing to the definition of factorial) to get:
if n = 0 then r := 1 else n := n − 1; r := n!; n := n + 1; r := n.r fi
There are M + N actions in total: A1, . . . , AM, B1, . . . , BN. Note that since the action system is regular, it can only be terminated by executing call Z, which will terminate the current invocation of the procedure.

The aim is to remove the recursion by introducing a local stack L which records “postponed” operations: When a recursive call is required we “postpone” it by pushing the pair ⟨0, e⟩ onto L (where e is the parameter required for the recursive call). Execution of the statements Sjk also has to be postponed (since they occur between recursive calls); we record the postponement of Sjk by pushing ⟨⟨j, k⟩, x⟩ onto L. Where the procedure body would normally terminate (by calling Z) we instead call a new action F̂ which pops the top item off L and carries out the postponed operation. If we call F̂ with the stack empty then all postponed operations have been completed and the procedure terminates by calling Z.
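The scheme is easy to reproduce by hand in C. The following sketch (our own illustration, not from the paper) removes the recursion from a binary tree sum, whose body has the shape F(left); S11; F(right): tag 0 postpones a recursive call and tag 1 postpones the intervening statement S11.

#include <stddef.h>

typedef struct node { int val; struct node *l, *r; } node;
typedef struct { int tag; node *arg; } frame;   /* a postponed operation */

static long sum_iter(node *root)
{
    frame stack[1024];          /* assumes modest depth; a sketch only */
    int sp = 0;
    long total = 0;

    stack[sp++] = (frame){0, root};
    while (sp > 0) {                         /* the F-hat action */
        frame f = stack[--sp];               /* pop a postponed operation */
        if (f.tag == 0) {                    /* run the body on f.arg */
            if (f.arg == NULL) continue;     /* base case: "call Z" */
            stack[sp++] = (frame){1, f.arg};     /* postpone S11 */
            stack[sp++] = (frame){0, f.arg->l};  /* postpone F(left) */
        } else {                             /* postponed S11 */
            total += f.arg->val;
            stack[sp++] = (frame){0, f.arg->r};  /* postpone F(right) */
        }
    }
    return total;
}

The frames are pushed in reverse order of execution, so the array stack pops them in the order the recursive version would perform them.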
Theorem 4.5 A recursive procedure in the form:
proc F(x) ≡
actions A1 :
A1 ≡ S1 .
. . . Ai ≡ Si .
. . . Bj ≡ Sj0 ; F(gj1(x)); Sj1 ; F(gj2(x)); . . . ; F(gjnj(x)); Sjnj .
. . . endactions.
where Sj1, . . . , Sjnj are as above, is equivalent to the following iterative procedure which uses a new local stack L and a new local variable m:
proc F′(x) ≡
var L := ⟨⟩, m :
actions A1 :
A1 ≡ S1[call F̂ /call Z].
. . . Ai ≡ Si[call F̂ /call Z].
. . . Bj ≡ Sj0 ; L := ⟨⟨0, gj1(x)⟩, ⟨⟨j, 1⟩, x⟩, ⟨0, gj2(x)⟩, . . . , ⟨0, gjnj(x)⟩, ⟨⟨j, nj⟩, x⟩⟩ ++ L;
call F̂ .
. . . F̂ ≡ if L = ⟨⟩ then call Z
else ⟨m, x⟩ ←pop− L;
if m = 0 → call A1
⊓ . . . ⊓ m = ⟨j, k⟩ → Sjk ; call F̂
. . . fi fi. endactions end.
Note that any procedure F (x) can be restructured into the required form; in fact there may be
several different ways of structuring F (x) which meet the required criteria.
Consider the recursive factorial program we derived above (Section 4.6):
proc X ≡ if n = 0 then r := 1 else n := n − 1; X; n := n + 1; r := n.r fi.
We can restructure this as:
proc X ≡
actions A :
A ≡ if n = 0 then r := 1; call Z else call B fi.
B ≡ n := n − 1; X; n := n + 1; r := n.r; call Z.
endactions.
This is in the right form to apply Theorem 4.5. This gives:
proc X ≡
var L := ⟨⟩, m :
actions A :
A ≡ if n = 0 then r := 1; call F̂ else call B fi.
B ≡ n := n − 1; L := ⟨0, 1⟩ ++ L; call F̂ .
F̂ ≡ if L = ⟨⟩ then call Z
else m ←pop− L;
if m = 0 → call A
⊓ m = 1 → n := n + 1; r := n.r; call F̂ fi fi.
endactions end.
Notice that B pushes 0 onto L, then calls F̂ which immediately pops off the 0 and calls A. So we
could call A directly:
proc X ≡
var L := ⟨⟩, m :
actions A :
A ≡ if n = 0 then r := 1; call F̂ else call B fi.
B ≡ n := n − 1; L := ⟨1⟩ ++ L; call A.
F̂ ≡ if L = ⟨⟩ then call Z
else m ←pop− L;
if m = 0 → call A
⊓ m = 1 → n := n + 1; r := n.r; call F̂ fi fi.
endactions end.
Now that we only ever push ones onto L, all we need to know is its length. So convert L to an integer variable and remove the redundant local variable m:
proc X ≡
var L := 0 :
actions A :
A ≡ if n = 0 then r := 1; call F̂ else call B fi.
B ≡ n := n − 1; L := L + 1; call A.
F̂ ≡ if L = 0 then call Z
else L := L − 1; n := n + 1; r := n.r; call F̂ fi.
endactions end.
Now the A and B actions just copy n into L, set r to 1, set n to 0, and call F̂ . The F̂ action can
be expressed as a while loop, so we have:
proc X ≡
var L := n :
r := 1; n := 0;
while L 6= 0 do L := L − 1; n := n + 1; r := n.r od end.
Note that L reaches zero when n reaches its original value, so we can write the while loop as a for
loop:
proc X ≡ r := 1; for i := 1 to n do r := i.r od.
This is an efficient factorial algorithm, derived from the specification given in Section 4.6.
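In C the end product of the derivation is the evident loop (a sketch; the long result overflows for n above 20 on typical machines):

long factorial(int n)
{
    long r = 1;
    for (int i = 1; i <= n; i++)
        r = i * r;              /* r := i.r in the paper's notation */
    return r;
}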
The statement for i ∈ I do S od picks elements from the (finite) set I in an arbitrary order and executes S once for each element.
Lemma 4.6 If I1 and I2 partition I (i.e. I = I1 ∪ I2 and I1 ∩ I2 = ∅) then the for loop refines to the pair of loops:
for i ∈ I do S od ≤ for i ∈ I1 do S od; for i ∈ I2 do S od
Proof: The proof is by induction on the size of the (finite) set I using transformations 4.2, 4.3
and 4.5.
By induction on this lemma we get the more general result:

Lemma 4.7 Suppose the finite set I is partitioned as ⋃_{j∈J} Ij where the sets Ij are disjoint. Then the for loop above refines to the double nested loop:
for i ∈ I do S od ≤ for j ∈ J do for i ∈ Ij do S od od
of SPEC(p) where the value of p is “smaller” according to some well-founded ordering. First we introduce an if statement to take out special cases. We may assume that SPEC(p) is only called when V(p) is true, since an invalid p can have no valid extensions. We know that SPEC(p) ≈ process(p) if C(p) is true, so we introduce an if statement which tests C(p). If C(p) is false then all the values we want to process must be strictly greater than p, so we have:
SPEC(p) ≈ if C(p) then process(p)
else for q ∈ { p ++ q | q ∈ D* ∧ V(p ++ q) ∧ C(p ++ q) } do process(q) od fi
where we know that each element of the set we loop over will be longer than p. So we can write this set as a union of disjoint subsets:
⋃_{t∈D} { p ++ ⟨t⟩ ++ q ∈ D* | V(p ++ ⟨t⟩ ++ q) ∧ C(p ++ ⟨t⟩ ++ q) }.
This means we can refine the loop to a double loop (by Lemma 4.7):
SPEC(p) ≤
if C(p) then process(p)
else for t ∈ D do
for q ∈ { p ++ ⟨t⟩ ++ q ∈ D* | V(p ++ ⟨t⟩ ++ q) ∧ C(p ++ ⟨t⟩ ++ q) } do
process(q) od od fi
If ¬V(p ++ ⟨t⟩) then { p ++ ⟨t⟩ ++ q ∈ D* | V(p ++ ⟨t⟩ ++ q) ∧ C(p ++ ⟨t⟩ ++ q) } = ∅ and the for loop refines to skip:
SPEC(p) ≤
if C(p) then process(p)
else for t ∈ D do
if V(p ++ ⟨t⟩)
then for q ∈ { p ++ ⟨t⟩ ++ q ∈ D* | V(p ++ ⟨t⟩ ++ q) ∧ C(p ++ ⟨t⟩ ++ q) } do
process(q) od fi od fi
So we have:
SPEC(p) ≤ if C(p) then process(p)
else for t ∈ D do
if V(p ++ ⟨t⟩) then SPEC(p ++ ⟨t⟩) fi od fi
We know that the set { p ++ q | q ∈ D* ∧ V(p ++ q) ∧ C(p ++ q) } is finite, so there is an upper limit to the length of valid sequences, say L. So we can use L − ℓ(p) as a variant function and introduce recursion using Theorem 4.4:
SPEC(p) ≤ proc processall(p) ≡
if C(p) then process(p)
else for t ∈ D do
if V(p ++ ⟨t⟩) then processall(p ++ ⟨t⟩) fi od fi.
We have renamed the recursive procedure F, provided by Theorem 4.4, to processall and made p a parameter of this procedure.
D in order. This is because the value of t tells us which elements of D have yet to be processed; in fact D′ = { i ∈ ℕ | t < i ≤ D }:
SPEC ≤
var p := ⟨⟩ :
processall end
where
proc processall ≡
if C(p) then process(p)
else t := 1;
while t ≤ D do
if V(p ++ ⟨t⟩) then p := p ++ ⟨t⟩; processall; t ←last− p fi;
t := t + 1 od fi.
The procedure processall processes all valid extensions of (the global variable) p in a particular
order. Our specification is “incomplete” in the sense that it doesn’t specify the order in which the
valid and complete elements are to be processed. An implementation of the specification is thus
free to choose the most convenient order.
Restructure the procedure body as an action system:
proc processall ≡
actions A :
A ≡ if C(p) then process(p); call Z
else t := 1; call A1 fi.
A1 ≡ if t ≤ D then if V(p ++ ⟨t⟩) then call B1
else call A2 fi
else call Z fi.
A2 ≡ t := t + 1; call A1 .
B1 ≡ p := p ++ ⟨t⟩; processall; t ←last− p; call A2 . endactions.
This is now in the right form for applying the recursion removal transformation (Theorem 4.5). There is one “B-type” action (B1) which contains one recursive call. So S10 = p := p ++ ⟨t⟩ and S11 = t ←last− p; call A2 . Removing the recursion we get:
proc processall ≡
var L := ⟨⟩, m :
actions A :
A ≡ if C(p) then process(p); call F̂
else t := 1; call A1 fi.
A1 ≡ if t ≤ D then if V(p ++ ⟨t⟩) then call B1
else call A2 fi
else call F̂ fi.
A2 ≡ t := t + 1; call A1 .
B1 ≡ p := p ++ ⟨t⟩; L := ⟨0, ⟨1, 1⟩⟩ ++ L; call F̂ .
F̂ ≡ if L = ⟨⟩ then call Z
else m ←pop− L;
if m = 0 → call A
⊓ m = ⟨1, 1⟩ → t ←last− p; call A2 fi fi. endactions end.
5.3 Optimisation
As with the factorial algorithm (Section 4.7) we push 0 onto L and immediately pop it and call A.
So we can avoid the push and call A directly. As before, we now have a stack of identical elements
which could be implemented as an integer. In this case however, we can do even better, since the
length of L is the same as the length of p, so we can test p instead of L and remove L altogether:
proc processall ≡
actions A :
A ≡ if C(p) then process(p); call F̂
else t := 1; call A1 fi.
A1 ≡ if t ≤ D then if V(p ++ ⟨t⟩) then call B1
else call A2 fi
else call F̂ fi.
A2 ≡ t := t + 1; call A1 .
B1 ≡ p := p ++ ⟨t⟩; call A.
F̂ ≡ if p = ⟨⟩ then call Z
else t ←last− p; call A2 fi. endactions.
Remove the action system and restructure:
SPEC ≈
var p := ⟨⟩, t :
do if C(p) then process(p);
if p = ⟨⟩ then exit fi;
t ←last− p; t := t + 1
else t := 1 fi;
do if V(p ++ ⟨t⟩) ∧ t ≤ D then exit
elsif p = ⟨⟩ ∧ t > D then exit(2)
elsif t > D then t ←last− p; t := t + 1
else t := t + 1 fi od;
p := p ++ ⟨t⟩ od end
Take the first statement out of the loop and convert to a single loop. We will assume C(⟨⟩) is false (since otherwise no other sequences can be valid) and define C′(t, p) =DF C(p ++ ⟨t⟩), V′(t, p) =DF V(p ++ ⟨t⟩), process′(t, p) =DF process(p ++ ⟨t⟩).
var p := ⟨⟩, t := 1 :
do if V′(t, p) ∧ t ≤ D then p := p ++ ⟨t⟩;
if C(p) then process(p); t ←last− p; t := t + 1
else t := 1 fi
elsif p = ⟨⟩ ∧ t > D then exit
elsif t > D then t ←last− p; t := t + 1
else t := t + 1 fi od end
Push the statement p := p ++ ⟨t⟩ into the inner if statement:
var p := ⟨⟩, t := 1 :
do if V′(t, p) ∧ t ≤ D then if C′(t, p) then process′(t, p); t := t + 1
else p := p ++ ⟨t⟩; t := 1 fi
elsif p = ⟨⟩ ∧ t > D then exit
elsif t > D then t ←last− p; t := t + 1
else t := t + 1 fi od end
Finally, re-arrange the tests to make a while loop:
var p := ⟨⟩, t := 1 :
while p ≠ ⟨⟩ ∨ t ≤ D do
if t > D then t ←last− p; t := t + 1
elsif V′(t, p) then if C′(t, p) then process′(t, p); t := t + 1
else p := p ++ ⟨t⟩; t := 1 fi
else t := t + 1 fi od end
This is our basic backtracking algorithm. The derivation used only transformations which have
been proved to preserve the semantics ( Ward (1989), Ward (1991a), Ward (1992), Ward (1994))
so we can guarantee that this algorithm correctly implements the specification SPEC.
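The paper's appendix contains a C implementation of this algorithm; the following minimal sketch (our own naming, with the three predicates supplied by the user) shows how directly the derived loop transcribes when p is represented as an array plus a length:

#define D 8                  /* domain {1,...,D}: an assumed example size */
#define MAXLEN 64            /* assumed bound on sequence length */

static int p[MAXLEN];
static int len;              /* len plays the role of l(p) */

/* User-supplied: valid_(t) computes V'(t,p), complete_(t) computes C'(t,p),
   and process_(t) handles the valid, complete extension p ++ <t>. */
extern int valid_(int t);
extern int complete_(int t);
extern void process_(int t);

void processall(void)
{
    int t = 1;
    len = 0;
    while (len > 0 || t <= D) {
        if (t > D)            t = p[--len] + 1;  /* t <-last- p; t := t+1 */
        else if (valid_(t)) {
            if (complete_(t)) { process_(t); t++; }
            else              { p[len++] = t; t = 1; }
        }
        else t++;
    }
}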
6 Some Applications
This section outlines the problems which gave rise to the algorithms presented in this paper. These
problems concern the concrete representation, by functions or by sets, of algebraic structures.
Stone duality for Boolean algebras provides a prototype for such representations. We outline the
mathematical background shortly, but begin by describing the form of the backtracking algorithm
which we need.
Without loss of generality we may take X and Y to be sets of integers: X = {1, 2, . . . , #X} and Y = {1, 2, . . . , #Y}. We can represent a partial map Ψ : {1, 2, . . . , n} → Y (where n ≤ #X) as a sequence p of length n where p[i] = Ψ(i). A complete sequence is one of length #X and a valid sequence is a relation-preserving one. So we have the definitions:
Any subset of a (partial or total) relation preserving map is also relation preserving, so these
definitions clearly satisfy the conditions for the backtracking algorithm.
The iterative algorithm actually uses V′(t, p) (defined as V′(t, p) = V(p ++ ⟨t⟩)), which is only evaluated when V(p) is true. This means that we only need to check the pairs (x, y) where one or both of x or y is equal to t. So we can use the definitions:

where n = ℓ(p) + 1 (so V′(t, p) is testing if t is a valid image for n in the extension of p from {1, 2, . . . , n − 1} to {1, 2, . . . , n}).
For the implementation we only need to record the sizes of X and Y (in variables SX and SY). We represent the two sets of relations using two three-dimensional integer arrays RX and RY (we could use boolean arrays, but integer arrays are probably slightly faster to access and memory is not at a premium). The integer rho represents the relation ρ, where:
RX[rho, x, y] = 1 if x ρ y, and 0 otherwise;  RY[rho, x, y] = 1 if x ρ′ y, and 0 otherwise.
The relation-preserving test is implemented as a double-nested while loop which stores the result in
Boolean variable rp. The loops terminate as soon as rp becomes false, to avoid unnecessary testing.
The variable R records the number of relations. So we have the following testing procedure which sets rp to true iff V′(t, p) is true, provided n = ℓ(p) + 1 and V(p) is true:
proc DO(t, p, n) ≡
var np := 0 :
rp := true; rho := 1;
while rho ≤ R ∧ rp do
if RX[rho, n, n] = 1 ∧ RY[rho, t, t] = 0
then rp := false
else np := 1;
while np < n ∧ rp do
if (RX[rho, n, np] = 1 ∧ RY[rho, t, p[np]] = 0)
∨ (RX[rho, np, n] = 1 ∧ RY[rho, p[np], t] = 0)
then rp := false fi;
np := np + 1 od fi;
rho := rho + 1 od end.
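In C, with RX and RY declared as above, the test transcribes as follows (a sketch consistent with the loop counters as written here; the authors' distributed code may differ in detail). Arrays are indexed from 1 and the bounds are assumptions of the sketch:

#define MAXR 10              /* assumed bounds for the sketch */
#define MAXX 200
#define MAXY 40

int R;                                   /* number of relations */
int RX[MAXR + 1][MAXX + 1][MAXX + 1];    /* RX[rho][x][y] = 1 iff x rho y  */
int RY[MAXR + 1][MAXY + 1][MAXY + 1];    /* RY[rho][x][y] = 1 iff x rho' y */

/* Returns V'(t, p), assuming V(p) holds and n = l(p) + 1:
   only pairs involving the new point n need checking. */
int relation_preserving(int t, const int p[], int n)
{
    for (int rho = 1; rho <= R; rho++) {
        if (RX[rho][n][n] == 1 && RY[rho][t][t] == 0)
            return 0;
        for (int np = 1; np < n; np++)
            if ((RX[rho][n][np] == 1 && RY[rho][t][p[np]] == 0) ||
                (RX[rho][np][n] == 1 && RY[rho][p[np]][t] == 0))
                return 0;
    }
    return 1;
}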
Note that this version of the test will repeatedly search for the set of elements related to a particular element in X, for each relation. If the X relations are fairly sparse (not many pairs of elements related) then it will be more efficient to represent the X relations using two integer arrays rel_to_X and rel_X_to which record the following information:
1. Take P to be the 2-element distributive lattice 2 = ({0, 1}; ∨, ∧, 0, 1) (so 0 and 1 are treated
as nullary operations). Then A is the class D of {0, 1}-distributive lattices.
2. Take P to be the 2-element Boolean algebra ({0, 1}; ∨, ∧, ′, 0, 1). Then A is the class B of
Boolean algebras.
3. By taking P to be ({0, a, 1}; ∨, ∧, ∼, 0, 1), where ({0, a, 1}; ∨, ∧, 0, 1) is the {0, 1}-distributive
lattice with 0 < a < 1 and the negation operator ∼ satisfies ∼0 = 1, ∼1 = 0 and ∼a = a, we
obtain the class K of Kleene algebras.
In these and in other logic-based examples P has an underlying lattice structure, with the
operations ∨ and ∧ modelling disjunction and conjunction. We shall henceforth always assume
that P has a lattice reduct, since this simplifies the theory on which we rely (though we note
that interesting work lies ahead on classes of algebras where this restriction is not satisfied). In
many cases it happens that A = ISP(P ) coincides with the variety HSP(P ), where H denotes the
formation of homomorphic images. Then, by a famous theorem of G. Birkhoff, A can be specified
by a set of identities. This occurs for each of D, B and K above. (Where the quasivariety ISP(P) is
strictly smaller than the variety HSP(P ), all is not lost. However a more complicated representation
theory, using multi-sorted structures, is then required (see Davey & Priestley (1987)).)
Given A = ISP(P ), we may seek a concrete representation for the algebras in A . Another
important question to address is the determination of the free algebra F_A(n) on n generators (note that the free algebras in HSP(P) always lie in ISP(P), so that for this problem it is sufficient
to consider classes of the latter form). A major systematic study of the representation of algebras,
in a manner which represents the free algebras in a very natural way, was undertaken by B.A. Davey
and H. Werner in Davey & Werner (1983). Their theory encompasses many well-known dualities
(including Stone duality for B and Priestley duality for D) within a common framework. In the
present paper it is finite structures that concern us. Accordingly we shall restrict to the finite
algebras in A . This spares us having to introduce the topological machinery involved in representing
arbitrary algebras. The representation we require relies on an appropriate choice of a relational structure P∼ = (P; R) on the underlying set P of P. Given any set R of relations on P we extend each ρ ∈ R pointwise to powers of P. For each finite A ∈ A define the dual of A to be D(A), where D(A) is the set A(A, P) of A-homomorphisms from A into P, with relational structure inherited from P∼^A. Then, from X = D(A) we form the algebra E(X), defined to be the set of R-preserving maps from X into P∼, with algebraic structure inherited from P^X. Then the theory in Davey & Werner (1983) (specifically Theorem 1.18) implies that we have the following theorem.
Theorem 6.2 Assume that A = ISP(P) is a class of algebras such that P is a finite algebra with an underlying lattice structure. Then it is possible to choose R so that

1. A ≅ ED(A) for each finite A ∈ A, and

2. D(F_A(n)) = P∼^n (so that F_A(n) is the algebra of all R-preserving maps from P∼^n to P∼ (1 ≤ n < ∞)).

Further, R above can be chosen to consist of binary relations, each of which is a subalgebra of P^2.
If (1) in Theorem 6.2 holds we say that R yields a duality on (the finite algebras in) A. For Boolean algebras we take R = ∅, while for D we obtain Priestley duality by choosing R on {0, 1} to contain the single relation ≤, the inequality relation in which 0 < 1 (so 2∼ is the 2-element chain, qua ordered set). In these examples a suitable set R was recognised with hindsight, the dualities
being known before the Davey–Werner theory was developed. Since the publication of Davey
& Werner (1983) various techniques (notably Davey and Werner’s piggyback method) have been
devised which make it quite easy to identify a set R which will yield a duality on A . However such
a set may often be too large and complex to lead to a workable duality. This is strikingly illustrated
by subvarieties of the variety Bω of distributive p-algebras. These varieties were first discussed by K.B. Lee (Lee (1970)). He showed that the proper non-trivial subvarieties of Bω form a chain
B0 ⊂ B1 ⊂ . . . (in which B0 = B and B1 is the class known as Stone algebras). These varieties may be defined equationally. Alternatively, Birkhoff’s Subdirect Product Theorem implies that they are equivalently given by Bn = ISP(P_n), where P_n = (2^n ⊕ 1; ∨, ∧, *, 0, 1) denotes the n-atom Boolean lattice with a new top adjoined, and with an operation * of pseudocomplementation given by
a* = max { c | a ∧ c = 0 }.
To avoid degenerate cases we henceforth assume n ≥ 3. As shown in Davey & Priestley (1993a), a duality is obtained for Bn by taking P∼_n = (Pn; Rn), where the set Rn of relations consists of

(i) the graphs of 3 endomorphisms, e, f, g of P_n, and

(ii) a set Tn of subalgebras of P_n^2 indexed by the partitions of the integer n.

In (i), f and g are automorphisms and are determined by the permutations they induce on the atoms of P_n, viz. the cycles (1 2 . . . n) and (1 2). In (ii) |Tn| grows exponentially with n, and it is
natural to ask whether any proper subset of Rn still serves to yield a duality. A partition of n into k parts is a k-tuple (λ1, . . . , λk) of natural numbers where λ1 ≥ · · · ≥ λk and Σ_{i=1}^k λi = n. Two of the partition-induced relations in Tn are isomorphic as algebras precisely when the associated partitions have the same number of parts. An optimistic but reasonable conjecture was that a duality would be obtained by reducing Tn by selecting just one k-part partition for each k (giving n + 3 relations in total).
Given a set R of subalgebras of P^2 that is known to yield a duality for a class A = ISP(P), how might we test whether a proper subset R′ = R ∖ {ρ} still yields a duality? Certainly if the reduced set R′ fails to give A ≅ ED(A) for just one A ∈ A then the relation ρ cannot be discarded. Dropping a relation cannot decrease the size of ED(A), so the point at issue is whether the number of R′-preserving maps from D(A) to (P; R′) is greater than the size of the test algebra A. We say R′ yields a duality on A if no extra maps become allowable.
When trying to decide whether ρ ∈ R can be discarded a natural choice for a test algebra A is
ρ, by which we mean the relation ρ regarded as an algebra (remember that each of our relations is a subalgebra of P^2). Here (at last!) we have a computational problem: compare the size of A with the size of the set ED(A) of R′-preserving maps from D(A) into (P; R′). The very earliest version
of our backtracking algorithm (an implementation in VAX BASIC) successfully demonstrated that
none of the partition-induced relations could be discarded from the duality for B3 (as expected,
since each partition of 3 has a different number of parts). The critical test case for our conjecture
came with n = 4 and the relations ρ1 and ρ2 associated with the 2-part partitions (2, 2) and (3, 1).
Here D(ρ1) = D(ρ2) has 42 elements, |P4| = 17, and |R4| = 8. We sought to calculate the number of maps Ψ : D(ρi) → P∼_4 preserving R4 ∖ {ρi} (i = 1, 2). These calculations were successfully carried
through on a PC, but only after a judicious choice of ordering of the elements of the domain had
been made. Before indicating how element order affects the calculations we conclude the history
of the Bn problem.
It turned out that dropping either ρ1 or ρ2 did not destroy the duality on the test algebra ρ1 = ρ2
(though dropping both did). This negative result was consistent with the conjecture that only one
of the relations was needed, but did not prove it. At this point, examination of the computer output
provided sufficient insight to enable the conjecture to be confirmed mathematically for n = 4, and
subsequently for general n. Much more significantly it led Davey and Priestley ( Davey & Priestley
(1993b)) to prove a theorem of which the following is a special case.
Theorem 6.3 Assume A is as in Theorem 6.2, that R yields a duality on A, and let R′ = R ∖ {ρ} (ρ ∈ R). Then R′ yields a duality on A if and only if R′ yields a duality on the algebra ρ.
Thus testing for redundancy of any given relation in a duality is reduced to a finite problem,
solvable by application of the backtracking algorithm (subject of course to computational feasibil-
ity).
Table 2: Comparative timings for the B4 problem

Language          Machine    Time (simple)    Time (improved)
VAX BASIC         VAX        not available    not available
GW BASIC          286 PC     2 hrs 24 mins    not available
GW BASIC          386 PC     49 mins          not available
C (gcc compiler)  Sun 3/50   3.9 seconds      0.84 seconds
Turbo PASCAL      286 PC     7 mins 30 secs   5.1 secs
Turbo PASCAL      386 PC     1 min 52 secs    1.2 secs
C (gcc compiler)  Sparc 2    0.43 seconds     0.08 seconds
We now return to computational aspects of the B4 problem. The elements of the domain set fall into disjoint orbits under the action of the automorphisms f and g. Also, if Ψ is relation-preserving and i is mapped to Ψ(i), then f(i) must be mapped to f(Ψ(i)) and g(i) to g(Ψ(i)).
These observations lead us to order the domain in the following way. We start from an arbitrary
element, denoted 1, and take 2 = f (1), 3 = g(1) (unless f (1) = g(1)). Thereafter we pick as the
next element in the order the f - or g-image of the first listed element whose images have not already
been included until the orbit of 1 is exhausted. This process is repeated for the remaining orbits.
We thereby get a highly economical search tree, of 17,391 nodes. Note that the element ordering
heuristics of Section 7 are very good at finding such chains of relationships automatically—even
starting from a random permutation, they have so far always managed to find a better permutation
than the best “hand crafted” efforts! For example, a typical calculation for the B4 problem yielded
a search tree with 10,336 nodes.
Some comparative timings for the B4 problem for different implementations are shown in
Table 2.
The VAX BASIC version was the first to be implemented; it was never actually used for this problem, which is why the times are not available. The first implementation on a Sun 3/50 was
written in perl which is an interpreted language more suited to string processing than numerical
processing. This gave timings roughly similar to the GW BASIC version. Switching to a compiled
language, i.e. C on the Sun 3/50, produced a dramatic improvement in speed which encouraged us
to tackle some much larger examples.
Tables 3 and 4 illustrate the importance of a “good” permutation for this problem; Table 3 shows the effect of making small changes to a good permutation. The permutation for each entry in the table is produced by composing the previous permutation with a permutation of the form (i i+1 . . . j) where i < j. For example, composing (1 2 3 4 5 6 7 8 9) with (4 5 6) yields (1 2 3 5 6 4 7 8 9). We call this operation an insertion (sketched in C below).

[Table 3: The effect of random insertions on a “good” permutation for the B4 problem]

Table 4: Estimated and actual search tree sizes for random permutations (B4 problem)

Estimated            Actual             Estimated            Actual
138,568,972,267      > 10,000,000,000   637,837,147,299      > 10,000,000,000
190,500,933          190,764,514        16,911,522,238       > 10,000,000,000
174,773,383          174,780,145        530,343,262,662      > 10,000,000,000
55,741,733,482,643   > 10,000,000,000   740,322,532          724,756,716
868,614,966          867,753,321        259,217,324          257,759,780
17,217,321,614       > 10,000,000,000   32,654,546,016       > 10,000,000,000
122,313,962          121,979,114        28,751,130           28,464,800
557,074,692          554,723,651        441,680,890          441,982,864
275,179,396,064      > 10,000,000,000   27,659,921,478       > 10,000,000,000
31,348,549,817       > 10,000,000,000   16,474,273           16,314,730
3,574,075,557        3,568,600,866      106,754,057,688      > 10,000,000,000
2,342,038,092        2,347,470,755      6,362,152,171        6,417,860,672
454,900,453          454,932,869        14,821,585,522       > 10,000,000,000

Table 3 shows the effect of a sequence of random insertions starting with the permutation used above. Table 4 shows what happens to the same problem when a random permutation is chosen (computing this table required over 1 week of CPU time on a Sparc 2). Each pair of results is for a different random permutation with the same problem as above. The “estimated” values for the search tree were calculated using Knuth’s
backtracking estimation method (Knuth (1975)) with 1,000,000 probes; see Section 7.4 for details.
Using Knuth’s estimation method on 487 random permutations with 1,000,000 probes each yielded
an average search tree size of 3,145,968,416,638 nodes. This corresponds to an execution time (on a
Sparc 2) of about 3 years, while the hybrid method requires an average of 16 seconds and examines
a total of about 800,000 nodes.
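Concretely, the insertion operation used above is a left rotation of a segment of the permutation. A C sketch (1-based indexing to match the paper, with π stored in the array pi):

/* Apply the cycle (i i+1 ... j), i < j, to the permutation pi:
   remove pi[i] and re-insert it at position j, shifting the
   intervening elements left. pi[0] is unused. */
void insertion(int pi[], int i, int j)
{
    int moved = pi[i];
    for (int k = i; k < j; k++)
        pi[k] = pi[k + 1];
    pi[j] = moved;
}

Applying insertion(pi, 4, 6) to (1 2 3 4 5 6 7 8 9) gives (1 2 3 5 6 4 7 8 9), as in the example above.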
For a larger problem of the same type (testing a duality for optimality) with a 153 element domain, a 33 element range and 7 relations, the average search tree size for a random permutation is around 5 × 10^35, which indicates an execution time of 5 × 10^23 years (about 30 million million times the age of the visible universe). The hybrid method reduces this to 3,709,801 nodes and 763 seconds.
2∼ is always viable for m ≤ 2^6, and is viable in many instances for m ≤ 2^7, but with worst-case behaviour which renders our algorithm impractical in those cases.
Now assume, as in 6.2, that a set R of relations on P gives a duality for a class of algebras A = ISP(P). By Theorem 6.2, the free algebra F_A(n) is given by the R-preserving maps from P∼^n to P∼, where P∼ = (P; R). Both theory and experience tell us that for algebras arising in algebraic logic (the classes Bn, various classes of Heyting algebras, etc.) the norm is that these free algebras grow exceedingly rapidly with n, the more so if |P| > 2. For example, |F_B(n)| = 2^(2^n), and |F_K(3)| = 43,918 while |F_K(4)| = 160,297,985,276 (Berman & Mukaidono (1984)). Nevertheless we have successfully used our algorithm on some problems of this sort: see for example Priestley (1992).
The relation-preserving maps algorithm was devised to enable a given duality on a class of
algebras A to be tested for optimality. Initially the input data files were set up by laborious hand
calculation. In more recent applications of the technique ( Davey & Priestley (1992), Priestley
(1992)) the backtracking algorithm has been used to generate these data files. To see why this
might be possible, recall that a domain set D(A) (as in 6.2) is itself a set of maps, namely the
A -homomorphisms from A into P . Such maps are just those which preserve the relations (not
in general binary) which are the graphs of the operations. Of course, the procedures described in
Section 6.1 easily adapt to relations of different arities. In the examples so far analysed, the algebras
in A have always had an underlying distributive lattice structure, so that the full machinery of
Priestley duality has been at our disposal. This has allowed us to work not with homomorphisms
but with their dual equivalents, which are certain order-preserving maps; this is done throughout
Davey & Priestley (1992), Davey & Priestley (1993a), Davey & Priestley (1993b) and Priestley
(1992). We thus gain the benefit of the “logarithmic” feature of the duality. Further, duality has
often allowed us to identify explicitly relations from a theoretical algebraic description (this was
done, for example, for Bn in Davey & Priestley (1993a)). These calculations, once again, are done
by a suitable application of the algorithm given in Section 6.1. The programs used here form part
of a package which is an invaluable toolkit for anyone investigating algebras with an underlying
distributive lattice structure. This kit includes, in particular, facilities for finding (in many classes
of algebras) homomorphisms, congruences, subalgebras and retracts, and for isomorphism-testing.
Duality theory for distributive lattices has been very extensively used, partly because, being
pictorial, it is exceedingly easy to work with. Representations do exist for arbitrary lattices which
generalise that provided by Priestley duality. That given by A. Urquhart ( Urquhart (1977))
replaces ordered sets by structures with two quasi-order relations, while G. Hartung’s theory (
Hartung (1992)) employs the formalism of concept analysis (introduced by R. Wille in Rival (1982),
pp. 445–470. See also Ganter, Wille & Wolff (1987) or Chapter 11 of Davey & Priestley (1990)).
The backtracking algorithm is ideally suited to making these representations into a practical tool.
With the aid of this package, it is possible completely to automate the generation of optimal natural
dualities in a large number of cases, for example, for certain Heyting algebra varieties (see Davey
& Priestley (1993a)). Each such calculation requires many different subroutines, each of which
employs backtracking in a different way.
7 Element Order Selection

The order in which the elements are considered can affect the size of the search tree dramatically, as illustrated in Tables 3 and 4. In this section we will discuss various heuristics which we have used to select the element order: these have frequently enabled us to complete calculations which would otherwise be totally unfeasible.
We consider the same specification as in Section 5 except that all sequences are the same length, N, and a valid sequence is one which has no “unfilled” positions, i.e. C(p) =DF ∀i, 1 ≤ i ≤ N. p[i] ≠ ⊥, where ⊥ is a new element (not in D) which is used to represent an unfilled position.
SPEC =DF for p ∈ { p ∈ D* | V(p) ∧ C(p) ∧ ℓ(p) = N } do process(p) od
Instead of extending this to process all valid extensions of p, we need to process all valid completions of p, where an incomplete sequence is a “partially filled array”. Let p ∈ (D ∪ {⊥})* be of length N. Then we define:
SPEC(p) =DF for q ∈ { q ∈ D* | p ⊑ q ∧ V(q) ∧ C(q) ∧ ℓ(q) = N } do process(q) od
where p ⊑ q means that q is p with some of its unfilled elements filled in, i.e. p ⊑ q =DF ∀i. 1 ≤ i ≤ N. (p[i] = ⊥ ∨ p[i] = q[i]).
By using a permutation π : {1, 2, . . . , N} → {1, 2, . . . , N} and a variable n to record how many elements of p are filled we can avoid the need for the extra element ⊥. The permutation also records in which order the array is to be filled. The n elements p[π[1]], . . . , p[π[n]] of p are filled, and the N − n elements p[π[n + 1]], . . . , p[π[N]] of p are currently unfilled. A derivation similar to that in Section 5 yields the following algorithm:
var n := 0, t := 1 :
while n > 0 ∨ t ≤ D do
if t > D then t := p[π[n]]; n := n − 1; t := t + 1
elsif V′(t, p, π, n) then if n = N then process′(t, p); t := t + 1
else n := n + 1; p[π[n]] := t; t := 1 fi
else t := t + 1 fi od end
This gives the same result as SPEC for any permutation π : {1, 2, . . . , N} → {1, 2, . . . , N}. In fact we can permute the elements π[n + 1], . . . , π[N] at any time during the execution of the program.
The selection of suitable values for π is crucial: we have used two basic heuristics to achieve
this, which may be combined to form a third hybrid method. These are called “pre-analysis” and
“bush pruning”.
In the algorithms below we fill in the array p in the order given by π. However, in the C implementation we update the arrays rel_X_to and rel_to_X whenever π changes (in effect, changing the map between integers and elements of X). This improves efficiency by eliminating most of the accesses to π.
2. Pick the permutation whose search tree has the smaller number of leaf elements; if they have
the same number of leaf elements, pick the permutation whose total search tree is smaller.
We use a version of the basic backtracking algorithm to count the size of the search tree and number of leaf elements for the partial permutation π[mindepth . . maxdepth], where the elements π[1 . . mindepth − 1] have already been “frozen” (see below). This terminates immediately if the number of trials required for the calculation exceeds cutoff (each evaluation of V′(t, p, π, n) is one “trial”, since these evaluations dominate the whole computation):
proc calculate(mindepth, maxdepth, cutoff) ≡
count := 0; trials := 0;
var n := mindepth, t := 1 :
while (n > 0 ∨ t ≤ D) ∧ trials ≤ cutoff do
if t > D then t := p[π[n]]; n := n − 1; t := t + 1
else trials := trials + 1;
if V′(t, p, π, n) then if n = maxdepth
then count := count + 1; t := t + 1
else n := n + 1; p[π[n]] := t; t := 1 fi
else t := t + 1 fi fi od end.
We repeatedly test random “shifts” (where at least one of i or j must be within the partial
permutation) to see if the partial permutation can be improved. After a certain number of failed
attempts we assume that this is the best we can do for this size of partial permutation, so we
increase the size by adding one element, and then attempt to improve this larger permutation.
Note that in general, the “best” permutation of size n + 1 is not a simple extension of the “best”
permutation of size n. We have a “budget” which limits the number of times we want to evaluate
the V 0 function (since this is the most expensive part of the algorithm). Once this budget has been
used up we “freeze” the current partial permutation, and start building a new partial permutation
with the remaining elements. Once all the elements have been used up we put together the “frozen”
partial permutations to get a complete permutation which is used to attempt a full calculation. We
have another budget for the full calculation and if this is exhausted before the calculation finishes
then we halt the full calculation, double both analysis and calculation budgets and start again from
scratch. The program prints messages as it proceeds (which may be captured in a log file) so that
the user can monitor its progress.
Thus the pre-analysis routine works by increasing the size of the current partial permutation, stored in π[mindepth . . depth], by incrementing depth, and then adjusting the permutation to minimise the search tree. The subroutine find_good_depth_element tries inserting each of the elements π[depth + 1] to π[SX] in position depth to find the best one. It updates global variables count and trials with the number of leaf nodes in the search tree for π[mindepth . . depth] and the total number of nodes in the tree. find_good_insert repeatedly picks a random pair of elements in π (at least one of which must be within π[mindepth . . depth]), inserts one in place of the other, and tests if this improves the permutation. It terminates when it runs out of budget (the total number of trials allowed), or it has tried maxgoes insertions without improving the partial permutation. It sets the global variable inserts_done to the number of good insertions found.
mindepth := 1;
for depth := 1 to SX − 1 step 1 do
do find_good_depth_element(mindepth, depth, maxtrials/10);
best_result := count; best_trials := trials;
if best_trials > maxtrials/10
then print(“Best next element exceeded:”, maxtrials/10,
“ trials at depth:”, depth, mindepth);
mindepth := depth;
for i := 1 to mindepth step 1 do p[i] := 0 od
else exit fi od;
budget := maxtrials;
printinfo(mindepth, depth, best_trials, best_result);
if depth > mindepth
then retries := 1;
do budget := maxtrials − best_trials;
find_good_insert(mindepth, depth, best_trials, best_result);
best_result := count; best_trials := trials;
printinfo(mindepth, depth, best_trials, best_result);
retries := retries + 1;
if retries > maxretries ∨ inserts_done = 0 then exit fi od fi od
The assignments to p[i] when mindepth is increased are to indicate to the calculate routine that relations involving these elements do not have to be preserved. The element “0” can be thought of as a new element, added to the set Y, which is related to itself and everything else in every relation in RY.
The procedure printinfo prints a status report on the progress of the calculation; this includes the total CPU time used, and the CPU time used since the last status report.
Once this routine terminates (when depth = SX − 1) we use the calculate routine with a budget of total_trials (the total number of trials used by the pre-analysis). If this fails by exceeding its
budget then we double maxtrials and maxgoes and run the pre-analysis again with this larger
budget. This will hopefully result in a better permutation for the next full calculation, which in
any case will have a larger budget to use. Thus our time will be divided roughly equally between
pre-analysis and attempted calculations.
Note that this heuristic may take up to four times longer than necessary if an attempted
calculation runs out of budget “just before” it would have completed. Also, it is not always easy
to see from the status reports how much more time will be required to finish the calculation. A
useful by-product of this method is a printout of the best permutation found.
7.3 Bush Pruning

The second heuristic improves the permutation dynamically as the search proceeds; its cost is measured in terms of the total number of trials, i.e. evaluations of V′(t, p, π, n).
The algorithm is based on the calculate algorithm with the bush pruning code added. The procedure find_good_element(n) picks the element to put in π[n] which has the smallest number of relation-preserving images in Y. find_bush_size(n) sets bush to the size of a “suitable” bush for pruning: in other words, adding bush elements to the current partial permutation results in a search tree containing about bush_budget/bush_goes nodes. prune_bush(n, bush) then uses up bush_budget trials in attempting to improve the part of the permutation between n and n + bush.
bush_trials := 0; count := 0;
n := 0; t := 1;
do if t ≤ D
then total_trials := total_trials + 1;
if V′(t, p, π, n)
then if n = SX then count := count + 1; t := t + 1
else p[π[n]] := t; n := n + 1; t := 1;
if n < 3N/4 ∧ total_trials > next_bush[n]
then last_bush := bush_trials;
find_bush_size(n);
prune_bush(n, bush);
update_next_bush() fi fi
else t := t + 1 fi
else n := n − 1;
if n = 0 then exit fi;
t := p[π[n]] + 1 fi od
where
proc update_next_bush() ≡
for i := n to n + bush step 1 do
next_bush[i] := total_trials + 5(bush_trials − last_bush) od.
Note that if find_bush_size extends the bush to the rest of the set then the current p element has been completely fathomed (all its valid extensions have been discovered). Also if find_bush_size or prune_bush ever reach a bush with no leaves (no valid extensions up to n + bush) then there can be no complete and valid extensions of the current p. In either case we can jump immediately to the step n := n − 1, and this is what the C implementation does. The C implementation also prints a regular status report (after each bush_pr_step trials).
7.4 Heuristics based on Knuth’s Estimation Algorithm

In Knuth (1975), Knuth presents a method for estimating the size of the search tree of a simple backtracking algorithm. The method is based on making a number of random “probes” into the tree, picking a random path at each stage, and computing the weighted total of the cost of the calculation carried out at each node:

C = c() + d0 c(x1) + d0 d1 c(x1, x2) + d0 d1 d2 c(x1, x2, x3) + · · ·

Here c(x1, . . . ) is the cost of the computation at the node p = ⟨x1, . . . ⟩ (in our case these costs are all the same so we set them all to 1), and d0 is the number of initial elements x such that the sequence ⟨x⟩ is valid. One of these elements, x1, is chosen at random. For each valid sequence p = ⟨x1, . . . , xi⟩, di+1 is the number of elements x which can be added to p to get a valid sequence. One of these elements, xi+1, is chosen at random. The procedure terminates when di is zero. C is the cost estimate from this probe.
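A single probe is cheap to implement. In the C sketch below (our own naming; children is an assumed user-supplied helper that writes the valid extensions of the current node into buf and returns how many there are), all node costs are 1, so the probe returns 1 + d0 + d0 d1 + · · ·:

#include <stdlib.h>

#define D 8                      /* assumed domain size */
#define MAXLEN 64                /* assumed bound on path length */

extern int children(const int p[], int n, int buf[]);

double knuth_probe(void)
{
    int p[MAXLEN], buf[D];
    int n = 0;
    double weight = 1.0;         /* running product d0 * d1 * ... */
    double C = 1.0;              /* cost of the root */

    while (n < MAXLEN) {
        int d = children(p, n, buf);
        if (d == 0) break;       /* no valid extensions: probe ends */
        weight *= d;
        C += weight;
        p[n++] = buf[rand() % d];   /* follow one child at random */
    }
    return C;                    /* average over many probes */
}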
Knuth provides two “proofs” (‘at least one of which should be convincing’ ) that the expected
value of C is the cost of the complete backtracking search. He says that the method has been tested
on dozens of applications and has ‘consistently performed amazingly well, even on problems which
were intended to serve as bad examples. In virtually every case the right order of magnitude for
the tree size was found after ten trials.’ He discusses one experiment in detail (a “knight’s tour”
problem) where averaging over 1,000 random walks produced an estimate within 0.5% of the actual
answer of 3,137,317,290.
This algorithm appears to provide an ideal method for determining which of two permutations
is better (and hence for finding a good permutation), one which, unlike our previous methods, takes
the whole permutation into account. Unfortunately, for most of our relation-preserving maps problems
the estimates were less accurate than we had hoped: even averaging over 100,000 probes,
and taking several minutes of CPU time on a Sparc 2, the estimates would vary by a factor of two
or more, with some problems giving wildly inaccurate estimates.
Despite these discouraging results, we implemented a permutation-selection algorithm based
on Knuth’s estimation method. The algorithm tests various potential insertions, using Knuth’s
method to see whether the permutation has improved. As soon as the permutation appears to provide a
feasible search tree, the algorithm attempts a calculation. If this fails (by taking more than twice
the estimated number of trials) we assume that averaging over more random walks would give
a better estimate. We therefore increase the number of random walks (by, say, 20% to 50% at a
time) until the estimate is greater than twice the old estimate (we know that the actual search tree
size is at least this big). We define a “feasible” search tree to be one which is estimated to take less
than a quarter of the total number of trials we have carried out in the analysis so far: this means
that as the search for a good permutation takes more and more time, the feasibility requirement is
steadily relaxed.
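The following C sketch shows the shape of this trial-and-escalate loop. It reuses the calculate routine listed in the appendix (with its cutoff parameter) and the hypothetical knuth_estimate sketch above; analysis_trials, a running total of all trials spent so far, is likewise an assumption rather than a name from our code.

/* Sketch (not the authors' code): attempt a calculation whenever the
 * current permutation's estimated tree is "feasible", escalating the
 * number of probes each time a trial calculation blows its cutoff. */
extern double calculate();        /* (mindepth, maxdepth, cutoff) */
extern double knuth_estimate();   /* (probes), sketched above */
extern double trials;             /* trial counter maintained by calculate */
extern double analysis_trials;    /* hypothetical: trials spent so far */
extern int SX;                    /* size of the domain */

double try_current_permutation(probes)
int probes;
{
    double estimate = knuth_estimate(probes);

    for (;;) {
        /* "Feasible" = estimated to cost less than a quarter of the
         * analysis effort expended so far. */
        if (estimate >= analysis_trials / 4.0)
            return -1.0;          /* not feasible: test more insertions */

        /* Attempt the calculation, allowing twice the estimate. */
        {
            double count = calculate(0, SX, 2.0 * estimate);
            analysis_trials += trials;
            if (trials <= 2.0 * estimate)
                return count;     /* the calculation completed */
        }

        /* Failed: average over more random walks until the estimate at
         * least doubles (the true tree is known to be this big). */
        {
            double old = estimate;
            while (estimate <= 2.0 * old) {
                probes += probes / 3;      /* increase by about a third */
                estimate = knuth_estimate(probes);
            }
        }
    }
}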
Unfortunately, this heuristic method was a dismal failure! The main problems are:
1. Occasional false underestimates (even with a large number of probes): these cause the algorithm
to think a particular random insertion is better when it is probably worse, so it takes a step
away from optimality.
2. The large number of probes required means that only a small number of insertions can be
tested, so the algorithm takes only a few steps towards optimality.
3. On the larger problems, rather than finding a “good” permutation, the algorithm merely finds
permutations for which Knuth’s method consistently underestimates the result. For example,
every estimate might be around 10⁸ to 10⁹ while the actual tree is orders of magnitude greater
than 10⁹ nodes. The effect is that it keeps attempting trial calculations which fail and cause the
number of probes to be increased, to such a degree that the program effectively “grinds to
a halt”. In one case, increasing the sample size from 5,000 to 7,500 caused the estimate to
change from a quite feasible 10⁷ trials to a totally impractical 10²⁰ trials. The algorithm then
“improved” this permutation by finding a new one for which Knuth’s method underestimates
the tree size.
These problems persist even when averaging over a very large number of probes: with 100,000
samples, for example, it takes several minutes of CPU time on a Sparc 2 to test a single
insertion. As a result, this method has been abandoned, though the code is available from the
authors. Table 5 compares the bush pruning method with the method based on Knuth’s algorithm
for some of our smallest examples.
Table 5: Bush Pruning compared to the method based on Knuth’s estimation algorithm. (“Problem
size” refers to the size of the domain and range).
including most of the ones used in the program derivations above. Once the remaining
transformations have been implemented it will be possible to carry out the derivation interactively: starting
from the formal specification, invoking a sequence of proven transformations and refinements,
with the system checking all correctness conditions at each stage, and finally translating the resulting
(executable) WSL code into a suitable programming language, such as C. One interesting observation
from our experience with manual transformation is that the kinds of (clerical and
logical) errors made in deriving an algorithm tend to be the sort of errors (for example, writing <
instead of >) which are uncovered by the first few test cases. Once these errors have been
corrected, the programs invariably pass all test cases with flying colours. This contrasts with typical
programming bugs, which tend to be subtle and extremely difficult to track down.
References
Arsac, J. (1982a): Transformation of Recursive Procedures. In: Neel, D. (ed.) Tools and Notations for
Program Construction. Cambridge University Press, Cambridge, pp. 211–265
Arsac, J. (1982b): Syntactic Source to Source Program Transformations and Program Manipulation. Comm.
ACM 22, 1, pp. 43–54
Berman, J. & Mukaidono, M. (1984): Enumerating fuzzy switching functions and free Kleene algebras.
Comput. Math. Appl. 10, pp. 25–35
Bull, T. (1990): An Introduction to the WSL Program Transformer. Conference on Software Maintenance
26th–29th November 1990, San Diego
Butler, G. & Lam, C. W. H. (1985): A General Backtrack Algorithm for the Isomorphism Problem of
Combinatorial Objects. J. Symb. Comput.
Davey, B. A. & Priestley, H. A. (1987): Generalised piggyback dualities, with applications to Ockham algebras.
Houston J. Math. 13, pp. 151–198
Davey, B. A. & Priestley, H. A. (1990): Introduction to Lattices and Order. Cambridge University Press,
Cambridge
Davey, B. A. & Priestley, H. A. (1992): Optimal dualities for varieties of Heyting algebras. preprint
Davey, B. A. & Priestley, H. A. (1993a): Partition-induced natural dualities for varieties of pseudocomple-
mented distributive lattices. Discrete Math. 113, pp. 41–58
Davey, B. A. & Priestley, H. A. (1993b): Optimal natural dualities. Trans. Amer. Math. Soc. 338, pp. 655–
677
Davey, B. A. & Werner, H. (1983): Dualities and equivalences for varieties of algebras. In: Huhn, A. P. &
Schmidt, E. T. (eds.) Contributions to lattice theory (Szeged, 1980). (Colloq. Math. Soc. János Bolyai
no. 33.) North-Holland, Amsterdam, pp. 101–275
Dijkstra, E. W. (1976): A Discipline of Programming. Prentice-Hall, Englewood Cliffs, NJ
Ganter, B., Wille, R. & Wolff, K. (eds.) (1987): Beiträge zur Begriffsanalyse. B.I. Wissenschaftsverlag,
Mannheim, Zürich
Gerhart, S. L. & Yelowitz, L. (1976): Control Structure Abstractions of the Backtracking Programming
Technique. IEEE Trans. Software Eng. SE 2, 4, pp. 285–292
Hartung, G. (1992): A topological representation of lattices. Algebra Universalis 29, pp. 273–299
Hoare, C. A. R., Hayes, I. J., Jifeng, H. E., Morgan, C. C., Roscoe, A. W., Sanders, J. W., Sørensen, I. H.,
Spivey, J. M. & Sufrin, B. A. (1987): Laws of Programming. Comm. ACM 30, 8, pp. 672–686
Knuth, D. E. (1974): Structured Programming with the GOTO Statement. Comput. Surveys 6, 4, pp. 261–
301
Knuth, D. E. (1975): Estimating the Efficiency of Backtracking Algorithms. Math. of Comput. 29, 129,
pp. 121–136
Knuth, D. E. & Szwarcfiter, J. L. (1974): A Structured Program to Generate All Topological Sorting Ar-
rangements. Inform. Process. Lett. 2, pp. 153–157
Lee, K. B. (1970): Equational classes of distributive pseudo-complemented lattices. Canad. J. Math. 22,
pp. 881–891
Morgan, C. C. (1994): Programming from Specifications. Prentice-Hall, Englewood Cliffs, NJ. Second Edition
Priestley, H. A. (1992): Natural dualities for varieties of distributive lattices with a quantifier. Proceedings
of the 38th Banach Centre Semester on Algebraic Logic and Computer Science Applications, to appear
Rival, I. (ed.) (1982): Ordered Sets. Reidel, Dordrecht
Roever, W. P. de (1978): On Backtracking and Greatest Fixpoints. In: Neuhold, E. J. (ed.) Formal Descrip-
tion of Programming Constructs. North-Holland, Amsterdam, pp. 621–636
Stallman, R. M. (1989): Using and Porting GNU CC. Free Software Foundation, Inc.
Taylor, D. (1984): An Alternative to Current Looping Syntax. SIGPLAN Notices 19, 12, pp. 48–53
Urquhart, A. (1977): A topological representation theory for lattices. Algebra Universalis 8, pp. 45–58
Walker, R. J. (1960): An Enumerative Technique for a class of Combinatorial Problems. In: Bellman, R. E.
& Hall Jr., M. (eds.) Proceedings of Symposia in Applied Mathematics 10: Combinatorial Analysis. Am.
Math. Soc., Providence R.I.
Ward, M. (1989): Proving Program Refinements and Transformations. Oxford University, DPhil Thesis
Ward, M. (1990): Derivation of a Sorting Algorithm. Durham University, Technical Report. ⟨http://www.dur.ac.uk/~dcs0mpw/martin/papers/sorting-t.ps.gz⟩
Ward, M. (1991a): Specifications and Programs in a Wide Spectrum Language. Submitted to J. Assoc.
Comput. Mach.
Ward, M. (1991b): A Recursion Removal Theorem—Proof and Applications. Durham University, Technical
Report. ⟨http://www.dur.ac.uk/~dcs0mpw/martin/papers/rec-proof-t.ps.gz⟩
Ward, M. (1992): A Recursion Removal Theorem. Springer, New York Berlin Heidelberg. Proceedings of
the 5th Refinement Workshop, London, 8th–11th January. ⟨http://www.dur.ac.uk/~dcs0mpw/martin/papers/ref-ws-5.ps.gz⟩
Ward, M. (1994): Foundations for a Practical Theory of Program Refinement and Transformation. Durham
University, Technical Report. ⟨http://www.dur.ac.uk/~dcs0mpw/martin/papers/foundation2-t.ps.gz⟩
Ward, M. (1993): Abstracting a Specification from Code. J. Software Maintenance: Research and Practice 5,
2, John Wiley & Sons, pp. 101–122. ⟨http://www.dur.ac.uk/~dcs0mpw/martin/papers/prog-spec.ps.gz⟩
Ward, M. (1996): Derivation of Data Intensive Algorithms by Formal Transformation. IEEE Trans. Software
Eng. 22, 9, pp. 665–686. ⟨http://www.dur.ac.uk/~dcs0mpw/martin/papers/sw-alg.ps.gz⟩
Ward, M. & Bennett, K. H. (1993): A Practical Program Transformation System For Reverse Engineering.
Working Conference on Reverse Engineering, May 21–23, 1993, Baltimore, MD. ⟨http://www.dur.ac.uk/~dcs0mpw/martin/papers/icse.ps.gz⟩
Ward, M. & Bennett, K. H. (1995): Formal Methods for Legacy Systems. J. Software Maintenance: Research
and Practice 7, 3, John Wiley & Sons, pp. 203–219. ⟨http://www.dur.ac.uk/~dcs0mpw/martin/papers/legacy-t.ps.gz⟩
Ward, M., Calliss, F. W. & Munro, M. (1989): The Maintainer’s Assistant. Conference on Software Maintenance
16th–19th October 1989, Miami Florida. ⟨http://www.dur.ac.uk/~dcs0mpw/martin/papers/MA-89.ps.gz⟩
Wiedemann, D. A. (1991): A computation of the 8th Dedekind number. Order 8, pp. 5–6
Wells, M. B. (1971): Elements of Combinatorial Computing. Pergamon Press, New York
Wirth, N. (1971): Program Development by Stepwise Refinement. Comm. ACM 14, 4, pp. 221–227
double
calculate (mindepth, maxdepth, cutoff)
int mindepth, maxdepth;
double cutoff;
{
/* Calculate and return no. of relation-preserving maps:
* use mindepth to maxdepth elements of domain,
* terminate as soon as no. of trials exceeds cutoff
*/
double count;
int rho;
register short rp, n, t, np;
register int ii, numrels;
trials = 0;
count = 0;
n = mindepth;
t = 1;
for (;;) { /* loop calc_do1: */
if (t <= SY) {
trials = trials + 1;
if (trials > cutoff) {
goto calc_od1;
}
/* Start of relpres subroutine:
 * sets rp = 1 iff relpres(psi,t,n) holds, where relpres(psi,t,n)
 * is equivalent to: psi[n] := t; relpres(psi,n).
 * t is the candidate value for psi[n]; relpres(psi,n-1) is
 * already known to be true. */
rp = 1;
for (rho = 0; rho < R; rho++) {
/* test relation rho: */
/* Test elements relating to X */
numrels = rel_to_X[rho][n][0];
for (ii = 1; ii <= numrels; ii++) {
np = rel_to_X[rho][n][ii]; /* np is the iith elt related to n */
if (np < n) { /* psi[np] is defined: */
    if (RY[rho][t][psi[np]] == 0) {
        rp = 0;
        goto calc_end_relpres;
    }
} else if (np == n) { /* n is related to itself: check t is related to itself */
    if (RY[rho][t][t] == 0) {
        rp = 0;
        goto calc_end_relpres;
    }
} else { /* np > n, ie no more elts related to n */
    goto calc_end_inner_1;
}
} /* End of inner for loop */
calc_end_inner_1:
/* Test elements X relates to */
numrels = rel_X_to[rho][n][0];
for (ii = 1; ii <= numrels; ii++) {
np = rel_X_to[rho][n][ii]; /* np is the iith elt n relates to */
if (np < n) { /* psi[np] is defined: */
if (RY[rho][psi[np]][t] == 0) {
rp = 0;
goto calc_end_relpres;
}
} else { /* already checked n relates to n case */
goto calc_end_inner_2;
} /* fi np < n */
} /* End of inner for loop */
calc_end_inner_2:;
} /* End outer for loop, next relation */
/* End of relpres subroutine */
calc_end_relpres:
if (rp == 1) {
if (n == maxdepth) {
count++;
t++;
} else {
psi[n] = t;
n++;
t = 1;
}
} else {
t++;
}
} else { /* from (t <= SY) */
n--;
if (n < mindepth) {
goto calc_od1;
}
t = psi[n] + 1;
} /* fi from (t <= SY) */
}
calc_od1:
return (count);
} /* End of calculate(mindepth, maxdepth, cutoff) */
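A typical call of calculate, counting all relation-preserving maps with an effectively unlimited trial budget, might look as follows. This usage sketch assumes that the globals used by calculate (psi, SX, SY, R, the relation tables and trials) have been initialised elsewhere, and that trials is a double.

#include <stdio.h>

extern double calculate();   /* (mindepth, maxdepth, cutoff), as above */
extern double trials;        /* trial counter maintained by calculate */
extern int SX;               /* size of the domain */

/* Count all relation-preserving maps, with a cutoff large enough that
 * early termination is never triggered in practice. */
void count_all_maps()
{
    double maps = calculate(0, SX, 1.0e18);
    if (trials > 1.0e18)
        printf("search abandoned after %.0f trials\n", trials);
    else
        printf("%.0f maps found in %.0f trials\n", maps, trials);
}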