Sudoku, Gerechte Designs, Resolutions, Affine Space, Spreads, Reguli, and Hamming Codes
{r2, c1, s1}  {r2, c2, s1}  {r2, c3, s5}  {r2, c4, s2}  {r2, c5, s2}
{r3, c1, s4}  {r3, c2, s5}  {r3, c3, s5}  {r3, c4, s5}  {r3, c5, s2}
{r4, c1, s4}  {r4, c2, s4}  {r4, c3, s5}  {r4, c4, s3}  {r4, c5, s3}
{r5, c1, s4}  {r5, c2, s4}  {r5, c3, s3}  {r5, c4, s3}  {r5, c5, s3}

3 4 5 1 2
5 1 2 3 4
2 3 4 5 1
4 5 1 2 3
1 2 3 4 5

3 5 2 1 4
2 1 4 5 3
4 3 5 2 1
5 4 1 3 2
1 2 3 4 5

3 1 5 2 4
2 5 4 1 3
4 3 2 5 1
5 4 1 3 2
1 2 3 4 5

2 1 5 3 4
3 5 4 2 1
4 3 1 5 2
5 4 2 1 3
1 2 3 4 5

3 1 5 2 4
2 5 4 1 3
4 3 1 5 2
5 4 2 1 3
1 2 3 4 5

2 4 1 5 3
3 1 5 2 4
5 3 4 1 2
4 5 2 3 1
1 2 3 4 5

Figure 1: A partitioned 5 × 5 grid (top left), its representation as a block
design (top right), and all inequivalent gerechte designs (bottom)
1.2 Resolvable block designs
A block design is a structure consisting of a set of points and a set of blocks,
with an incidence relation between points and blocks. Often we identify a
block with the set of points incident to it, so that a block design is represented
by a family of sets; however, the same set may occur more than once.
A block design is said to be resolvable if the set of blocks can be partitioned
into subsets $C_1, \dots, C_r$ (called replicates) such that each point is
incident with just one block in any replicate $C_i$. The partition of the block
set is called a resolution of the design.
The search for gerechte designs for a given partitioned grid can be transformed
into a search for resolutions of a block design, as we now show.
The basic data for a gerechte design is an $n \times n$ grid partitioned into $n$
regions $S_1, \dots, S_n$, each containing $n$ cells. We can represent this
structure by a block design as follows:
• the points are $3n$ objects $r_1, \dots, r_n, c_1, \dots, c_n, s_1, \dots, s_n$;
• for each of the $n^2$ cells of the grid, there is a block $\{r_i, c_j, s_k\}$ if
  the cell lies in the $i$th row, the $j$th column, and the $k$th region.
Proposition 1.1 Gerechte designs on a given partitioned grid correspond,
up to permuting the symbols 1, . . . , n, in one-to-one fashion with resolutions
of the above block design.
Proof Given a gerechte design, let $C_i$ be the set of cells containing the
symbol $i$. By definition, the blocks corresponding to these cells contain each
row, column, or region object exactly once, and so form a partition of the
point set. Any cell contains a unique symbol $i$, so every block occurs in just
one class $C_i$. Thus we have a resolution. The converse is proved in the same
way.
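The correspondence in Proposition 1.1 can be sketched in a few lines of Python. This is only an illustration under assumptions of our own: the 4 × 4 grid with four 2 × 2 regions, the two squares, and all function names are hypothetical and do not come from the text. The sketch builds one block per cell and tests whether the symbol classes of a square form a resolution.

```python
def block_design(regions):
    """Map each cell (i, j) to its block {r_i, c_j, s_k}; regions[i][j] = k."""
    n = len(regions)
    return {
        (i, j): frozenset({("r", i), ("c", j), ("s", regions[i][j])})
        for i in range(n)
        for j in range(n)
    }


def classes_from_square(square):
    """Class C_s = list of cells holding symbol s (symbols are 1..n)."""
    n = len(square)
    return [
        [(i, j) for i in range(n) for j in range(n) if square[i][j] == s]
        for s in range(1, n + 1)
    ]


def is_resolution(blocks, classes, n):
    """True if the classes partition the blocks and each class partitions the 3n points."""
    points = {(kind, t) for kind in "rcs" for t in range(n)}
    used = [cell for cls in classes for cell in cls]
    if sorted(used) != sorted(blocks):
        return False  # some block is missing or repeated
    for cls in classes:
        covered = [p for cell in cls for p in blocks[cell]]
        if len(covered) != len(points) or set(covered) != points:
            return False  # some point is missed or covered twice
    return True


# Hypothetical example data (not from the paper): 4 x 4 grid, 2 x 2 regions 0..3.
REGIONS = [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
GERECHTE = [[1, 2, 3, 4], [3, 4, 1, 2], [2, 1, 4, 3], [4, 3, 2, 1]]
CYCLIC = [[1, 2, 3, 4], [2, 3, 4, 1], [3, 4, 1, 2], [4, 1, 2, 3]]  # Latin, not gerechte

blocks = block_design(REGIONS)
print(is_resolution(blocks, classes_from_square(GERECHTE), 4))
print(is_resolution(blocks, classes_from_square(CYCLIC), 4))
```

The cyclic square is a Latin square, so its classes cover every row and column object once, but they fail on the region objects, which is exactly where the gerechte condition enters.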
The GAP [10] share package DESIGN [20] can find all resolutions of a
block design, up to isomorphisms of the block design. In our case, isomorphisms
of the block design come from symmetries of the partitioned grid, so
we can use this package to compute all gerechte designs up to permutation
of symbols and symmetries of the partitioned grid.
For example, the partition of the 5 × 5 grid discussed in the preceding
section is represented as a block design with 15 points and 25 blocks of size 3,
also shown in Figure 1. The automorphism group of the design is the cyclic
group of order 4 consisting of the rotations of the grid through multiples
of π/2. The DESIGN program quickly finds that, up to automorphisms,
there are just six resolutions of this design, corresponding to six inequivalent
gerechte designs; these are shown in the figure.
The same method shows that, for a 6 × 6 square divided into 3 × 2
rectangles, there are 49 solutions up to symmetries of the corresponding
block design and permutations of the symbols. (The number of symmetries
of the block design in this case is 3456; the group consists of all row and
column permutations preserving the appropriate partitions.)
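The figure 3456 can be recovered by a short count. The sketch below assumes, as our own reading of "preserving the appropriate partitions", that rows may be permuted within each large row and the large rows permuted among themselves, and likewise for columns; the function name is hypothetical.

```python
from math import factorial


def grid_symmetry_count(m1, m2):
    """Row/column permutations preserving a grid of m1 x m2 rectangles (n = m1 * m2)."""
    # m2 large rows of m1 rows each: permute inside each, then permute the large rows.
    row_perms = factorial(m1) ** m2 * factorial(m2)
    # m1 large columns of m2 columns each, treated the same way.
    col_perms = factorial(m2) ** m1 * factorial(m1)
    return row_perms * col_perms


print(grid_symmetry_count(3, 2))
```

For the 6 × 6 square this is (3!² · 2!) · (2!³ · 3!) = 72 · 48.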
1.3 Orthogonal and multiple gerechte designs
We saw earlier the definition of orthogonality of Latin squares. A set of
mutually orthogonal Latin squares is a set of Latin squares in which every
pair is orthogonal. It is known that the size of a set of mutually orthogonal
Latin squares of order $n$ is at most $n - 1$.
Similar definitions and results apply to gerechte designs. We say that two
gerechte designs with the same partitioned grid are orthogonal to each other
if they are orthogonal as Latin squares, and a set of mutually orthogonal
gerechte designs is a set of such designs in which each pair is orthogonal.
Proposition 1.2 Given a partition of the $n \times n$ grid into regions
$S_1, \dots, S_n$ each of size $n$, the size of a set of mutually orthogonal
gerechte designs for this partition is at most $n - d$, where $d$ is the maximum
size of the intersection of a region $S_i$ and a line (row or column) $L_j \ne S_i$.
Proof Choose $S_i$ and $L_j$ attaining the maximum $d$, and take a cell
$c \in L_j \setminus S_i$. By permuting the symbols in each square,
we may assume that all the squares have entry 1 in the cell $c$. Now, in each
square, the symbol 1 occurs exactly once in the region $S_i$, and this occurrence
is not in the line $L_j$ (which already contains the symbol 1 in cell $c$); and all
these occurrences must be in different cells, since for each
pair of squares, the pair (1, 1) of entries already occurs in cell $c$. So there are
at most $|S_i \setminus L_j| = n - d$ squares in the set.
This bound is not always attained. Consider the 5 × 5 gerechte designs
given earlier. The maximum intersection size of a line and a region is clearly 3,
so the bound for the number of mutually orthogonal designs is 2. But by
inspection, each design has the property that the entries in cells (2, 3) and
(3, 5) are equal. (The reader is invited to discover the simple argument to
show that this must be so, independent of the classification of the designs.)
Hence no pair of orthogonal designs is possible. Similarly, for the 6 × 6 square
divided into 3 × 2 rectangles, there cannot exist two orthogonal gerechte
designs, since it is well known that there cannot exist two orthogonal Latin
squares of order 6.
Proposition 1.2 gives an upper bound of 6 for the number of mutually
orthogonal Sudoku solutions. In Section 3.4, we will see that this bound is
attained.
The concept of a gerechte design can be generalized. Suppose that we are
given a set of $r$ partitions of the cells of an $n \times n$ grid into $n$ regions
each of size $n$. A multiple gerechte design for this set of partitions is a Latin
square which is simultaneously a gerechte design for all of the partitions.
For example, given a set of (mutually orthogonal) Latin squares, the
symbols in each square define a partition of the $n \times n$ array into regions. A
Latin square is a multiple gerechte design for all of these partitions if and
only if it is orthogonal to all the given Latin squares.
The problem of finding a multiple gerechte design can be cast into the
form of finding a resolution of a block design, in the same way as for a single
gerechte design. The block design has $(r + 2)n$ points, and each cell of the
grid is represented by a block containing the objects indexing its row, its
column, and the region of each partition which contains it. Again, we can use
the DESIGN program to classify such designs up to symmetries of the grid.
For example, Federer [9], in a section which he attributed to G. M. Cox,
called an $m_1 m_2 \times m_1 m_2$ Latin square magic if it is a gerechte design for
the regions forming the obvious partition into $m_1 \times m_2$ rectangles, and
super magic if it is simultaneously a gerechte design for the partition into
$m_2 \times m_1$ rectangles, where $m_1 \ne m_2$. He considered the problem of finding
multiple gerechte designs (which he called super magic Latin squares) for the
6 × 6 square partitioned into 3 × 2 rectangles and 2 × 3 rectangles. The DESIGN
package finds that there are 26 such designs up to symmetries.
We can also define a set of mutually orthogonal multiple gerechte designs
in the obvious way, and prove a similar bound for the size of such a set.
We will see examples of these things in Section 3.4.
2 Statistical considerations
In this section, we consider the use of gerechte designs in statistical design
theory, and some additional properties which are important there.
2.1 Agricultural experiments in Latin squares
The statistician R. A. Fisher suggested the use of Latin squares in agricultural
experiments. If $n$ treatments (crop varieties, quantities of fertilizer, etc.)
are to be compared on plots forming an $n \times n$ grid in a field, then arranging
the treatments as the symbols of a Latin square ensures that any systematic
change in fertility, drainage, etc. across the field affects all treatments equally.
Figure 2 shows two experiments laid out as Latin squares.
Figure 2: Two experiments using Latin squares. Left: a 5 × 5 forestry
experiment in Beddgelert in Wales, to compare varieties of tree; designed by
Fisher, laid out in 1929, and photographed in about 1945. Right: a current
6 × 6 experiment to compare methods of controlling aphids; conducted by
Lesley Smart at Rothamsted Research, photographed in 2004.
If a Latin square experiment is to be conducted on land that has recently
been used for another Latin square experiment, it is sensible to regard the
previous treatments as relevant and so to use a Latin square orthogonal to
the previous one. As explained above, this is technically a sort of gerechte
design, but no agricultural statistician would call it that.
The purpose of a gerechte design in agricultural experimentation is to
ensure that all treatments are fairly exposed to any different conditions in
the field. In fact, "gerecht(e)" is the German for "fair" in the sense of "just".
Rows and columns are good for capturing differences such as distance from
a wood but not for marking out stony patches or other features that tend to
clump in compact areas. Thus, in the statistical and agronomic literature,
the regions of a gerechte design are always taken to be spatially compact
areas.
2.2 Randomization
Before a design is used for an experiment, it is randomized. This means that
a permutation of the cells is chosen at random from among all those that
preserve the three partitions: into rows, into columns, and into regions. It is
by no means common for the cells to be actually square plots on the ground;
when they are, it is also possible to transpose rows and columns, if the regions
are unchanged by this action. This random permutation is applied to the
chosen gerechte design before it is laid out in the field.
One important statistical principle is lack of bias. This means that every
plot in the field should be equally likely to be matched, by the randomization,
to each abstract cell in the gerechte design, so that any individual plot with
strange characteristics is equally likely to affect any of the treatments. To
achieve this lack of bias, the set of permutations used for randomizing must
form a transitive group, in the sense that there is such a permutation carrying
any nominated cell to any other. The allowable permutations of the 5 × 5
grid in Figure 1 do not have this property, but those for magic Latin squares
do. There are others, but no complete classification as far as we know.
For the remainder of this section we assume that $n = m_1 m_2$ and the
regions are $m_1 \times m_2$ rectangles. Then the rows, columns and regions define
some other areas: a large row is the smallest area that is simultaneously a
union of regions and a union of rows; a minirow is the non-empty intersection
of a row and region; large columns and minicolumns are defined similarly.
A pair of distinct cells in such a grid is in one of eight relationships,
illustrated in Figure 3 for the 6 × 6 grid with 3 × 2 regions. For $i = 1,
\dots, 8$, the marked reference cell is in relationship $i$ with the cell labelled $i$.
Thus a pair of distinct cells is in
• relationship 1 if they are in the same minirow;
• relationship 2 if they are in the same minicolumn;
• relationship 3 if they are in the same region but in different rows and columns;
• relationship 4 if they are in the same row but in different regions;
• relationship 5 if they are in the same column but in different regions;
• relationship 6 if they are in the same large row but in different rows and regions;
• relationship 7 if they are in the same large column but in different columns and regions;
• relationship 8 if they are in different large rows and large columns.
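The eight relationships can be written as a small classifier. The following is a sketch under our own conventions, not code from the paper: cells are 0-based (row, column) pairs, a region is taken to have m1 rows and m2 columns, and the function name is hypothetical.

```python
from collections import Counter


def relationship(cell1, cell2, m1=3, m2=2):
    """Return the relationship (1..8) between two distinct cells of the m1*m2 grid."""
    if cell1 == cell2:
        raise ValueError("cells must be distinct")
    (r1, c1), (r2, c2) = cell1, cell2
    same_row, same_col = r1 == r2, c1 == c2
    same_large_row = r1 // m1 == r2 // m1  # a large row is a band of m1 rows
    same_large_col = c1 // m2 == c2 // m2  # a large column is a stack of m2 columns
    if same_large_row and same_large_col:  # same region
        return 1 if same_row else 2 if same_col else 3
    if same_row:
        return 4
    if same_col:
        return 5
    if same_large_row:
        return 6
    if same_large_col:
        return 7
    return 8


# How many cells stand in each relationship to the corner cell of the 6 x 6 grid?
counts = Counter(
    relationship((0, 0), (r, c)) for r in range(6) for c in range(6) if (r, c) != (0, 0)
)
print(sorted(counts.items()))
```

Because the order of the tests mirrors the definitions, every pair of distinct cells receives exactly one label, and the eight counts for any fixed cell add up to 35.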
The group of permutations used for randomization has the property that
a pair of distinct cells can be mapped to another pair by one of the permutations
if and only if they are in the same relationship. If, in addition, we can
transpose the rows and columns (not possible in Figure 3) then relationships
1 and 2 are merged, as are 4 and 5, and 6 and 7.
Figure 3: Eight relationships between pairs of distinct cells in the 6 × 6 grid
The simple-minded analysis of data from an experiment in a gerechte
design assumes that the response (such as yield of grain, or the logarithm of
the number of aphids) on each cell is the sum of four unknown parameters,
one each for the row, column and region containing the cell, and one for
the treatment (symbol) applied to it. In addition, there is random variation
from cell to cell. This is explained in [2]. The statistician is interested in
the treatment parameters, not only in their values but also in whether their
differences are greater than can be explained by cell-to-cell variation.
However, one school of statistical thought holds that if the innate differences
between rows, between columns and between regions are relevant,
then so potentially are those between minirows, minicolumns, large rows and
large columns. Yates took this view in his 1939 paper [24], whose discussion
of a 4 × 4 Latin square "with balanced corners" may be the first published
reference to gerechte designs. Thus the eight relationships all have to be
considered when the gerechte design is chosen.
2.3 Orthogonality and the design key
Two further important statistical properties often conflict with each other.
One is ease of analysis, which means not ease of performing arithmetic but
ease of explaining the results to a non-statistician. So-called orthogonal designs,
like the one in Figure 4, have this property.
A gerechte design with rectangular regions is orthogonal if the arrangement
of symbols in each region can be obtained from the arrangement in
any other region just by permuting minirows and minicolumns.
5 2 6 3 4 1
6 3 4 1 5 2
4 1 5 2 6 3
2 5 3 6 1 4
3 6 1 4 2 5
1 4 2 5 3 6
Figure 4: An orthogonal design for the 6 × 6 grid with 3 × 2 regions
In Figure 4, each minicolumn contains either treatments 1, 2 and 3 or treatments
4, 5 and 6. When the statistician investigates whether there is any real difference
between the average effects of these two sets of treatments, (s)he compares
their difference (estimated from the data) with the underlying variability
between minicolumns within regions and columns (also estimated from the
data). Similarly, differences between the average effects of the three sets of
two treatments {1, 4}, {2, 5} and {3, 6} are compared with the variability of
minirows within regions and rows. Treatment differences orthogonal to all of
those, such as the difference between the average of {1, 5} and the average
of {2, 4}, are compared with the residual variability between the cells after
allowing for the variability of all the partitions.
An orthogonal design for an $m_1 m_2 \times m_1 m_2$ square with $m_1 \times m_2$
regions may be constructed using the design key method [21, 22], as recommended
in [3]. The large rows are labelled by $A_1$, which takes values $1, \dots, m_2$.
Within each large row, the rows are labelled by $A_2$, which takes values
$1, \dots, m_1$. Similarly, the large columns are labelled by $B_1$, taking values
$1, \dots, m_1$, and the columns within each large column by $B_2$, taking values
$1, \dots, m_2$. Then put $N_1 = A_1 + B_2$ modulo $m_2$ and $N_2 = A_2 + B_1$
modulo $m_1$. The ordered pairs of values of $N_1$ and $N_2$ give the $m_1 m_2$
symbols. In Figure 4, the rows are numbered from top to bottom, the columns
from left to right, and the correspondence between the ordered pairs and the
symbols is as follows.

            N_2
  N_1    1   2   3
   1     1   2   3
   2     4   5   6

(When explaining this construction to non-mathematicians we usually take
the integers modulo $m$ to be $1, \dots, m$ rather than $0, \dots, m - 1$.)
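The construction just described can be checked mechanically. The sketch below follows the text's conventions (residues taken in 1, …, m; rows numbered from the top, columns from the left); the function names and the 0-based loop indices are our own choices. It rebuilds the square for $m_1 = 3$, $m_2 = 2$ and compares it with Figure 4.

```python
def residue(x, m):
    """Representative of x modulo m in the range 1..m."""
    return (x - 1) % m + 1


def design_key(m1, m2):
    """Orthogonal gerechte design for m1 x m2 regions via N1 = A1 + B2, N2 = A2 + B1."""
    n = m1 * m2
    square = []
    for r in range(n):
        a1, a2 = r // m1 + 1, r % m1 + 1  # large row, row within large row
        row = []
        for c in range(n):
            b1, b2 = c // m2 + 1, c % m2 + 1  # large column, column within it
            n1 = residue(a1 + b2, m2)
            n2 = residue(a2 + b1, m1)
            row.append((n1 - 1) * m1 + n2)  # symbol table: rows N1, columns N2
        square.append(row)
    return square


FIGURE_4 = [
    [5, 2, 6, 3, 4, 1],
    [6, 3, 4, 1, 5, 2],
    [4, 1, 5, 2, 6, 3],
    [2, 5, 3, 6, 1, 4],
    [3, 6, 1, 4, 2, 5],
    [1, 4, 2, 5, 3, 6],
]

print(design_key(3, 2) == FIGURE_4)
```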
Variations on this construction are possible, especially when $m_1$ and $m_2$
are both powers of the same prime $p$. For example, if $m_1 = 4$ and $m_2 = 2$
then we can work modulo 2, using $A_1$ to label the large rows, $A_2$ and $A_3$ to
label the rows within large rows, $B_1$ and $B_2$ to label the large columns, and
$B_3$ to label the columns within large columns. Numbers can be allocated by
putting $N_1 = A_1 + B_3$, $N_2 = A_2 + B_1$ and $N_3 = A_3 + B_2$. All that is required
is that no non-zero linear combination (modulo 2) of $N_1$, $N_2$ and $N_3$ contains
only $A_1$, $B_1$ and $B_2$, or a subset thereof.
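A quick check of this modulo-2 variation, as a sketch only: the encoding of rows, columns and symbols as integers is our own assumption (row = 4·A1 + 2·A2 + A3, column = 4·B1 + 2·B2 + B3, symbol = 4·N1 + 2·N2 + N3 + 1), chosen so that regions are 4 × 2 rectangles.

```python
from itertools import product


def design_key_mod2():
    """8 x 8 square from N1 = A1 + B3, N2 = A2 + B1, N3 = A3 + B2 (mod 2)."""
    square = {}
    for a1, a2, a3, b1, b2, b3 in product((0, 1), repeat=6):
        row = 4 * a1 + 2 * a2 + a3  # a1 selects the large row
        col = 4 * b1 + 2 * b2 + b3  # (b1, b2) select the large column
        n1, n2, n3 = (a1 + b3) % 2, (a2 + b1) % 2, (a3 + b2) % 2
        square[row, col] = 4 * n1 + 2 * n2 + n3 + 1
    return square


def is_gerechte(square, n, region_of):
    """Each symbol 1..n occurs exactly once in every row, column and region."""
    keys = (lambda r, c: ("row", r), lambda r, c: ("col", c), region_of)
    for key in keys:
        groups = {}
        for (r, c), s in square.items():
            groups.setdefault(key(r, c), []).append(s)
        if any(sorted(g) != list(range(1, n + 1)) for g in groups.values()):
            return False
    return True


square = design_key_mod2()
print(is_gerechte(square, 8, lambda r, c: ("region", r // 4, c // 2)))
```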
2.4 Efficiency and concurrence
The other important statistical property is efficiency, which means that the
estimators of the differences between treatments should have small variance.
At one extreme, we might decide that the innate differences between minicolumns
are so great that the design in Figure 4 provides no information at
all about the difference between the average of treatments 1, 2, 3 and the
average of treatments 4, 5, 6; and similarly for minirows. In this case, it
can be shown (see [1, Chapter 7]) that the relevant variances can be deduced
from the matrix
$$M = m_1 m_2 I - \frac{1}{m_2}\Lambda_R - \frac{1}{m_1}\Lambda_C + J.$$
Here $I$ is the $n \times n$ identity matrix and $J$ is the $n \times n$ all-1 matrix.
The concurrence of symbols $i$ and $j$ in minirows is the number of minirows
containing both $i$ and $j$ (which is $n$ when $i = j$): the matrix $\Lambda_R$
contains these concurrences. The matrix $\Lambda_C$ is defined similarly, using
concurrences in minicolumns. It is known that if the off-diagonal entries in the
matrix $M$ are all equal then the average variance is as small as possible for the
given values of $m_1$ and $m_2$, so the usual heuristic is to choose a design in
which the off-diagonal entries differ as little as possible. If $m_1 = m_2$, this
means that the sums of the concurrences are as equal as possible. We explore
this property for Sudoku solutions in Section 4.1.
A compromise between these two statistical properties is general balance
[13, 16, 17], which requires that the concurrence matrices $\Lambda_R$ and
$\Lambda_C$ commute with each other. A special case of general balance is
adjusted orthogonality [8, 14], for which $\Lambda_R \Lambda_C = n^2 J$. It can be
shown that a gerechte design with rectangular regions is orthogonal in the sense
of Section 2.3 if it has adjusted orthogonality and $\Lambda_R^2 = n m_2 \Lambda_R$
and $\Lambda_C^2 = n m_1 \Lambda_C$. This property
is also explored further in Section 4.1.
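As an illustration, the concurrence matrices of the design in Figure 4 can be computed directly and the three conditions tested. This is a sketch in plain Python; the helper names are ours, minirows and minicolumns are taken inside regions with $m_1 = 3$ rows and $m_2 = 2$ columns, and the minirow and minicolumn concurrence matrices are stored as `lam_R` and `lam_C`.

```python
FIGURE_4 = [
    [5, 2, 6, 3, 4, 1],
    [6, 3, 4, 1, 5, 2],
    [4, 1, 5, 2, 6, 3],
    [2, 5, 3, 6, 1, 4],
    [3, 6, 1, 4, 2, 5],
    [1, 4, 2, 5, 3, 6],
]


def concurrence(square, groups):
    """lam[i][j] = number of groups containing both symbol i+1 and symbol j+1."""
    n = len(square)
    lam = [[0] * n for _ in range(n)]
    for group in groups:
        symbols = [square[r][c] for r, c in group]
        for i in symbols:
            for j in symbols:
                lam[i - 1][j - 1] += 1
    return lam


def minirows(m1, m2):
    n = m1 * m2
    return [[(r, c0 + k) for k in range(m2)] for r in range(n) for c0 in range(0, n, m2)]


def minicolumns(m1, m2):
    n = m1 * m2
    return [[(r0 + k, c) for k in range(m1)] for c in range(n) for r0 in range(0, n, m1)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def scaled(factor, a):
    return [[factor * x for x in row] for row in a]


m1, m2 = 3, 2
n = m1 * m2
lam_R = concurrence(FIGURE_4, minirows(m1, m2))
lam_C = concurrence(FIGURE_4, minicolumns(m1, m2))
J = [[1] * n for _ in range(n)]

general_balance = matmul(lam_R, lam_C) == matmul(lam_C, lam_R)
adjusted_orthogonality = matmul(lam_R, lam_C) == scaled(n * n, J)
orthogonality_condition = (
    matmul(lam_R, lam_R) == scaled(n * m2, lam_R)
    and matmul(lam_C, lam_C) == scaled(n * m1, lam_C)
)
print(general_balance, adjusted_orthogonality, orthogonality_condition)
```

In this design every minirow is one of {1, 4}, {2, 5}, {3, 6} and every minicolumn is {1, 2, 3} or {4, 5, 6}, so each concurrence is 0 or 6, which is why all three conditions hold.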
3 Some special Sudoku solutions
Our main aim in this section is to consider some very special Sudoku solutions
which we call symmetric. We state our main results first. The proofs will
take us on a tour through parts of finite geometry and coding theory; we
have included brief introductions to these topics, for readers unfamiliar with
them who want to follow us through the proofs of the theorems. Later in
the section, we show how to construct other Sudoku solutions having some
of the statistical properties introduced in the preceding section.
We have seen that a Sudoku solution is a gerechte design for the 9 × 9
array partitioned into nine 3 × 3 subsquares. To define symmetric Sudoku
solutions, we need a few more types of region.
As defined in the last section, a minirow consists of three cells forming
a row of a subsquare, and a minicolumn consists of three cells forming a
column of a subsquare. We define a broken row to be the union of three
minirows occurring in the same position in three subsquares in a column,
and a broken column to be the union of three minicolumns occurring in the
same position in three subsquares in a row. A location is a set of nine cells
occurring in a fixed position in all of the subsquares (for example, the centre
cells of each subsquare).
Now a symmetric Sudoku solution is an arrangement of the symbols
1, . . . , 9 in a 9 × 9 grid in such a way that each symbol occurs once in each
row, column, subsquare, broken row, broken column, and location. In other
words, it is a multiple gerechte design for the partitions into subsquares,
broken rows, broken columns, and locations. Figure 5 shows a symmetric
Sudoku solution. The square shown has the further property that each of
the 3 × 3 subsquares is semi-magic, that is, its row and column sums (but
not necessarily its diagonal sums) are 15 (John Bray [6]).
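The six conditions are easy to test by machine. The sketch below uses 0-based indices and names of our own choosing; it encodes each family of regions by a key function on cells and checks the square of Figure 5, including the semi-magic property.

```python
FIGURE_5 = [
    [8, 1, 6, 2, 4, 9, 5, 7, 3],
    [3, 5, 7, 6, 8, 1, 9, 2, 4],
    [4, 9, 2, 7, 3, 5, 1, 6, 8],
    [7, 3, 5, 1, 6, 8, 4, 9, 2],
    [2, 4, 9, 5, 7, 3, 8, 1, 6],
    [6, 8, 1, 9, 2, 4, 3, 5, 7],
    [9, 2, 4, 3, 5, 7, 6, 8, 1],
    [1, 6, 8, 4, 9, 2, 7, 3, 5],
    [5, 7, 3, 8, 1, 6, 2, 4, 9],
]

# Each family of nine regions is described by the key that cells of a region share.
FAMILIES = {
    "row": lambda r, c: r,
    "column": lambda r, c: c,
    "subsquare": lambda r, c: (r // 3, c // 3),
    "broken row": lambda r, c: (r % 3, c // 3),     # same minirow position, same stack
    "broken column": lambda r, c: (r // 3, c % 3),  # same minicolumn position, same band
    "location": lambda r, c: (r % 3, c % 3),
}


def is_symmetric_sudoku(grid):
    """Every symbol 1..9 occurs once in each region of each of the six families."""
    for key in FAMILIES.values():
        groups = {}
        for r in range(9):
            for c in range(9):
                groups.setdefault(key(r, c), set()).add(grid[r][c])
        if any(symbols != set(range(1, 10)) for symbols in groups.values()):
            return False
    return True


def is_semi_magic(grid):
    """Rows and columns of every 3 x 3 subsquare sum to 15."""
    for top in (0, 3, 6):
        for left in (0, 3, 6):
            rows = [sum(grid[top + i][left + j] for j in range(3)) for i in range(3)]
            cols = [sum(grid[top + i][left + j] for i in range(3)) for j in range(3)]
            if rows + cols != [15] * 6:
                return False
    return True


print(is_symmetric_sudoku(FIGURE_5), is_semi_magic(FIGURE_5))
```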
As in the first section, two Sudoku solutions are equivalent if one can be
obtained from the other by a combination of row and column permutations
(and possibly transposition) which preserve all the relevant partitions, and
re-numbering of the symbols.
8 1 6 2 4 9 5 7 3
3 5 7 6 8 1 9 2 4
4 9 2 7 3 5 1 6 8
7 3 5 1 6 8 4 9 2
2 4 9 5 7 3 8 1 6
6 8 1 9 2 4 3 5 7
9 2 4 3 5 7 6 8 1
1 6 8 4 9 2 7 3 5
5 7 3 8 1 6 2 4 9
Figure 5: A semi-magic symmetric Sudoku solution
The main result of this section asserts that, up to equivalence, there are
precisely two symmetric Sudoku solutions. This theorem can be proved by
a computation of the type described in the first section. However, we give a
more conceptual proof, exploiting the links with the other topics of the title.
We also consider mutually orthogonal sets; we show that the maximum
number of mutually orthogonal Sudoku solutions is 6, and the maximum
number of mutually orthogonal symmetric Sudoku solutions is 4. Moreover,
there is a set of six mutually orthogonal Sudoku solutions of which four are
symmetric. These are exhibited in Figure 10.
Throughout this section we will use GF(3) to denote the finite field with
three elements (the integers modulo 3).
3.1 Preliminaries
In this subsection we describe briefly the notions of affine and projective
geometry and coding theory. Readers familiar with this material may skip
this subsection.
Affine geometry An affine space is just a vector space with the distinguished
role of the origin removed. Its subspaces are the cosets of the vector
subspaces, that is, sets of the form $U + v$, where $U$ is a vector subspace and
$v$ a fixed vector, the coset representative. This coset is also called the translate
of $U$ by $v$. Two affine subspaces which are cosets of the same vector
subspace are said to be parallel, and the set of all cosets of a given vector
subspace forms a parallel class. A transversal for a parallel class of affine
subspaces is a set of coset representatives for the vector subspace.
We use the terms "point", "line" and "plane" for affine subspaces of
dimension 0, 1, 2 respectively. We denote the $n$-dimensional affine space over
a field $F$ by AG($n$, $F$); if $|F| = q$, we write AG($n$, $q$).
We will use the fact that a subset of AG($n$, $F$) is an affine subspace if
(and only if) it contains the unique affine line through each pair of its points.
In affine space over the field GF(3), a line has just three points, and the third
point on the line through $p_1$ and $p_2$ is the midpoint
$(p_1 + p_2)/2 = -(p_1 + p_2)$.
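The midpoint formula is easy to confirm exhaustively. In this sketch (function names are ours), points of AG(4, 3) are tuples over {0, 1, 2}, and the line through $p$ and $q$ is computed as $\{p + t(q - p) : t \in \mathrm{GF}(3)\}$.

```python
from itertools import product


def third_point(p, q):
    """-(p + q), computed coordinatewise modulo 3."""
    return tuple((-(a + b)) % 3 for a, b in zip(p, q))


def affine_line(p, q):
    """The three points p + t(q - p), t = 0, 1, 2, of the line through p and q."""
    return {tuple((a + t * (b - a)) % 3 for a, b in zip(p, q)) for t in range(3)}


points = list(product(range(3), repeat=4))
ok = all(
    affine_line(p, q) == {p, q, third_point(p, q)}
    for p in points
    for q in points
    if p != q
)
print(ok)
```

The check works because 2 = −1 in GF(3), so the point with $t = 2$ is $2q - p = -(p + q)$.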
Projective geometry Much of the argument in the proof of the main
theorem of this section will be an examination of collections of subspaces
of a vector space. This can also be cast into geometric language, that of
projective geometry.
The $n$-dimensional projective space over a field $F$ is the geometry whose
points, lines, planes, etc. are the 1-, 2-, 3-dimensional (and so on) subspaces
of an $(n + 1)$-dimensional space $V$ (which we can take to be $F^{n+1}$). A point
$P$ lies on a line $L$ if $P \subseteq L$ (as subspaces of $F^{n+1}$).
For example, a point of the projective space PG($n$, $F$) is a 1-dimensional
subspace of the vector space $F^{n+1}$, and so it corresponds to a parallel class
of lines in the affine space AG($n + 1$, $F$). The points of the projective space
can therefore be thought of as "points at infinity" of the affine space.
We will mostly be concerned with 3-dimensional projective geometry; we
refer to [7, 12]. We will use the following notions:
• Two lines are said to be skew if they are not coplanar. Skew lines
  are necessarily disjoint. Conversely, since any two lines in a projective
  plane intersect, disjoint lines are skew. So the terms "disjoint" and
  "skew" for lines in projective space are synonyms. We will normally
  refer to disjoint lines. Note that disjoint lines in PG($n$, $F$) arise from
  2-dimensional subspaces in $F^{n+1}$ meeting only in the origin.
• A hyperbolic quadric is a set of points satisfying an equation like
  $x_1 x_2 + x_3 x_4 = 0$. Any such quadric contains two "rulings", each of which
  is a set of pairwise disjoint lines covering all the points of the quadric
  (Figure 6). Such a set of lines is called a regulus, and the other set is
  the opposite regulus. Any three pairwise disjoint lines of the projective
  space lie in a unique regulus. The lines of the opposite regulus are all
  the lines meeting the given three lines (their common transversals).
• A spread is a family of pairwise disjoint lines covering all the points
  of the projective space. A spread is regular if it contains the regulus
  through any three of its lines. (Any three lines of a spread are pairwise
  disjoint, and so lie in a unique regulus.) It can be shown that, if the
  field $F$ is finite, then there exists a regular spread. In particular, this
  holds when $F$ = GF(3).
$$\sum_{p,q \in H} d(p, q) \ge 9 \cdot 8 \cdot 3 = 216,$$
where $d$ denotes Hamming distance. On the other hand, if we choose any
coordinate position (say the first), and suppose that the number of vectors of
$H$ having entries 0, 1, 2 there are respectively $n_0$, $n_1$, $n_2$, then the
contribution of this coordinate to the above sum is
$$n_0(9 - n_0) + n_1(9 - n_1) + n_2(9 - n_2) = 81 - (n_0^2 + n_1^2 + n_2^2) \le 81 - 27 = 54,$$
and so the entire sum is at most $4 \cdot 54 = 216$. So equality must hold, from
which we conclude that any pair of distinct vectors have distance 3 (that is,
agree in one position).
Now take $p, q \in H$. Suppose, without loss, that they agree in the first
coordinate; say $p = (a, b_1, c_1, d_1)$ and $q = (a, b_2, c_2, d_2)$. Since
$b_1 \ne b_2$, the remaining element of GF(3) is $-(b_1 + b_2)$. There is a unique
element $r$ of $H$ having first coordinate $a$ and second coordinate $-(b_1 + b_2)$;
since it must disagree with each of $p$ and $q$ in the third and fourth coordinates,
it must be $r = (a, -(b_1 + b_2), -(c_1 + c_2), -(d_1 + d_2)) = -(p + q)$. This is
the third point on the affine line through $p$ and $q$. So $H$ is indeed an affine
subspace, as required (cf. the characterisation of affine subspaces in Section 3.1).
Any translate of a perfect code is a perfect code; so any perfect code is
a coset of a vector subspace which is itself a perfect code. We call such a
subspace allowable. Our next task is to find the allowable subspaces.
Lemma 3.3 The vectors $p = (a_1, a_2, a_3, a_4)$ and $q = (b_1, b_2, b_3, b_4)$ are two
linearly independent vectors in an allowable subspace $X$ of $V$ if and only if
the four ratios $a_i/b_i$, for $i = 1, 2, 3, 4$, are distinct, where $1/0 = \infty$ is one
ratio that must appear, and the indeterminate form $0/0$ does not appear.
Proof The vectors $p$, $q$ and any two of the standard basis vectors, with just
one non-zero coordinate (equal to 1), must be linearly independent. So the
determinant of the corresponding matrix, which is $a_i b_j - a_j b_i$, is not zero.
Then the result follows.
Given Lemma 3.3, we see that when a basis for an allowable subspace is
put into row-reduced echelon form, it takes one of the following eight
possibilities.
$$\Bigl\{(1011 \text{ or } 1022) \text{ and } (0112 \text{ or } 0121)\Bigr\}
\quad\text{or}\quad
\Bigl\{(1012 \text{ or } 1021) \text{ and } (0111 \text{ or } 0122)\Bigr\} \tag{2}$$
In each bracket, the first choice gives the first row of the basis and the second
choice gives the second row.
These are the only allowable subspaces. So any perfect code in $V$ is a coset
of one of those eight vector subspaces.
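The count of eight can be confirmed by brute force. The sketch below rests on one assumption that should be stated plainly: it takes "allowable" to mean a 2-dimensional vector subspace of $V = \mathrm{GF}(3)^4$ (nine vectors) in which every non-zero vector has Hamming weight at least 3, which for a linear code is the same as pairwise distance at least 3. All names are ours.

```python
from itertools import product

V = list(product(range(3), repeat=4))


def span(p, q):
    """All GF(3)-linear combinations of p and q, as a frozenset of tuples."""
    return frozenset(
        tuple((s * a + t * b) % 3 for a, b in zip(p, q))
        for s in range(3)
        for t in range(3)
    )


def weight(v):
    return sum(x != 0 for x in v)


def is_allowable(subspace):
    """Nine vectors, and every non-zero vector has weight at least 3."""
    return len(subspace) == 9 and all(weight(v) >= 3 for v in subspace if any(v))


allowable = {w for w in (span(p, q) for p in V for q in V) if is_allowable(w)}

# The eight row-reduced bases listed in (2): one first row and one second row per bracket.
LISTED = {
    span(p, q)
    for first_rows, second_rows in [
        (((1, 0, 1, 1), (1, 0, 2, 2)), ((0, 1, 1, 2), (0, 1, 2, 1))),
        (((1, 0, 1, 2), (1, 0, 2, 1)), ((0, 1, 1, 1), (0, 1, 2, 2))),
    ]
    for p in first_rows
    for q in second_rows
}

print(len(allowable), allowable == LISTED)
```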
Our conclusion for symmetric Sudoku solutions so far can be summarised
as follows:
• Any symmetric Sudoku solution is linear;
• In a symmetric Sudoku solution, the positions of each symbol form a
  coset of one of the eight subspaces given above.
Next we come to the question of how such subsets can partition V . One
simple way is just to take all cosets of one of the above 2-dimensional vector
subspaces; this gives the solutions we described above as Type A. Another
choice is the following. Extend an allowable subspace X to an appropriate
3-dimensional vector subspace Y of V . The three cosets of Y partition V ,
and we can look for another allowable subspace X
are the
only two allowable subspaces parallel to any set in the partition for a Sudoku
solution. Furthermore, in each of the three cosets of Y , cosets of only one of
X or X
containing $L_1$ and $L_2$ and having $L_5$ and $L_6$ in the opposite regulus; the
other two lines of this regulus can be added to the four lines arising from the
Hamming codes to produce the required set of six lines. They have equations
$x_1 + x_4 = x_2 + x_3 = 0$ and $x_1 - x_4 = x_2 - x_3 = 0$. See Figure 9. The
resulting six mutually orthogonal Sudoku solutions are shown in Figure 10; the
last four are symmetric.
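To see how a line of this kind yields a Sudoku solution, here is a small sketch under assumptions that are ours rather than taken from the surviving text: a cell has coordinates $(x_1, x_2, x_3, x_4)$ over GF(3), with $(x_1, x_2)$ giving the row (large row, then row within it) and $(x_3, x_4)$ giving the column, and the symbol in a cell records which coset of the 2-dimensional subspace contains it. The function names and the numbering of the symbols are hypothetical.

```python
from itertools import product


def sudoku_from_forms(f, g):
    """Label each cell by the values of two linear forms, i.e. by a coset of {f = g = 0}."""
    grid = [[0] * 9 for _ in range(9)]
    for x in product(range(3), repeat=4):
        x1, x2, x3, x4 = x
        grid[3 * x1 + x2][3 * x3 + x4] = 3 * (f(x) % 3) + (g(x) % 3) + 1
    return grid


def is_sudoku(grid):
    full = set(range(1, 10))
    rows = all(set(row) == full for row in grid)
    cols = all({grid[r][c] for r in range(9)} == full for c in range(9))
    boxes = all(
        {grid[top + i][left + j] for i in range(3) for j in range(3)} == full
        for top in (0, 3, 6)
        for left in (0, 3, 6)
    )
    return rows and cols and boxes


def are_orthogonal(g1, g2):
    """Every ordered pair of symbols occurs in exactly one cell."""
    return len({(g1[r][c], g2[r][c]) for r in range(9) for c in range(9)}) == 81


# The two lines x1 + x4 = x2 + x3 = 0 and x1 - x4 = x2 - x3 = 0.
first = sudoku_from_forms(lambda x: x[0] + x[3], lambda x: x[1] + x[2])
second = sudoku_from_forms(lambda x: x[0] - x[3], lambda x: x[1] - x[2])

print(is_sudoku(first), is_sudoku(second), are_orthogonal(first, second))
```

Under these conventions the two squares are orthogonal because the four forms together determine $(x_1, x_2, x_3, x_4)$, that is, the two subspaces meet only in the origin.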
It can be shown that the four lines $H_1, \dots, H_4$ disjoint from the two reguli
themselves form a regulus; they and the lines of the opposite regulus are the
eight Hamming codes.
Figure 9 (lines labelled $L_1, \dots, L_6$ and $H_1, \dots, H_4$)
$\Lambda_R \Lambda_C = 81J$, where
$J$ is the all-one matrix. (Here the $(i, j)$ entry of $\Lambda_R$ counts the number of
minirows in which $i$ and $j$ both occur, and similarly for $\Lambda_C$.) The special
Sudoku solution of Figure 5 has this property, but it is not unique. (In this
solution, all entries of each concurrence matrix are 0 or 9.) We found that
there are, up to symmetry, 194 Sudoku solutions for which the minirows and
minicolumns have adjusted orthogonality in this sense, of which 104 have the
property that both $\Lambda_R$ and $\Lambda_C$ have entries different from 0 and 9.
One of these solutions is shown in Figure 12.
1 2 3 4 5 6 7 8 9
7 8 9 1 3 2 6 5 4
4 5 6 7 8 9 1 3 2
3 1 2 6 4 5 9 7 8
9 7 8 2 1 3 4 6 5
6 4 5 9 7 8 2 1 3
8 9 1 5 6 4 3 2 7
2 3 7 8 9 1 5 4 6
5 6 4 3 2 7 8 9 1
Figure 12: Minirows and minicolumns form designs with adjusted orthogonality,
but the overall design is not orthogonal
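The adjusted orthogonality claimed for Figure 12 can be checked by computing the two concurrence matrices and their product. This is a sketch with names of our own (`lam_R` and `lam_C` hold the minirow and minicolumn concurrences); it also records which values occur as concurrences.

```python
FIGURE_12 = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9],
    [7, 8, 9, 1, 3, 2, 6, 5, 4],
    [4, 5, 6, 7, 8, 9, 1, 3, 2],
    [3, 1, 2, 6, 4, 5, 9, 7, 8],
    [9, 7, 8, 2, 1, 3, 4, 6, 5],
    [6, 4, 5, 9, 7, 8, 2, 1, 3],
    [8, 9, 1, 5, 6, 4, 3, 2, 7],
    [2, 3, 7, 8, 9, 1, 5, 4, 6],
    [5, 6, 4, 3, 2, 7, 8, 9, 1],
]

MINIROWS = [[(r, c0 + k) for k in range(3)] for r in range(9) for c0 in (0, 3, 6)]
MINICOLUMNS = [[(r0 + k, c) for k in range(3)] for c in range(9) for r0 in (0, 3, 6)]


def concurrence(grid, groups):
    """lam[i][j] = number of groups in which symbols i+1 and j+1 both occur."""
    lam = [[0] * 9 for _ in range(9)]
    for group in groups:
        symbols = [grid[r][c] for r, c in group]
        for i in symbols:
            for j in symbols:
                lam[i - 1][j - 1] += 1
    return lam


lam_R = concurrence(FIGURE_12, MINIROWS)
lam_C = concurrence(FIGURE_12, MINICOLUMNS)
product_RC = [
    [sum(lam_R[i][k] * lam_C[k][j] for k in range(9)) for j in range(9)]
    for i in range(9)
]

adjusted_orthogonality = all(x == 81 for row in product_RC for x in row)
values_R = {x for row in lam_R for x in row}
values_C = {x for row in lam_C for x in row}
print(adjusted_orthogonality, sorted(values_R), sorted(values_C))
```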
A word about the computations reported in this section. The strategy
is to place the symbols 1, . . . , 9 in the grid successively to satisfy the
constraints. The positions of a single symbol in the grid subject to the Sudoku
constraints that it occurs once in each row, column and subsquare can be
described by a permutation $\pi$ of the set $\{1, \dots, 9\}$, where the set of positions
is $\{(i, \pi(i)) : 1 \le i \le 9\}$. There are $6^6$ of these Sudoku permutations.
We say that two Sudoku permutations are compatible if they place their
symbols in disjoint cells satisfying the appropriate conditions (for example,
for concurrences 4 and 5, that there are either 4 or 5 occurrences of the two
symbols in the same minirow or minicolumn). Then we form a graph as
follows: the vertex set is the set of all Sudoku permutations, and we join two
vertices if they are compatible. We now search randomly for a clique of size 9
in the compatibility graph: this is a set of nine mutually compatible Sudoku
permutations, defining a Sudoku solution with the required properties.
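The number of Sudoku permutations can be confirmed by enumerating all 9! permutations. In this sketch (0-based indices, names ours) a permutation already places one symbol in each row and column, so only the subsquare condition has to be tested.

```python
from itertools import permutations


def is_sudoku_permutation(pi):
    """pi[i] is the column used in row i; the nine cells must hit nine different subsquares."""
    return len({(i // 3, pi[i] // 3) for i in range(9)}) == 9


count = sum(is_sudoku_permutation(pi) for pi in permutations(range(9)))
print(count, count == 6 ** 6)
```

The compatibility graph described in the text would have these permutations as its vertices; the sketch stops at the vertex count and does not attempt the clique search.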
Adjusted orthogonality of the two designs is not captured by any obvious
compatibility condition on the Sudoku permutations, and we proceeded
differently. Since each of the two concurrence matrices has diagonal entries 9, we
see that adjusted orthogonality implies that two symbols cannot occur both
in the same minirow and in the same minicolumn. Using this as the
compatibility condition, we built the compatibility graph, and found all cliques
of size 9, using the GAP package GRAPE [19]. Remarkably, it turned out
that all of them actually give designs with adjusted orthogonality; we know
no simple reason for this fact, since our compatibility condition appears not
strong enough to guarantee this.
4.2 Other finite field constructions
The construction in Section 3.4 can be generalized.
Proposition 4.1 Let $q$ be a prime power, and $a$ and $b$ positive integers. Let
$n = q^{a+b}$. Partition the $n \times n$ square into $q^a \times q^b$ rectangles. Then we
can find
$$q^{a+b} - 1 - \frac{(q^a - 1)(q^b - 1)}{q - 1}$$
mutually orthogonal gerechte designs for this partitioned grid.
Remark If $a < b$, our upper bound for the number of mutually orthogonal
gerechte designs for this grid is $q^b(q^a - 1)$. If $a = 1$, this bound is
equal to the number in the proposition, so our bound is attained. If $a > 1$,
however, the bound is not met by the construction. For example, if $q = 2$,
$a = 2$ and $b = 3$, the bound is 24 but the construction achieves 10. If $a$
and $b$ are not coprime, we can improve the construction by replacing
$q, a, b$ by $q^d, a/d, b/d$, where $d = \gcd(a, b)$.
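The arithmetic in the Remark is easy to check numerically. The following sketch evaluates the count $q^{a+b} - 1 - (q^a - 1)(q^b - 1)/(q - 1)$ from Proposition 4.1 and the bound $q^b(q^a - 1)$; the function names are introduced here for illustration only.

```python
from math import gcd

def constructed(q, a, b):
    """Number of designs given by the construction of Proposition 4.1."""
    # (q^a - 1) is divisible by (q - 1), so integer division is exact.
    return q ** (a + b) - 1 - (q ** a - 1) * (q ** b - 1) // (q - 1)

def upper_bound(q, a, b):
    """Upper bound quoted in the Remark for a < b."""
    return q ** b * (q ** a - 1)

def improved(q, a, b):
    """Construction after replacing q, a, b by q^d, a/d, b/d, d = gcd(a, b)."""
    d = gcd(a, b)
    return constructed(q ** d, a // d, b // d)

print(constructed(2, 2, 3), upper_bound(2, 2, 3))   # 10 24
print(constructed(3, 1, 4) == upper_bound(3, 1, 4)) # True: bound attained for a = 1
print(constructed(2, 2, 4), improved(2, 2, 4))      # 18 48
```

In the last line, $a = 2$ and $b = 4$ have $d = 2$, and the improved construction over GF(4) reaches the bound $2^4(2^2 - 1) = 48$ because $a/d = 1$.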
Proof Represent the cells by points of the affine space
$\mathrm{AG}(2(a+b), q)$ with coordinates
$x_1, \ldots, x_{a+b}, y_1, \ldots, y_{a+b}$. The rows are cosets of the
subspace $x_1 = \cdots = x_{a+b} = 0$, the columns are cosets of the subspace
$y_1 = \cdots = y_{a+b} = 0$, and the rectangles are cosets of
$x_1 = \cdots = x_a = y_1 = \cdots = y_b = 0$.
As before, we work in the projective space $\mathrm{PG}(2(a+b) - 1, q)$. The
first two subspaces are disjoint, and are part of a spread of $q^{a+b} + 1$
subspaces of the same dimension. The third subspace meets the first in
$(q^b - 1)/(q - 1)$ points and the second in $(q^a - 1)/(q - 1)$ points, and
has $(q^a - 1)(q^b - 1)/(q - 1)$ further points. In the worst case, this
subspace meets $(q^a - 1)(q^b - 1)/(q - 1)$ further spaces of the spread, each
in one point. This leaves $q^{a+b} - 1 - (q^a - 1)(q^b - 1)/(q - 1)$ spread
spaces disjoint from it, as required.
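The point count used in the proof can be written out in full. The third subspace has vector space dimension $a + b$, so it contains $(q^{a+b} - 1)/(q - 1)$ projective points; removing the points it shares with the first two subspaces leaves the "further points", and removing the spread spaces that may be hit from the $q^{a+b} + 1$ members of the spread gives the count in Proposition 4.1:

```latex
\begin{align*}
\frac{q^{a+b} - 1}{q - 1} - \frac{q^a - 1}{q - 1} - \frac{q^b - 1}{q - 1}
  &= \frac{q^{a+b} - q^a - q^b + 1}{q - 1}
   = \frac{(q^a - 1)(q^b - 1)}{q - 1}, \\
(q^{a+b} + 1) - 2 - \frac{(q^a - 1)(q^b - 1)}{q - 1}
  &= q^{a+b} - 1 - \frac{(q^a - 1)(q^b - 1)}{q - 1}.
\end{align*}
```

For $q = 2$, $a = 2$, $b = 3$ the first line gives $31 - 3 - 7 = 21$ further points and the second gives $33 - 2 - 21 = 10$ spread spaces.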
Our construction of mutually orthogonal symmetric Sudoku solutions also
generalizes:
Proposition 4.2 Let $q$ be a prime power, and consider the $q^2 \times q^2$
grid, partitioned into $q \times q$ subsquares, broken rows, broken columns,
and locations as in the preceding section. Then there exist $(q - 1)^2$
mutually orthogonal multiple gerechte designs for these partitions; this is
best possible.
Proof We follow the same method as before, working over $\mathrm{GF}(q)$. The
lines of $\mathrm{PG}(3, q)$ defining rows, columns, subsquares, broken rows,
broken columns, and locations lie in the union of two reguli with two common
lines, which form part of a regular spread. The remaining $(q - 1)^2$ lines of
the spread give the required designs. The upper bound is proved as before.
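The count of remaining lines follows from two standard facts: a spread of $\mathrm{PG}(3, q)$ has $q^2 + 1$ lines, and a regulus has $q + 1$ lines. Two reguli sharing two lines therefore account for $2q$ lines of the spread:

```latex
\[
(q^2 + 1) - \bigl(2(q + 1) - 2\bigr) = q^2 - 2q + 1 = (q - 1)^2 .
\]
```

For $q = 3$ this gives $10 - 6 = 4$ lines.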
Acknowledgements
The left-hand photograph in Figure 2 appears in [5], reproduced by permis-
sion of the Forestry Commission. It can also be found on the web at [15]. We
thank Lesley Smart for permission to use the right-hand photograph, which
was taken by Neil Mason of the Plant and Invertebrate Ecology Division of
Rothamsted Research.
References
[1] R. A. Bailey, Association Schemes: Designed Experiments, Algebra and
Combinatorics, Cambridge Studies in Advanced Mathematics 84, Cam-
bridge University Press, Cambridge, 2004.
[2] R. A. Bailey, J. Kunert and R. J. Martin, Some comments on gerechte
designs. I. Analysis for uncorrelated errors, J. Agronomy & Crop Science
165 (1990), 121–130.
[3] R. A. Bailey, J. Kunert and R. J. Martin, Some comments on gerechte
designs. II. Randomization analysis, and other methods that allow for
inter-plot dependence, J. Agronomy & Crop Science 166 (1991), 101–111.
[4] W. U. Behrens, Feldversuchsanordnungen mit verbessertem Ausgleich
der Bodenunterschiede, Zeitschrift für Landwirtschaftliches Versuchs-
und Untersuchungswesen 2 (1956), 176–193.
[5] J. F. Box, R. A. Fisher: The Life of a Scientist. John Wiley & Sons,
New York, 1978.
[6] J. N. Bray, personal communication, February 2006.
[7] P. J. Cameron, Projective and Polar Spaces, QMW Maths Notes 13,
Queen Mary and Westfield College, London, 1991; available from
http://www.maths.qmul.ac.uk/~pjc/pps/
[8] J. A. Eccleston and K. G. Russell, Connectedness and orthogonality in
multi-factor designs, Biometrika 62 (1975), 341–345.
[9] W. T. Federer, Experimental Design: Theory and Applications,
Macmillan, New York, 1955.
[10] The GAP Group, GAP – Groups, Algorithms, and Programming,
Version 4.6; Aachen, St Andrews, 2005, http://www.gap-system.org/
[11] R. Hill, A First Course in Coding Theory, Clarendon Press, Oxford,
1986.
[12] J. W. P. Hirschfeld, Finite Projective Spaces of Three Dimensions, Ox-
ford University Press, Oxford, 1985.
[13] A. M. Houtman and T. P. Speed, Balance in designed experiments with
orthogonal block structure, Ann. Statist. 11 (1983), 1069–1085.
[14] S. M. Lewis and A. M. Dean, On general balance in row–column designs,
Biometrika 78 (1991), 595–600.
[15] Materials for the History of Statistics.
URL: http://www.york.ac.uk/depts/maths/histstat/
[16] J. A. Nelder, The analysis of randomized experiments with orthogonal
block structure. II. Treatment structure and the general analysis of
variance, Proc. Roy. Soc. London A 283 (1965), 163–178.
[17] J. A. Nelder, The combination of information in generally balanced
designs, J. Roy. Statist. Soc. B 30 (1968), 303–311.
[18] Ed Russell and Frazer Jarvis, There are 5472730538 essentially different
Sudoku grids,
http://www.afjarvis.staff.shef.ac.uk/sudoku/sudgroup.html
[19] L. H. Soicher, GRAPE: a system for computing with graphs and groups.
In L. Finkelstein and W. M. Kantor, editors, Groups and Computation,
volume 11 of DIMACS Series in Discrete Mathematics and Theoretical
Computer Science, pages 287–291. American Mathematical Society,
1993. GRAPE homepage:
http://www.maths.qmul.ac.uk/~leonard/grape/
[20] Leonard H. Soicher, The DESIGN package for GAP,
http://designtheory.org/software/gap_design/
[21] H. D. Patterson, Generation of factorial designs, J. Roy. Statist. Soc. B
38 (1976), 175–179.
[22] H. D. Patterson and R. A. Bailey, Design keys for factorial experiments,
Applied Statistics 27 (1978), 335–343.
[23] Emil Vaughan, personal communication, November 2005.
[24] F. Yates, The comparative advantages of systematic and randomized
arrangements in the design of agricultural and biological experiments,
Biometrika 30 (1939), 440–466.