The Mathematics of Decisions, Elections, and Games
Karl-Dieter Crisman
Michael A. Jones
Editors
2010 Mathematics Subject Classification. Primary 91-06, 91A05, 91A12, 91A20, 91B06,
91B08, 91B12, 91B14, 91B32, 91F10.
Copying and reprinting. Material in this book may be reproduced by any means for edu-
cational and scientific purposes without fee or permission with the exception of reproduction by
services that collect fees for delivery of documents and provided that the customary acknowledg-
ment of the source is given. This consent does not extend to other kinds of copying for general
distribution, for advertising or promotional purposes, or for resale. Requests for permission for
commercial use of material should be addressed to the Acquisitions Department, American Math-
ematical Society, 201 Charles Street, Providence, Rhode Island 02904-2294, USA. Requests can
also be made by e-mail to [email protected].
Excluded from these provisions is material in articles for which the author holds copyright. In
such cases, requests for permission to use or reprint should be addressed directly to the author(s).
(Copyright ownership is indicated in the notice in the lower right-hand corner of the first page of
each article.)
© 2014 by the American Mathematical Society. All rights reserved.
The American Mathematical Society retains all rights
except those granted to the United States Government.
Copyright of individual articles may revert to the public domain 28 years
after publication. Contact the AMS for copyright status of individual articles.
Printed in the United States of America.
∞ The paper used in this book is acid-free and falls within the guidelines
established to ensure permanence and durability.
Visit the AMS home page at http://www.ams.org/
Contents
Preface
Redistricting and District Compactness
Carl Corcoran and Karen Saxe
Fair Division and Redistricting
Zeph Landau and Francis Edward Su
When Does Approval Voting Make the “Right Choices”?
Steven J. Brams and D. Marc Kilgour
How Indeterminate is Sequential Majority Voting? A Judgement Aggregation Perspective
Klaus Nehring and Marcus Pivato
Weighted Voting, Threshold Functions, and Zonotopes
Catherine Stenson
The Borda Count, the Kemeny Rule, and the Permutahedron
Karl-Dieter Crisman
Double-Interval Societies
Maria Margaret Klawe, Kathryn L. Nyman, Jacob N. Scott, and Francis Edward Su
Voting for Committees in Agreeable Societies
Matt Davis, Michael E. Orrison, and Francis Edward Su
Selecting Diverse Committees with Candidates from Multiple Categories
Thomas C. Ratliff
Expanding the Robinson-Goforth System for 2 × 2 Games
Brian Hopkins
Cooperation in n-Player Repeated Games
Daniel T. Jessie and Donald G. Saari
The Dynamics of Consistent Bankruptcy Rules
Michael A. Jones and Jennifer M. Wilson
Preface
The majority of papers in this collection accompanied talks from the AMS Spe-
cial Sessions on the Mathematics of Decisions, Elections, and Games from the 2012
and 2013 Joint Mathematics Meetings of the AMS and MAA. These sessions were
organized by Karl-Dieter Crisman (Gordon College), Michael A. Jones (Mathemat-
ical Reviews), and Michael Orrison (Harvey Mudd College). The one exception
is the paper based on a talk from the AMS Special Session on The Redistricting
Problem; this session was organized by Daniel Goroff (Harvey Mudd College) and
Daniel Ullman (George Washington University) for the 2009 Joint Mathematics
Meetings. The full programs for the 2012 and 2013 AMS Special Sessions on the
Mathematics of Decisions, Elections, and Games can be found online by searching
jointmathematicsmeetings.org.
Decision theory, voting theory, and game theory are three intertwined areas
of mathematics that involve making optimal decisions in different contexts.
Although each of these areas has its own mathematical results, much of the recent
research in these areas involves developing and applying new perspectives from their
intersection with other branches of mathematics, such as algebra, representation
theory, combinatorics, convex geometry, dynamical systems, etc. The papers in
this volume highlight and exploit the mathematical structure of decisions, elections,
and games to model and to analyze problems from the social sciences.
In what follows we give a short overview of the papers in this collection. To
those new to the area, we wish to emphasize that many different types of mathe-
matics can be profitably used in this interdisciplinary context.
Both Redistricting and District Compactness by Corcoran and Saxe and Fair
Division and Redistricting by Landau and Su focus on the redistricting problem:
carving up a state into congressional districts. The former discusses different mea-
sures used to evaluate proposed districting plans. These measures take into account
the geometric shape of both the state and the districts as well as the spread of the
population throughout the district. The latter paper offers a different perspective
on redistricting by viewing it as a fair division or cake-cutting problem in which
the state is viewed as a cake to be divided into districts that are allocated to one
of two political parties.
The next two papers fall under the topic of judgment aggregation, which may
be viewed as being at the intersection of social choice theory and decision theory.
While social choice theory is concerned with aggregating preferences of voters to
arrive at a societal ranking, judgment aggregation is a burgeoning area focusing
on aggregating individuals’ judgments on interconnected propositions to arrive at
a collective judgment.
In When Does Approval Voting Make the “Right Choices”?, Brams and Kilgour
examine different contexts in which an individual voter approves of a proposal based
on the proposal’s probability of being right (or good or just) and the voter’s proba-
bility of making a correct judgment of whether it is right (or wrong). For multiple
proposals in which more than one proposal can be approved and under probabilis-
tic settings, they determine conditions for when the most approved proposals have
the greatest probability of being right, relating their results to the Condorcet Jury
Theorem.
Nehring and Pivato take a more axiomatic approach to judgment aggregation
in How Indeterminate is Sequential Majority Voting? A Judgement Aggregation
Perspective. As in other choice problems, the order in which decisions are made
can matter. This paper continues a research program exploring in exactly what
ways the order in which judgments are made can affect the actual final judgments
on various propositions, in this case showing that a very large number of natural
examples exhibit various forms of this indeterminacy in sequential majority votes.
Several papers continue a long tradition in the field of using geometry and
combinatorial objects to analyze a wide variety of voting questions. Stenson, in
Weighted Voting, Threshold Functions, and Zonotopes, looks at a generalization of
simple games and weighted voting systems. She explains how these games, and
their winning/losing coalitions, correspond to vertices and other geometric aspects
of particular polytopes and their duals. The Borda Count, the Kemeny Rule, and
the Permutahedron by Crisman takes the geometric-combinatorial object called the
permutahedron, and examines its symmetries with the goal of understanding social
preference functions, which return not just winners but (sets of) rankings. The
main result uses representation theory to characterize the most symmetric social
preference functions as a one-dimensional continuum connecting the two rules in
the title.
In Double-Interval Societies, Klawe, Nyman, Scott, and Su consider the effect
of geometric constraints in a model in which voters approve of positions in a linear
spectrum. When each voter’s approval set is represented by two disjoint closed
intervals and when every pair of voters agree on some position, the authors deter-
mine a lower bound for the approval ratio. They also construct societies with low
approval ratios by relating the double-interval models to the arrangement of n sym-
bols in which each symbol appears twice, thereby relating the continuous geometry
to discrete sequences.
By contrast, Voting for Committees in Agreeable Societies by Davis, Orrison,
and Su focuses more on combinatorial aspects of agreeability, supposing that voters
each approve of a certain size subset of candidates with a goal of selecting a com-
mittee of a possibly different size. Using a graph called the Johnson graph, they
prove a number of results regarding what proportion of voters will be satisfied with
the results under different suppositions about the distribution of voters’ intent.
Ratliff’s paper Selecting Diverse Committees with Candidates from Multiple
Categories approaches committees from a different perspective. His focus is on
selecting committees where candidates may be in several categories, and where it is
desirable to have a committee with members from each of these categories. Using
a combination of explicit examples and point-counting arguments with Ehrhart
polynomials, he derives both probabilistic and exact results regarding the
admissible ballot types that yield these outcomes.
Finally, we have three papers examining game theory from different mathe-
matical perspectives. In Expanding the Robinson-Goforth System for 2 × 2 Games,
Hopkins considers the relationships between bimatrix games in which outcomes are
ordinally ranked, but there may be ties between outcomes. Robinson and Goforth
developed a group-theoretic classification for the no-tie case; Hopkins revisits this
work from a graph-theoretic perspective and shows that the edges of an associated
graph yield information for when there is a single tie. He expands the Robinson
and Goforth system using a collection of simplices that includes all 2 × 2 ordinal
rank games with ties.
Although the decomposition of Jessie and Saari in Cooperation in n-Player
Repeated Games may be viewed as a type of classification, it is more of a tool to
analyze behavior in games. For two-strategy, n-player games, they decompose the
games’ n-matrix payoffs into behavioral and strategic components. For a specific
game in a repeated setting, the strategic component captures the information nec-
essary to determine Nash equilibrium behavior, while the behavioral component
describes a type of cooperative behavior. This approach lends itself to the descrip-
tion of all games with any specific type of behavior, and how this behavior changes
as the components change.
In a bankruptcy problem, a set of individuals have claims that when summed
exceed the amount of an estate. A bankruptcy rule determines how to share the
estate among the claimants. Motivated by the relationship between the two-player
Contested Garment rule and the n-player Talmud rule, Jones and Wilson (The
Dynamics of Consistent Bankruptcy Rules) define a dynamic averaging process in
which a k-player rule is used on all subsets of size k, the outcomes are averaged,
and the process is repeated. They show that when the k-player rule satisfies
a well-studied notion of consistency, then the dynamic process converges to the
n-player solution for any initial allocation.
In conclusion, as editors we want to thank all the authors for their interesting
and strong papers. We enjoyed reading them, and hope that readers will experience
that same pleasure in exploring the intersection between mathematics and the social
sciences present in this volume.
Contemporary Mathematics
Volume 624, 2014
http://dx.doi.org/10.1090/conm/624/12476
© 2014 American Mathematical Society
2 CARL CORCORAN AND KAREN SAXE
As one might expect, redistricting can profoundly impact the political land-
scape of the state. With current methods and computational power, mapmakers
can determine with a great degree of accuracy the political leaning of a given dis-
trict, based on demographic data and election returns. This makes the redistricting
process ripe for politically motivated manipulation. In most states, the power of
district drawing is given to state legislators [11]. Each legislator naturally has an
interest in how the districts are drawn, either to protect himself or herself in
state-level elections, or to curry favor with his or her party. When this power is abused
to give an advantage to one party or group, gerrymandering has occurred. Districts
that re-elect incumbents who are protected by gerrymandering could well be
districts that lean to partisan extremes (and therefore the individuals nominated
could be left-leaning Democrats or right-leaning Republicans); this phenomenon
could be contributing to the recently observed and increasing partisanship in the House
of Representatives [3].
Scholars disagree on how precisely to define gerrymandering. Richard Morrill
defines it as “the intentional manipulation of territory toward some desired
electoral outcome” [13], while to Michael McDonald and Richard Engstrom it is
“the drawing of electoral districts so as to assign unequal voting weights to
cognizable political groups” [12]. These definitions underscore two critical facets of
gerrymandering: the identification of communities of interest, and the (intended)
consequences of the gerrymander. Taken together, they provide a sufficient definition
of gerrymandering.
Some scholars defend the gerrymander as an essential part of the redistricting
machinery, but most identify its disenfranchising effects as an evil. Robert Stern
identifies expressive harms caused by gerrymandering: it diminishes effective rep-
resentation by decreasing the number of competitive districts, and it minimizes
the need for coalition-building which would allow small, single-issue groups to be
heard [21]. We agree with Stern that gerrymanders do indeed threaten the effective
representation that we strive for in the United States. How, then, are we
to determine when a district has been gerrymandered? Furthermore, how do we
prevent gerrymanders from being made in the first place?
The Supreme Court again has opined on the subject. In the 1982 case Karcher
v. Daggett, it set forth criteria which were crystalized in 1993 with Shaw v. Reno.
REDISTRICTING AND DISTRICT COMPACTNESS 3
These standards are known as “traditional districting principles,” and include, but
are not limited to, equal population, compactness, contiguity, integrity of commu-
nities of interest, integrity of political subdivisions, and the integrity of natural
boundaries. Extremely non-compact districts are generally taken to be gerryman-
ders. For example, on its face, Maryland’s 3rd district is definitely not compact;
incidentally, it is also not contiguous (see Figure 1).¹
This paper examines the notion of compactness, which in essence is a measure
of how spread out the parts of a district are. Many ingredients can go into a com-
pactness measure, including perimeter, population dispersion, and area dispersion.
Section 2 includes a very brief survey of traditional compactness measures together
with an exposition of a newer class of measures. Section 3 assembles criteria for a
quality compactness measure, offers a discussion of the quality of various measures
according to these criteria, and reflects on the utility of compactness measures in
practice.
2. Measuring Compactness
2.1. Traditional Compactness Measures. In the past few decades, the use
of computers in redistricting has exploded, to the point where every state now uses
some form of software to make their maps [12]. Built into many of these software
packages are several measures of compactness. This accessibility, combined with
ever-increasing computing power, has made compactness a routine evaluation when
creating a districting plan. However, there remains the question of which measures
to use, and which measures are the ‘best.’ There is and can be no consensus on
this point, as every measure has both strengths and limitations.
In previous surveys of compactness measures (for example, see Young 1988,
Niemi et al. 1991), the authors come to this same conclusion. Instead of prescribing
a specific measure to use, they explore the benefits and detriments of several. Niemi
et al. classify these measures into three types: perimeter measures, area dispersion
measures, and population dispersion measures. We now present a simple example
of a measure from each of these three categories, in that order. We choose these
three because they are conceptually simple and because all three were used in our
state of Minnesota during the 2011-12 round of redistricting.
2.1.1. A Perimeter Measure: Polsby-Popper. Polsby and Popper [16] intro-
duced their measure of compactness as a variation on Schwartzberg’s measure [20].
They write, “[t]he absolute measure of a shape’s efficiency is determined by dividing
the area of the shape by the area of a circle with perimeter of equal length.” The
Polsby-Popper score is thus given by
\[
PP(D) = \frac{4\pi A(D)}{p^2},
\]
where D is a district, p is the length of the perimeter of the district, and A(D) is
the area of the district. Though it is not how it was originally defined, it turns out
that Schwartzberg’s score is the square root of the reciprocal of this.² We choose
to focus on Polsby-Popper since its scores are always in the interval [0, 1].
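As a rough illustration of the formula above, the following Python sketch computes the Polsby-Popper score of a district approximated by a simple polygon. The function names and the two test shapes are illustrative assumptions and do not come from the paper; the area is obtained with the standard shoelace formula, and the Schwartzberg score is derived from the relationship stated in the text.

```python
import math

def polygon_area_perimeter(vertices):
    """Return (area, perimeter) of a simple polygon given as ordered (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1            # shoelace term
        perimeter += math.hypot(x2 - x1, y2 - y1)  # edge length
    return abs(twice_area) / 2.0, perimeter

def polsby_popper(vertices):
    """PP(D) = 4 * pi * A(D) / p^2."""
    area, perimeter = polygon_area_perimeter(vertices)
    return 4.0 * math.pi * area / perimeter ** 2

def schwartzberg(vertices):
    """Square root of the reciprocal of the Polsby-Popper score."""
    return 1.0 / math.sqrt(polsby_popper(vertices))

# Hypothetical shapes with the same area (1 square unit).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
strip = [(0, 0), (10, 0), (10, 0.1), (0, 0.1)]

print(polsby_popper(square), polsby_popper(strip), schwartzberg(square))
```

The unit square scores π/4 (about 0.785), while the thin strip of equal area scores far lower, which is the behavior a perimeter measure is meant to capture.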
¹ http://www.planning.maryland.gov/PDF/Redistricting/2010maps/Cong/Dist_3
² It is defined as the ratio of the perimeter of a district to the perimeter of a circle with equal
area.
2.1.2. An Area Dispersion Measure: Reock. The Reock test [17] amounts to
finding the smallest circle containing the district and taking the ratio of the
district’s area to the area of the circle. We let $D \subset \mathbb{R}^2$ be a district, and again
let $A(D)$ be the area of the district. Then, let $S_C$ be the set of all circles in $\mathbb{R}^2$
that completely contain the district $D$. Define the circle $C_{\min}$ to be such that
$A(C_{\min}) = \inf\{A(C) \mid C \in S_C\}$, where $A(C)$ is taken to be the area of the region
enclosed by the circle $C$. The value of the Reock score of a district $D$ is then given
by
\[
\mathrm{Reock}(D) = \frac{A(D)}{A(C_{\min})}.
\]
If the district is a union of islands, smallest circles are found for each island.[2]
Both the Polsby-Popper and Reock measures have values between 0 and 1,
with scores closer to 1 indicating more compact districts. This makes the scores
readily comparable across plans. However, they both presume the circle to be the
most compact shape. While this makes sense geometrically, it makes less sense in
the context of redistricting. No state can possibly be covered with a finite number
of non-overlapping circular districts. Moreover, districts that meander within a
vaguely circular shape will score highly with Reock, even if they are facially quite
non-compact (see Figure 2).
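The Reock score can be sketched in Python for a polygonal district. The smallest circle containing a polygon is the smallest circle containing its vertices, so this sketch finds that circle by brute force over all pairs and triples of vertices. That search is an assumption made for brevity (a production tool would more likely use a linear-time method such as Welzl's algorithm), the function names are hypothetical, and the island rule mentioned above is not handled.

```python
import math
from itertools import combinations

def circle_from_two(a, b):
    """Circle with segment ab as a diameter, returned as (cx, cy, r)."""
    return (a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0, math.dist(a, b) / 2.0

def circle_from_three(a, b, c):
    """Circumcircle of three points, or None if they are collinear."""
    d = 2.0 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    ux = (a2 * (b[1] - c[1]) + b2 * (c[1] - a[1]) + c2 * (a[1] - b[1])) / d
    uy = (a2 * (c[0] - b[0]) + b2 * (a[0] - c[0]) + c2 * (b[0] - a[0])) / d
    return ux, uy, math.dist((ux, uy), a)

def smallest_enclosing_circle(points, eps=1e-9):
    """Brute-force search: the optimal circle is determined by 2 or 3 of the points."""
    candidates = [circle_from_two(a, b) for a, b in combinations(points, 2)]
    for a, b, c in combinations(points, 3):
        circle = circle_from_three(a, b, c)
        if circle is not None:
            candidates.append(circle)
    best = None
    for cx, cy, r in candidates:
        if all(math.dist((cx, cy), p) <= r + eps for p in points):
            if best is None or r < best[2]:
                best = (cx, cy, r)
    return best

def polygon_area(vertices):
    """Shoelace formula for a simple polygon."""
    n = len(vertices)
    s = sum(vertices[i][0] * vertices[(i + 1) % n][1]
            - vertices[(i + 1) % n][0] * vertices[i][1] for i in range(n))
    return abs(s) / 2.0

def reock(vertices):
    """Reock(D) = A(D) / A(C_min)."""
    _, _, r = smallest_enclosing_circle(vertices)
    return polygon_area(vertices) / (math.pi * r ** 2)

# Hypothetical shapes: a 3 x 3 square and the same square with a 2 x 1 notch removed.
square = [(0, 0), (3, 0), (3, 3), (0, 3)]
c_shape = [(0, 0), (3, 0), (3, 1), (1, 1), (1, 2), (3, 2), (3, 3), (0, 3)]

print(reock(square), reock(c_shape))
```

Both shapes share the same enclosing circle, so the score drops only in proportion to the area removed by the notch, which illustrates why a district that meanders inside a roughly circular outline can still score well.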
2.1.3. A Population Dispersion Measure: Population Circle/Convex Hull. The
population circle measure is the ratio of the district population to the population
contained in the smallest circle containing the district (defined the same way as
with the Reock test). The convex hull measure simply replaces the smallest circle
encompassing the district with the convex hull of the district. The convex hull
measure is referred to as the “rubber band” measure and was first discussed in [9].³
These two measures are much better suited to detecting urban gerrymanders
than rural ones. For instance, in a region with low population density, except for small
pockets of dense habitation, a gerrymander that connects these pockets will score
highly by this measure, as few people live in the areas cut out by the gerrymander.
Moreover, the meandering circle district of Figure 2 will score a perfect 1 if the
population is completely contained in the shaded area, and will even score highly
if the population is merely concentrated in the shaded area.
³ Existing Mathematica code is written for several computations of this paper. For ex-
where $d_{vu}$ is the Euclidean distance function between the locations of voters $v$ and
$u$. Next, let $S_D^*$ be the districting plan that minimizes $\pi$. The relative proximity
index is then defined as
\[
RPI(S_D) = \frac{\pi(S_D)}{\pi(S_D^*)}.
\]
[Figure: points $X_i$ and $X_j$ with the distances $d_D(X_i, X_j)$ and $d_S(X_i, X_j)$ marked.]
Note that the value of this measure ranges from 1 to infinity, with higher scores
indicating more severe non-compactness. Fryer and Holden also provide a method,
based on Voronoi diagrams, for finding $S_D^*$.
The contiguity requirement in redistricting ensures that every district will be
path connected. However, when the nature of a given path is considered, there is
ample room to make judgements about the irregularity of a district. Papayanopou-
los and Fryer and Holden use the straight line distance between, respectively, ge-
ographic centers and individual voters, even if this straight line path does not lie
entirely within the district under consideration. This observation becomes a priority
in our further discussion of path-based measures.
To establish notation, consider a geographic legislative district $D$ in a state $S$,
made up of $k$ census blocks and with a total population of $P$. We take $b_i$ to be
the $i$th census block, with population $p_i$ and geometric center $x_i$, for each integer $i$
between 1 and $k$. The distance between $x_i$ and $x_j$ within the district $D$ is denoted
by $d_D(x_i, x_j)$.
Recall that Maryland’s panhandle involves a jagged coastline. Any district in
that panhandle must also be bounded by the coastline. This may exaggerate the
distance $d_D(x_i, x_j)$, when in reality the mapmaker is given no better option.⁴ To
account for this lack of choice, we also consider $d_S(x_i, x_j)$, the distance between $x_i$
and $x_j$ within the state. For example, consider a “C” shaped district composed of
square census blocks. Figure 3 gives an example of the distances that $d_D$ and $d_S$
describe, indicating that the former might be significantly greater than the latter.
Here, the light blue (lighter gray if reading the black and white version of this paper)
region represents the district D, and the gray (darker gray) region represents the
state S.
We are then concerned with the ratio of these two distances:
\[
\frac{d_S(x_i, x_j)}{d_D(x_i, x_j)}.
\]
Note that the value of this ratio is bounded above by 1. We use this ratio as a
weight for the sum of the $i$th and $j$th populations. Then, we sum these pairwise
⁴ District boundaries are regularly drawn as straight lines through bodies of water specif-
A question becomes apparent at this point: What is to be done when $b_i$ and $b_j$ are
the same census block? When this occurs, the distance ratio is not defined. We
define it to be 1, and explain this choice after discussion of Chambers and Miller’s
class of measures.
Looking again to $\sum_{b_i \in D} \sum_{b_j \in D} \frac{d_S(x_i, x_j)}{d_D(x_i, x_j)} (p_i + p_j)$, the maximum value this sum
can take is $2kP$, where $k$ is the number of census blocks in $D$. That is to say, if we
have a convex figure, $\frac{d_S(x_i, x_j)}{d_D(x_i, x_j)} = 1$ for all $i$ and $j$. It follows that
\[
\sum_{b_i \in D} \sum_{b_j \in D} \frac{d_S(x_i, x_j)}{d_D(x_i, x_j)} (p_i + p_j) \le \sum_{b_i \in D} \sum_{b_j \in D} (p_i + p_j) = \sum_{b_i \in D} (k p_i + P) = kP + kP = 2kP.
\]
Therefore, we can force the value of the expression between 0 and 1 by multiplying
by $(2kP)^{-1}$. This gives us our measure of compactness, which we will call $C$:
\[
C(D) = \frac{1}{2kP} \sum_{b_i \in D} \sum_{b_j \in D} \frac{d_S(x_i, x_j)}{d_D(x_i, x_j)} (p_i + p_j).
\]
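The definition of $C(D)$ translates directly into a double loop. The Python sketch below assumes the pairwise distances are already available as two matrices (computing within-district and within-state shortest paths is a separate problem not addressed here); the function name and the three-block toy data are hypothetical.

```python
def compactness_c(populations, d_state, d_district):
    """C(D) = (1 / 2kP) * sum_i sum_j [d_S(x_i, x_j) / d_D(x_i, x_j)] * (p_i + p_j)."""
    k = len(populations)
    total_population = sum(populations)
    weighted_sum = 0.0
    for i in range(k):
        for j in range(k):
            # The ratio is undefined for i == j; the text defines it to be 1.
            ratio = 1.0 if i == j else d_state[i][j] / d_district[i][j]
            weighted_sum += ratio * (populations[i] + populations[j])
    return weighted_sum / (2.0 * k * total_population)

# Hypothetical three-block district: blocks 0 and 2 are 1 unit apart across the
# state, but the shortest path that stays inside the district has length 2.
populations = [100, 100, 100]
d_state = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
d_district = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

score_bent = compactness_c(populations, d_state, d_district)
score_convex = compactness_c(populations, d_state, d_state)
print(score_bent, score_convex)
```

In this toy case the bent district scores 8/9, and a district whose internal distances equal the state distances attains the maximum of 1, consistent with the bound $2kP$ derived above.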
Chambers and Miller do not address the question of what is to be done when
$b_i$ and $b_j$ are the same census block and the ratio becomes indeterminate. But we
can infer what the necessary definition is. One obvious candidate is to define
the ratio to be zero. However, this would imply the Chambers and Miller measure
varies between 0 and some rational constant less than 1. Suppose we define
$\frac{d_S(x_i, x_i)}{d_D(x_i, x_i)}$ to be 0. We observe that for a convex district $D$, their discrete measure $s_1(D)$ is,
then,
\[
\left[ \sum_{b_i \in D} \sum_{b_j \in D} p_i p_j \right]^{-1} \left[ \sum_{b_i \in D} \sum_{b_j \in D} p_i p_j - \sum_{b_i \in D} p_i^2 \right] = 1 - \frac{1}{P^2} \sum_{b_i \in D} p_i^2.
\]
Furthermore, this value changes with the populations of the blocks $b_i$, as well as the
number of census blocks in the district. Thus, the upper bound on the measure
would vary for different districts. This would certainly be an undesirable feature
of a compactness metric, so we can conclude that defining the distance ratio to be
0 when $i = j$ is unreasonable. Instead, it is preferable to define it to be 1, as this
would place the value of Chambers and Miller’s discrete measure in the interval
$[0, 1]$, with the value 1 attained for a convex $D$. We choose to do the same for our
measure.
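A small numerical check makes the objection concrete. The Python sketch below evaluates the convex-district expression above under the rejected convention (ratio 0 when $i = j$) for two hypothetical population vectors with the same total; the function name and the data are illustrative assumptions.

```python
def convex_score_with_zero_diagonal(populations):
    """Score of a convex district when the i = j ratio is set to 0.

    All off-diagonal ratios equal 1 for a convex district, so the score is
    (sum_i sum_j p_i p_j - sum_i p_i^2) / (sum_i sum_j p_i p_j).
    """
    all_products = sum(p_i * p_j for p_i in populations for p_j in populations)
    off_diagonal = all_products - sum(p * p for p in populations)
    return off_diagonal / all_products

# Two hypothetical convex districts with four blocks and total population 400.
uniform = [100, 100, 100, 100]
skewed = [250, 50, 50, 50]

print(convex_score_with_zero_diagonal(uniform))  # 1 - 40000/160000
print(convex_score_with_zero_diagonal(skewed))   # 1 - 70000/160000
```

The two convex districts receive 0.75 and 0.5625 respectively, so the best attainable score would depend on how the population happens to be split among blocks, which is the undesirable feature the text describes.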
While Chambers and Miller define the class of measures for parameter q ≥ 0,
they only give further discussion of the case q = ∞. To emphasize, the special
case $q = \infty$ includes a non-zero contribution for $x_i$ and $x_j$ in the sum if and
only if the shortest path between the two points actually lies in the district. Also
note that if either block $b_i$ or $b_j$ has no one living in it, then the corresponding
term gives no contribution since it is the product of the populations that is used
(this is independent of the value of $q \ge 0$). One may think this is irrelevant
but, in fact, a significant proportion of census blocks have population zero. In
Minnesota, the authors’ home state, the 2010 census recorded 107,794 of the 259,778
blocks as having zero population.⁵ Despite the similarities between our measure and
Chambers and Miller’s q = 1 measure, the scores diverge for at least one important
type of example, as will be described in the next section.
Finally, Hodge, et al. take the notion of path-based compactness in a different
direction. They develop a convexity coefficient measure that instead of looking at
pairwise distances, looks at the “visible” area from a given point [8]. They define
the “visible region” $V_D(x, y)$ of a point $(x, y)$ in a district $D$ to be the set of all
points $(x', y')$ in $D$ such that the shortest straight line path between $(x, y)$ and
$(x', y')$ is wholly contained in $D$. Then, the fraction of the district’s area that this
visible region encompasses is computed for each $(x, y)$, and summed, yielding the
⁵ http://www.gis.leg.mn/html/download.html
double integral
\[
\chi(D) = \iint_D \frac{A(V_D(x, y))}{(A(D))^2} \, dx \, dy,
\]
where A(VD (x, y)) is the area of the visible region of (x, y), and A(D) is the area
of the district. For each of the 435 U.S. congressional districts, they estimate the
value of this convexity coefficient using Monte Carlo methods to approximate the
double integral.
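A Monte Carlo estimate of this kind can be sketched in a few lines of Python. Because $A(V_D(x, y)) / A(D)$ is the chance that a uniformly chosen point of $D$ is visible from $(x, y)$, the integral equals the probability that two independent uniform points of $D$ see each other. The sketch below is not the procedure of Hodge et al.: it assumes a district made of unit grid cells and tests visibility only approximately, by sampling points along the segment; all names and the two test districts are hypothetical.

```python
import math
import random

def visible(cells, a, b, steps=60):
    """Approximate test: sample points along segment ab and check each lies in the district."""
    for t in range(steps + 1):
        s = t / steps
        x = a[0] + s * (b[0] - a[0])
        y = a[1] + s * (b[1] - a[1])
        if (math.floor(x), math.floor(y)) not in cells:
            return False
    return True

def convexity_coefficient(cells, samples=5000, seed=0):
    """Estimate chi(D) as the fraction of random point pairs that are mutually visible."""
    rng = random.Random(seed)
    cell_list = sorted(cells)
    hits = 0
    for _ in range(samples):
        # Cells have equal area, so a uniform cell plus a uniform offset is uniform on D.
        (ai, aj), (bi, bj) = rng.choice(cell_list), rng.choice(cell_list)
        a = (ai + rng.random(), aj + rng.random())
        b = (bi + rng.random(), bj + rng.random())
        hits += visible(cells, a, b)
    return hits / samples

# Hypothetical districts on a 3 x 3 grid: a full square and a "C" shape.
square = {(i, j) for i in range(3) for j in range(3)}
c_shape = square - {(1, 1), (2, 1)}

print(convexity_coefficient(square), convexity_coefficient(c_shape))
```

The convex square yields an estimate of essentially 1, while the "C" shape scores noticeably lower because points in the two arms cannot see each other across the notch.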
As thus defined, this measure does not factor in the population dispersion of a
given district, nor does it take into account irregular state borders. They modify
this measure in two ways, first to account for irregular state borders, and then
to account for population distribution within the district. In order to account for
the population distribution, they replace random points with points selected to
represent random census blocks. We are not confident about the full implications
of this choice, but it can be said that in Minnesota (and many other states), a
random selection of census blocks would result in a large number of blocks chosen
with no inhabitants. In Minnesota, between one-third and one-half of the blocks
chosen using their method would have no inhabitants.
2.3. Examples. The aim of developing path-based population hybrid mea-
sures is to take into account both the state boundary and also distribution of pop-
ulation within the district. Hence, we now look at how the measures are affected
by changes in either the population of the census blocks, or the border of the state.
We use hypothetical “C”-shaped districts to illustrate these changes (see Figures
5 and 6). While this “C” shape is a model, Figure 4 shows that there are in fact
districts that resemble our hypothetical district.⁶
In each model “C”-shaped district, the light blue (lightest gray if reading the
black and white version of this paper) area is the “district,” and the gray (medium
gray) area is the “state.” In each example, the district is broken into square census
blocks, and the population labels each block. For convenience, every district is
taken to have total population 1600. In examples 2 and 3, the dark blue (darkest
gray) area is water, creating a boundary for the “state.” All three examples of
Figure 5 show a uniformly distributed population in this “C” shape. Examples 2
and 3 show how the boundary of the state near the boundary of the district affects
the compactness score. For each example, we calculate the compactness score for
several of the compactness measures described. “Corcoran” denotes the measure
⁶ http://nationalatlas.gov/printable/images/pdf/congdist/IL04_110
[Figure 5: Examples 1, 2, and 3. “C”-shaped districts with a uniformly distributed population (100 per census block).]
introduced in the previous section, and “C&M” denotes the discrete version of
Chambers and Miller’s measure evaluated for q = 1, and will be referred to simply as
the “Chambers and Miller measure” for the remainder of the section.
Example 3 demonstrates that when a district has no choice but to follow a state
border, both path-based measures (Corcoran, Chambers and Miller) give a perfect
score, provided there are no other irregularities. The other three measures do not
take into consideration the state border.
When the population is evenly distributed, both Chambers and Miller’s measure
and our measure will score any given district identically. Suppose that in a
district $D$, every census block $b_i$ has the same population $p$. Then, $P = kp$ and
hence this paper’s measure will give the score
\[
\frac{1}{2kP} \sum_{b_i \in D} \sum_{b_j \in D} \frac{d_S(x_i, x_j)}{d_D(x_i, x_j)} (p + p) = \frac{2p}{2k^2 p} \sum_{b_i \in D} \sum_{b_j \in D} \frac{d_S(x_i, x_j)}{d_D(x_i, x_j)} = \frac{1}{k^2} \sum_{b_i \in D} \sum_{b_j \in D} \frac{d_S(x_i, x_j)}{d_D(x_i, x_j)}.
\]
So, if each census block has the same population, there will be no difference in
the measures’ scores. However, while it is an interesting diversion, such a case is
hardly of practical concern. It is difficult to imagine a congressional district where
population is evenly distributed across census blocks.
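The agreement in the uniform case, and its failure otherwise, can be checked numerically. The Python sketch below assumes that the discrete Chambers and Miller measure for $q = 1$ weights each distance ratio by the product $p_i p_j$ and normalizes by the sum of all such products (which equals $P^2$), as the convex-district expression earlier in the section suggests; with equal populations $p$ this reduces to the same $\frac{1}{k^2}$ expression. The function names and the three-block data are hypothetical.

```python
def distance_ratio(d_state, d_district, i, j):
    """Ratio d_S / d_D, taken to be 1 when i == j (the convention adopted in the text)."""
    return 1.0 if i == j else d_state[i][j] / d_district[i][j]

def corcoran_score(populations, d_state, d_district):
    """This paper's measure: ratios weighted by p_i + p_j, normalized by 2kP."""
    k, total = len(populations), sum(populations)
    weighted = sum(
        distance_ratio(d_state, d_district, i, j) * (populations[i] + populations[j])
        for i in range(k) for j in range(k)
    )
    return weighted / (2.0 * k * total)

def chambers_miller_q1(populations, d_state, d_district):
    """Assumed q = 1 form: ratios weighted by p_i * p_j, normalized by P^2."""
    k = len(populations)
    weighted = sum(
        distance_ratio(d_state, d_district, i, j) * populations[i] * populations[j]
        for i in range(k) for j in range(k)
    )
    return weighted / float(sum(populations)) ** 2

# Hypothetical three-block district in which blocks 0 and 2 are twice as far
# apart inside the district as they are across the state.
d_state = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
d_district = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
uniform = [100, 100, 100]
skewed = [200, 50, 50]

for pops in (uniform, skewed):
    print(corcoran_score(pops, d_state, d_district),
          chambers_miller_q1(pops, d_state, d_district))
```

With uniform populations both functions return 8/9; with the skewed populations they return 1550/1800 and 8/9 respectively, so the two measures separate as soon as the blocks differ in size.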
If we turn our attention to the same “C” shaped district, but instead consider
non-uniform population distribution, we begin to see differences between our mea-
sure and Chambers and Miller’s measure (Figure 6). Examples 4 and 5 should be
compared to Example 1, above. Again, the other three measures fail to capture
any change.
In Example 5, we see that Chambers and Miller’s measure fails to penalize
for population dispersion. Indeed, as long as the population is concentrated in
a convex portion of the (most likely non-convex) district, Chambers and Miller
gives a perfect score of 1, while our measure identifies a potential manipulation of
boundaries. Some would argue that an irregular district boundary doesn’t matter
[Figure 6: Example 4 and Example 5. “C”-shaped districts with non-uniform population distributions.]
in this case, but we believe that it can be cause for concern. First, district
boundaries usually stay in place for a decade; it is possible to predict development
and growth (perhaps most notably in suburbs of large metropolitan areas) over such
a time frame and those in charge of redistricting may be privy to information that
could give motive for such line drawing. Second, in many states, line drawers are
bound to keep new districting plans as close as possible to old ones. In other words,
a strange empty, reaching arm drawn today could become a strange populated arm
in the future (and vice versa).⁷ Last, it is a challenge to find a districting plan
with a unique gerrymandered district; if one district is bizarrely shaped, chances
are that other nearby districts are too.
We end this section by remarking that a fair criticism is that the path-based
measures in Figures 5 and 6 give values that are very high (close to 1). Taking
different values of the parameter q (which can be done for both Corcoran and
Chambers and Miller) will result in a wider spread of values, and taking q = ∞
does so most dramatically.
A further criticism of the path-based measures is that any convex district is ranked highly. For example, an elongated rectangle in
California, stretching the entire state from southeast to northwest, would be given
a high score with the path-based measures discussed herein (Figure 7). How should
we evaluate the quality of measure(s) of compactness? Which measures should be
adopted during the redistricting process?
There have been several surveys of compactness measures, notably Young
(1988) and Niemi et al. (1990) (see also [1], [7]). Young surveyed eight different
tests of compactness, ranging from the simple “visual test” to the more
sophisticated Reock test. After looking at each of these measures and determining
their strengths and weaknesses, Young concludes with five desirable properties of
compactness measures (Table 1).
While far from exhaustive, each of these criteria describes a desirable facet of
measuring compactness. The first criterion suggests that compactness scores are
merely indicators of gerrymandering. As Figure 8 illustrates, gerrymandering can
occur with shapes that remain, at least facially, fairly compact. It is not the value
of a particular score that gives information about gerrymandering. Instead, we are
interested in the relative ordering of different plans determined by that particular
score.
[Figure 8. A grid of cells marked D and R in alternating positions.]
The second criterion is a much more pragmatic one. In redistricting, compactness
is most often applied when developing a plan, or when adjudicating a
gerrymandered district. Due to the reluctance of federal and state courts to inject
themselves into the political process of redistricting and the perceived difficulty
in fashioning an appropriate judicial remedy, gerrymanders other than the most
egregious are rarely the subject of litigation. Thus, it could be argued that the
primary use of compactness scores is in the creation of a districting plan which
must, necessarily, be considered as a whole.
The third criterion stems from another principle of redistricting, that of pre-
serving integrity of communities of interest and of political subdivisions. Under
this criterion, population-based measures are preferable to measures based only on
geometry.
The fourth criterion prevents discrimination between large rural and small ur-
ban districts, as rural districts are necessarily large to ensure equal population.
This criterion alone suggests that the so-called Perimeter measure, whose value is
merely the perimeter of the district, is a poor choice.
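A small numerical comparison makes the point concrete. The sketch below reads the fourth criterion as scale-independence, and contrasts the raw perimeter with the Polsby-Popper ratio 4πA/P² (used here only as a familiar scale-free quantity). The rectangular "districts" and the function names are illustrative assumptions.

```python
import math


def perimeter_measure(width, height):
    """Raw perimeter of a rectangular district."""
    return 2 * (width + height)


def polsby_popper(width, height):
    """Polsby-Popper ratio 4 * pi * area / perimeter^2 for a rectangle."""
    area = width * height
    perimeter = 2 * (width + height)
    return 4 * math.pi * area / perimeter ** 2


# Two hypothetical districts with the same shape at different scales.
urban = (2.0, 1.0)
rural = (200.0, 100.0)

print(perimeter_measure(*urban), perimeter_measure(*rural))
print(polsby_popper(*urban), polsby_popper(*rural))
```

The two rectangles have identical shape, yet the raw perimeter differs by a factor of 100, while the ratio is unchanged by rescaling.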
The fifth criterion, while still desirable, is anachronistic insofar as computers
have become more and more powerful, and redistricting software more and more
common. However, overly complicated measures can be difficult to implement and
interpret, which means that only a small cadre of experts can be relied upon to
detect and evaluate gerrymanders. As a result, this criterion helps ensure that
transparency remains a priority in the redistricting process.
Niemi et al.’s survey two years later had a broader scope, examining a longer
list of existing measures. Furthermore, they classify the measures based on the
main ingredient of compactness they utilize. After categorizing the measures, the
authors, like Young, reflect and give criteria for compactness measures (Table 2).
For the first criterion, one need only look at Colorado and Maryland side by
side to justify its inclusion. For nearly every measure, the districts of Maryland
will be less compact than the districts of Colorado. Maryland, of course, has a
jagged, incising coastline which skews the score of most compactness measures.
But these are foregone conclusions, as state borders do not change and congressional
districts are subject to these boundaries. So, using most traditional compactness
measures, comparisons across states are inappropriate. With the introduction of
path-based measures, we can examine anew the importance of this criterion and
start to compare scores across states.
The second criterion is essentially the same as Young’s first criterion.
The third criterion is again a standard application of traditional districting
principles, which have been affirmed and reaffirmed by the Supreme Court.
The fourth criterion is a commonsense acknowledgement of the imperfection of
compactness measures applied alone. As the meandering district of Figure 2 and
the rectangular district in California of Figure 7 demonstrate, there are classes of
shapes that each compactness measure fails to identify as non-compact. So, scoring
poorly by one measure should not be cause for concern. However, a district that
scores poorly by several compactness measures is clearly less compact than one that
scores poorly by only one measure. Thus, the use of multiple measures is a practical
safeguard against improperly identifying a district as compact or non-compact.
Broadly speaking, these criteria can be classified into two categories: imple-
mentation criteria and design criteria. The former are used when creating and/or
evaluating a districting plan; the latter are used when evaluating the quality of a
measure itself. The implementation criteria are Young’s first and second, and all
four of Niemi et al.’s criteria. The design criteria are Young’s third, fourth, and
fifth. Even though the traditional measures can be applied successfully according
to the implementation criteria, many do not stand up well with respect to the de-
sign criteria. The emergence of the class of path-based population hybrid measures
gives evidence that the design criteria are gaining importance as this field develops
and matures.
Assessing compactness is further complicated by a lack of acknowledgement of
implementation criteria by state and local entities. In practice, most states require
districts to be compact, but many give no particular measure or measures to be
used.
We now turn to the recent bout of redistricting in Minnesota for an illustration
of what can happen. On November 4, 2011, an order8 “stating principles and
requirements for Minnesota plan submissions” was filed; it requires inclusion of
a report
stating the results of Reock, Schwartzberg, Perimeter, Polsby-
Popper, Length-Width, Population Polygon, Population Circle,
and Ehrenberg measures of compactness for each district.
Consider two plans submitted to the Minnesota Commission, dubbed Hippert and
Martin (Table 3).
Recall that, for each measure, numbers closer to 1 indicate a more compact district.
Note that the plans look equally good with Reock, Polsby-Popper favors Martin,
while the population circle (and necessarily Schwartzberg) favor Hippert. In other
words, we can have two plans A and B for a given state, and two compactness
measures m1 and m2, so that A is more compact than B using m1, yet B is more
compact than A using m2. This example illustrates the reason why Niemi et al.’s
fourth criterion is important, but also shows that the use of multiple measures
might in fact lead to confusion within the political conversation.
8 In spring of 2011, the MN legislature presented a state redistricting plan to Gov. Mark
Dayton who quickly vetoed it. A gubernatorial veto also occurred in 3 of the last 4 decades
(1971, 1981 and 2001). In response to the veto, the MN Supreme Court Chief Justice appointed a
special 5 member Judicial Commission to come up with a plan by February 21, 2012, in case the
legislature continued to fail to produce a plan that the Governor could approve. The November 4
order was given by the Judicial Commission 3 months prior to their deadline for choosing a plan,
and only 2 weeks before their November 18, 2011 deadline for receiving plans from groups. The
order can be found at http://www.mncourts.gov/?page=4469
On the one hand, Minnesota meets the implementation standards of Young
and Niemi et al. On the other hand, the measures used could be criticized for their design.
For example, the Perimeter score for a given district is simply the perimeter of that
district and clearly violates Young’s fourth criterion.
The Supreme Court’s 1993 endorsement of traditional districting principles in
Shaw v. Reno solidifies the role that compactness measures will play in future
redistricting. Most existing measures are geometrically based, and some take into
consideration the spread of the population within the districts. We agree with pre-
vious authors that compactness measures should be used only to compare amongst
plans, and several measures should be applied; they cannot be used to determine
whether or not a single district has been gerrymandered.
In terms of the criteria as outlined by Young and Niemi et al., how do path-
based population hybrid measures hold up? These measures consider the shape –
and not size – of districts; they can be used to measure an entire plan (by consid-
ering the range and distribution of scores for districts in the plan); census blocks
are used and block populations and geographic centers are freely available online.
Importantly, these measures stand apart from the other measures in that they can
be used to compare across states, since state borders are taken into account.
In the end, compactness cannot ensure fair representation. This said, measures
of compactness should be used to assess districting plans, in conjunction with other
tools. Further, it may be that, in the future, information about the demographics
of the district population (besides simple distribution) will play increasingly
important roles as ingredients in compactness measures. A strength of all path-based
measures, which iterate over voters or census blocks, is the potential for population
data stratification. The idea is that instead of trying to find gerrymanders in
general, we can look for specific types of gerrymanders. Instead of simply using
raw population, we can stratify by political leaning, race, or any other cognizable
political grouping that can be used in gerrymandering. For instance, a score can
be computed for only Democrats in the district. This can then be compared to a
score for Republicans, or raw population. If the Democratic score is significantly
lower, it may indicate a Democratic gerrymander; the shape of the district may
have been manipulated to pack more Democrats into the district at the expense of
compactness. While this causal relationship might not always hold, the potential
for stratification could have a great impact on gerrymander identification. Thus,
there may be more states that move away from using simply geometrically-based
compactness measures as a way to control and detect gerrymandering.
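The stratification idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it assumes a pair-weighted score of the form (1/(2kN)) Σ_i Σ_j (n_i + n_j) d_S/d_D, where n_i is the number of members of a group in block i and N is the group's total, with self-pairs counted as ratio 1. The block counts, distance matrices, and function names are all hypothetical.

```python
def group_score(counts, d_s, d_d):
    """Pair-weighted average of d_S / d_D, weighting block i by counts[i]."""
    k = len(counts)
    total = sum(counts)
    acc = 0.0
    for i in range(k):
        for j in range(k):
            # Assumption: a block paired with itself contributes ratio 1.
            r = 1.0 if i == j else d_s[i][j] / d_d[i][j]
            acc += (counts[i] + counts[j]) * r
    return acc / (2 * k * total)


def stratified_scores(blocks, d_s, d_d):
    """One score per group, computed from per-block head counts."""
    groups = blocks[0].keys()
    return {g: group_score([b[g] for b in blocks], d_s, d_d) for g in groups}


# Hypothetical district of three blocks with per-block party counts.
BLOCKS = [{"D": 10, "R": 40}, {"D": 10, "R": 40}, {"D": 80, "R": 20}]
D_S = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
D_D = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

scores = stratified_scores(BLOCKS, D_S, D_D)
print(scores)
```

In this toy data the Democratic score comes out lower than the Republican one, which is the kind of gap the text suggests examining; whether such a gap signals manipulation is a separate judgment.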
References
[1] M. Altman, Districting Principles and Democratic Representation, California Institute of
Technology, 1998.
[2] D. Bauer, R. Cheetham, and T. Manik-Perlman, Redrawing the Map on Redistricting 2010: A
National Study (2010), available at http://www.azavea.com/research/company-research/
[3] J.L. Carson, M.H. Crespin, C.J. Finocchiaro, and D.W. Rohde, Redistricting and Party Polar-
ization in the U.S. House of Representatives, American Politics Research 35 (2007), 878–904.
[4] Census Blocks and Block Groups, Geographic Areas Reference Manual, United States Census
Bureau, 1994.
[5] C.P. Chambers and A.D. Miller, A Measure of Bizarreness, Quarterly Journal of Political
Science 5 (1) (2010), 27–44.
[6] Christopher P. Chambers and Alan D. Miller, Measuring legislative boundaries, Math. Social
Sci. 66 (2013), no. 3, 268–275, DOI 10.1016/j.mathsocsci.2013.06.001. MR3128484
[7] R.G. Fryer Jr. and R.T. Holden, Measuring the Compactness of Political Districting Plans,
National Bureau of Economic Research, Cambridge, MA, 2007, available at
http://papers.nber.org/papers/w13456
[8] Jonathan K. Hodge, Emily Marshall, and Geoff Patterson, Gerrymandering and convexity,
College Math. J. 41 (2010), no. 4, 312–324, DOI 10.4169/074683410X510317. MR2682920
(2011g:91059)
[9] T. Hofeller and B. Grofman, Comparing the Compactness of California Congressional Dis-
tricts under Three Different Plans: 1980, 1982, and 1984, Political Gerrymandering and the
Courts, Agathon, New York, 1990, 281–288.
[10] Karcher v. Daggett, 462 U.S. 725 (1983).
[11] J. Levitt and E.L. Wood, A Citizen’s Guide to Redistricting, Brennan Center for Justice at
New York University School of Law, New York, 2010.
[12] M.D. McDonald and R.L. Engstrom, Detecting Gerrymandering, Political Gerrymandering
and the Courts, Agathon, New York, 1990, 178–202.
[13] R. Morrill, A Geographer’s Perspective, Political Gerrymandering and the Courts, Agathon,
New York, 1990, 212–239.
[14] R.G. Niemi, B. Grofman, C. Carlucci, and T. Hofeller, Measuring Compactness and the Role
of a Compactness Standard in a Test for Partisan and Racial Gerrymandering, The Journal
of Politics 52 (4) (1990), 1155–1181.
[15] L. Papayanopoulos, Democratic Representation and Apportionment: Quantitative Princi-
ples Underlying Apportionment Methods, Annals of the New York Academy of Sciences 219
(1973), 181–191.
[16] D.D. Polsby and R.D. Popper, The Third Criterion: Compactness as a Procedural Safeguard
against Partisan Gerrymandering, Yale Law & Policy Review 9 (2) (1991), 301–353.
[17] E.C. Reock, A Note: Measuring Compactness as a Requirement of Legislative Apportionment,
Midwest Journal of Political Science 5 (1) (1961), 70–74.
[18] Reynolds v. Sims, 377 U.S. 533 (1964).
[19] Shaw v. Reno, 509 U.S. 630 (1993).
[20] J.E. Schwartzberg, Reapportionment, Gerrymanders, and the Notion of “Compactness”,
Minnesota Law Review 50 (1965), 443–448.
[21] R.S. Stern, Political Gerrymandering: A Statutory Compactness Standard as an Antidote
for Judicial Impotence, The University of Chicago Law Review 41 (2) (1974), 398–416.
[22] H.P. Young, Measuring the Compactness of Legislative Districts, Legislative Studies
Quarterly 13 (1) (1988), 105–115.
Mathematics Department, St. Paul Academy and Summit School, St. Paul, Min-
nesota 55105
E-mail address: [email protected]
1. Introduction
Redistricting is the political practice of dividing states into electoral districts
of equal population. It is mandated to occur every ten years, after the census,
to ensure equal representation in the legislative body. Where the boundaries are
drawn can dramatically alter the number of districts a given political party can
win. As a result, a political party that controls the legislature can
(and does) manipulate the boundaries to win a larger number of districts, thus
affecting the balance of power in the U.S. House of Representatives. This kind of
boundary manipulation occurs even with certain legal and legislative constraints
that restrict some aspect of how districts can be drawn and mandate that, where
appropriate, districts should be created to have a majority of voters consisting of
a racial minority. (See [11] for a detailed summary of these constraints.)
The ability of one political party to gain political advantage by carefully choos-
ing the boundaries during redistricting has been recognized as a serious problem
with the redistricting process in the United States; we shall refer to this as the prob-
lem of partisan unfairness. Attempts to fix and/or mitigate the problem of partisan
unfairness (beyond the legal restrictions) have taken one of two approaches: try-
ing to constrain the process to reduce the amount of political gain achievable, and
18 ZEPH LANDAU AND FRANCIS EDWARD SU
trying to remove politics from the redistricting process. Examples of the first ap-
proach include attempting to limit the power of the drawing party by more strictly
prescribing the allowable shapes of districts, and banning the use of registration
and voting data within the redistricting process. Examples of the second approach
include assigning the task of redistricting to bipartisan or non-partisan panels1 ,
and using computer programs to generate redistricting maps that optimize certain
carefully chosen criteria.
Landau-Reid-Yershov [6] took a different approach to provide a novel solution
to the problem of partisan unfairness: rather than trying to fix the problem by
restricting the shape of the possible maps or by assigning the power to draw the
map to nonbiased entities, their solution ensures fairness by balancing competing
interests against each other. This kind of solution is an example of what are known
as “fair division” solutions— such solutions account explicitly for the preferences of
all parties, are determined by a procedure in which all parties are actively involved,
and are accompanied by rigorous guarantees of a specified notion of fairness.
The goal of this article is to provide an exposition of this redistricting method
in the context of a detailed sample “map”, and make a stronger connection to the
ideas of fair division than is provided in [6]. In particular, we propose a specific
notion of fairness that was used but not made explicit in [6]. This notion of fairness
can be used in concert with (not substitute for) other necessary or desired criteria
for a good redistricting solution. We clarify how fair division ideas can play an
important role in a realistic redistricting solution by introducing an interactive step
that involves multilateral evaluation, procedural fairness, and fairness guarantees.
And by making the bridge between fair division ideas and redistricting solutions
more explicit, we hope to encourage the flow of ideas between the two areas.
We begin with an introduction to the ideas of fair division in Section 2. We
describe the problem of partisan unfairness in more detail in Section 3. Despite the
ability to easily recognize the “unfairness” inherent in the redistricting process, it
has been hard to give a reasonable definition of what would be fair. In Section 4
we give an explicit definition of fairness—the geometric target—that incorporates
geometric considerations such as how constituent voters are distributed and how
districts are shaped. Section 6 examines the protocol of [6] by analyzing its behavior
in a specific example that demonstrates most aspects of the solution. Within this
example, the fairness of the protocol is discussed in detail.
We stress that the ideas discussed here are well suited to be combined with
other necessary or desired criteria for a good redistricting solution. Our definition
of fairness involving geometric targets in Section 4 can incorporate independently
desired requirements for district shape or competitiveness. The protocol of [6] can
be easily adjusted to approximate fairness under these additional requirements. If a
solution involving an independent commission is desired, these definitions of fairness
can be used as a target for the commission. Similarly, if a computer-assisted solution
that consists of optimizing some function is sought, this measure of fairness can be
used as a component of the function to be optimized. Separately, the simple fair
division ranking protocol given in Section 5 can be used to incorporate some degree
of legislative preference within any proposed redistricting solution that otherwise
does not include legislative influence.
2. Fair Division
The problem of fair division, as Steinhaus [14] put it, is essentially a question
of how to divide some object fairly. Usually, this object is affectionately referred to
as cake [12], but in general it could be desirable or undesirable (e.g., the division
of chores [9]) or a mixture of desirable and undesirable goods [15]. The cake may
be infinitely divisible (as we usually regard real cake) or only divisible into discrete
pieces (such as pieces of an estate). Applications of fair division ideas include
methods for resolving international disputes and divorce settlements [3].
There are several notions of fairness that one might consider, but an important
aspect of fair division problems is that this fairness notion is evaluated by the parties
involved in the negotiation, rather than an outside arbiter. Thus, the outcome of
a fair division procedure will give each party what it considers to be a fair share,
according to its own evaluation.
The simplest example of a fair division procedure is the familiar “I-cut-you-choose”
method for dividing a cake between two people. One might consider a division fair
if neither party envies the other; we call such a solution an envy-free
solution. Again, note that envy is measured by each party according to its
own evaluation. If one person cuts the cake (into two pieces that she is indifferent
between) and the other person is allowed to choose first (picking the piece that he
most desires), then both people will end up with a piece for which they experience no
envy. This simplest of all fair division procedures already highlights some interesting
features common to all fair division procedures:
(1) Multilateral evaluation. Fairness is evaluated according to each party’s
own preferences. Therefore, parties don’t have to agree on what is valu-
able; each will obtain a share they would consider fair in their own esti-
mation (and they do not need to know the other party’s preferences).
(2) Procedural fairness. There is a process by which preferences are elicited,
all parties are involved in the process, and they understand the criteria
by which fairness is measured. Because of this, parties are more likely
to feel that the process is fair, more so than a decision imposed by an
outside arbiter (see e.g., [13]). The procedure guides parties to a mutually
acceptable division.
(3) Fairness guarantee. By following the procedure, as long as you tell the
truth about your preferences, you will obtain what you feel is fair (even
if everyone else lies). Thus there is an incentive to be truthful (and if you
lie about your preferences, it can backfire).
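The cut-and-choose procedure is simple enough to simulate. The sketch below is one possible model, not taken from the text: the cake is the interval [0, n], and each person's preferences are a hypothetical piecewise-constant density with one value per unit segment. The function names and the two densities are assumptions made for illustration.

```python
from fractions import Fraction


def value(density, a, b):
    """Value of the interval [a, b] under a piecewise-constant density."""
    total = Fraction(0)
    for i, d in enumerate(density):
        lo, hi = max(a, i), min(b, i + 1)
        if hi > lo:
            total += d * (hi - lo)
    return total


def cut_point(density):
    """Point x at which the cutter values [0, x] and [x, n] equally."""
    half = value(density, 0, len(density)) / 2
    acc = Fraction(0)
    for i, d in enumerate(density):
        if d > 0 and acc + d >= half:
            return i + (half - acc) / d
        acc += d
    return Fraction(len(density))


def cut_and_choose(cutter, chooser):
    """The cutter splits the cake; the chooser takes the piece it prefers."""
    n = len(cutter)
    x = cut_point(cutter)
    left, right = (0, x), (x, n)
    if value(chooser, *left) >= value(chooser, *right):
        return {"cutter": right, "chooser": left}
    return {"cutter": left, "chooser": right}


# Hypothetical preferences: the cutter is indifferent across the cake,
# while the chooser only cares about the right-hand end.
CUTTER = [1, 1, 1, 1]
CHOOSER = [0, 0, 1, 3]

alloc = cut_and_choose(CUTTER, CHOOSER)
print(alloc)
```

In this example the cutter ends up with exactly half of the cake by her own measure, and the chooser with all of it by his, so neither envies the other even though their perceived shares differ.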
A reader might object to the cake-division solution above, because
the cutter only gets what she perceives to be 50 percent of the cake in her valuation,
while the other person might end up with more. The solution is envy-free (neither
person envies the other person’s share) but it is not equitable, meaning the perceived
share of cake each player gets (in his or her own valuation) may differ. This is not a deficit
of the procedure (which only guaranteed envy-freeness and not equitability) as much
as it is a fault in choice of procedure. An active area of research in mathematics
[12], economics [7], and political science [3] is the development of fair division
procedures in various settings and with various fairness criteria.
A more interesting fair division solution that has found application by practi-
tioners is the Adjusted Winner procedure of Brams and Taylor [4]. It is a procedure
for dividing a set of goods between two parties in such a way that the division is:
envy-free, equitable, and efficient (or Pareto-optimal). The last criterion means
that there is no division that dominates the given one, i.e., there is no other division
that is at least as good for both parties and strictly better for one party. Thus
the Adjusted Winner solution gives each party a share for which they do not wish
to trade shares, and in which they feel they got just as good a portion as the other
party feels it got, and there is no other solution that dominates. At most one of
the goods may have to be divided in the procedure (though one cannot predict
beforehand which good it is). We note that if there are more than two parties,
it may not always be possible to satisfy these three properties in cake division, as
discussed in [2].
The Adjusted Winner procedure has found application in divorce settlements
[4] because of its fairness guarantees as well as its ease of use. In the procedure,
each party is given 100 points to distribute over the objects. This is
the part of the procedure where preferences are elicited. Then objects are initially
given to the party that values them more; such a division is efficient, but it may not
be envy-free or equitable. Call the party who ends up with the larger fractional
share (in its own evaluation) the winner, and the other, the loser. In the next phase
of the procedure, the assignment of goods is “adjusted” by transferring goods from
the winner to the other party in a particular order until both fractional shares are
equalized.
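The two phases just described can be sketched as follows. This is a hedged illustration, not the implementation of [4]. The transfer order used here (increasing ratio of the winner's points to the loser's points) is the one given in Brams and Taylor's published description, which the text above summarizes only as "a particular order". The tie rule in the first phase, the function name, and the point assignments are assumptions.

```python
from fractions import Fraction


def adjusted_winner(a_pts, b_pts):
    """Return (a_share, a_total, b_total); a_share[i] is A's fraction of item i."""
    n = len(a_pts)
    # Phase 1: each item goes to the party that values it more
    # (assumption: ties go to A).
    a_share = [Fraction(1) if a_pts[i] >= b_pts[i] else Fraction(0) for i in range(n)]
    a_total = sum(a_pts[i] * a_share[i] for i in range(n))
    b_total = sum(b_pts[i] * (1 - a_share[i]) for i in range(n))

    a_is_winner = a_total >= b_total
    win, lose = (a_pts, b_pts) if a_is_winner else (b_pts, a_pts)
    w_total, l_total = (a_total, b_total) if a_is_winner else (b_total, a_total)
    held = [i for i in range(n) if (a_share[i] == 1) == a_is_winner]

    # Phase 2: transfer the winner's items in increasing order of
    # win[i] / lose[i]; items the loser values at 0 come last.
    held.sort(key=lambda i: (lose[i] == 0, Fraction(win[i], lose[i]) if lose[i] else 0))
    for i in held:
        if w_total == l_total:
            break
        if w_total - win[i] >= l_total + lose[i]:
            t = Fraction(1)  # the whole item can move without overshooting
        else:
            t = (w_total - l_total) / (win[i] + lose[i])  # split this item
        w_total -= t * win[i]
        l_total += t * lose[i]
        a_share[i] += -t if a_is_winner else t

    a_total, b_total = (w_total, l_total) if a_is_winner else (l_total, w_total)
    return a_share, a_total, b_total


# Hypothetical point assignments over three items (100 points each).
A_POINTS = [50, 30, 20]
B_POINTS = [20, 30, 50]

shares, a_total, b_total = adjusted_winner(A_POINTS, B_POINTS)
print(shares, a_total, b_total)
```

Here only the second item is split, and both parties finish with the same number of their own points, which is the equitability property described above.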
The Adjusted Winner procedure has the three features described above for a fair
division procedure. It has a fairness guarantee: what results is an outcome that
is provably envy-free and efficient, in addition to being equitable. It relies on
multilateral evaluation: the preferences of both parties are taken into account, and
the resulting division achieves the fairness guarantee for both parties using their own
estimation. And it is procedurally fair: parties using the Adjusted Winner procedure
can understand and verify the fairness guarantees for a particular solution (without
having to understand the proofs); because they participated in the procedure by
stating their preferences, they are more likely to feel that the outcome is fair.
As we shall see, these three fair division features can inform current thinking
about redistricting, and they can be combined with other desired criteria
for a good redistricting solution. They underlie the redistricting procedure of [6]
that we will now explain. First, we will explain the problem of partisan unfairness
that [6] attempts to address.
however, in most cases, the drawing party can still win a significantly larger per-
centage than X% of the districts, even with only partial knowledge of voting trends.
This is not just a theoretical issue, as has often been demonstrated when the
party in control changes. We cite two examples:
• When Republicans took control of the Texas legislature in 2002, they
redrew state districts mid-decade, and the Texas delegation changed from
15 Republicans and 17 Democrats to 22 Republicans and 10 Democrats [8].
• In Michigan, the 2000 election produced 7 Republican representatives and
9 Democratic representatives. After the census, a new district map was
drawn, resulting in 9 Republican representatives and 6 Democratic
representatives in the 2002 election (Michigan lost 1 seat due to the census) [8].
This ability of one party to draw districts in such a way as to gain political
advantage is viewed as one of the major problems with redistricting in the United
States; we shall refer to this as the problem of partisan unfairness. The districting
protocol proposed in [6] avoids this inherent unfairness by ensuring that either
party can win a percentage of districts that is very close to their fair share.
Implicit in calculating Rwin is a voting model V that the party is using to predict
which districts it will win.
In reality, a party’s interests may be much more complicated than just the
number of districts that it wins. A more general rating system is one where a party
could rate the desirability of each district in a division (assigning it a number), then
sum these numbers over all the districts to give a rating for the division. As in [6],
we shall refer to such a rating system as an additive rating system, and denote any
particular instance of it as Rsum. The rating system Rwin is a special case of Rsum
in which a party assigns a 1 to districts it expects to win and a 0 to those it expects
to lose.
The more general rating system Rsum allows a party to take other considera-
tions into account. Politically, these can be important, as the following examples
demonstrate:
• perhaps some district has an incumbent who is on an important congres-
sional committee, so winning that district is more valuable to the party
(hence rated higher than other winnable districts),
• perhaps some district has an important landmark (a stadium or a con-
struction project) worth more to a party than some other district,
• perhaps some district encompasses the supporters of two incumbents from
the opposition party, so that even though the district will be lost, the
elimination of one strong incumbent from the other party is valuable.
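The relationship between Rsum and Rwin can be made concrete with a short Python sketch. Everything named here is hypothetical: the four parcels, their vote shares, and the bonus attached to one district are invented for illustration, and parcels are assumed to have equal population so that a district's vote share is a plain average.

```python
def r_sum(division, district_value):
    """Additive rating: sum a per-district value over the districts of a division."""
    return sum(district_value(district) for district in division)


def make_win_value(vote_share):
    """Per-district value for Rwin: 1 if the party expects a majority, else 0.

    Assumes equal-population parcels, so a district's share is a plain average.
    """
    def value(district):
        share = sum(vote_share[p] for p in district) / len(district)
        return 1 if share > 0.5 else 0
    return value


# Hypothetical voting model: the party's expected vote share in each parcel.
VOTE_SHARE = {"a": 0.6, "b": 0.6, "c": 0.3, "d": 0.3}
win_value = make_win_value(VOTE_SHARE)


def incumbent_value(district):
    """Hypothetical Rsum: a won district containing parcel 'a' counts double."""
    bonus = 1 if "a" in district else 0
    return win_value(district) * (1 + bonus)


together = [("a", "b"), ("c", "d")]  # the party's strong parcels share a district
split = [("a", "c"), ("b", "d")]     # the strong parcels are separated

print(r_sum(together, win_value), r_sum(split, win_value))
print(r_sum(together, incumbent_value), r_sum(split, incumbent_value))
```

The same `r_sum` function serves both ratings; only the per-district value changes, which is the sense in which Rwin is a special case of Rsum.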
Equipped with the notions of viable division, voting model, and rating system,
we can now define our fairness notion:
The absolute geometric target is a definition of fairness that takes into account
geometric constraints. There are several compelling reasons why this is a good
definition. First, the definition seems conceptually fair as it lies exactly between
the best and worst outcome (in terms of number of districts won) for each party.
Second, when there are no geometric constraints, the absolute geometric target
coincides with the percentage of constituent voters—as already mentioned, if the
minority party has X% of the vote, its best outcome is to win about 2X% of the
districts, while its worst outcome is to lose all the districts (0%), and so the absolute
geometric target in this case would be approximately $\frac{2X + 0}{2} = X\%$ of the districts.
Third, because this definition uses only viable district maps, it incorporates geo-
metric constraints by restricting attention to realizable outcomes. For instance, in
the example at the beginning of this section (with the homogeneous 60% - 40%
electorate split), both the best and worst rating for the minority party would be to
win 0 districts which is the only possible outcome (and thus fair) in that case. We
remark that the absolute geometric target could be used as a target for fairness for
independent commissions.
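The second reason can be checked with a small calculation. The sketch below takes the target to be the midpoint of the best and worst outcomes, following the description above, and assumes no geometric constraints at all: any voter may be placed in any district, there are only two parties, and district sizes are odd so that ties cannot occur. The function names and the numbers are hypothetical.

```python
def best_and_worst(n_districts, district_size, supporters):
    """Most and fewest districts a party can win with no geometric constraints.

    Assumes two parties and an odd district_size (no tied districts).
    """
    need = district_size // 2 + 1  # votes needed for a strict majority
    opponents = n_districts * district_size - supporters
    best = min(n_districts, supporters // need)
    worst = n_districts - min(n_districts, opponents // need)
    return best, worst


def geometric_target(best, worst):
    """Midpoint between the best and worst outcomes."""
    return (best + worst) / 2


# A minority party with 30% of the voters in 100 districts of 101 voters each.
best, worst = best_and_worst(n_districts=100, district_size=101, supporters=3030)
print(best, worst, geometric_target(best, worst))

# Best and worst outcomes of 4 and 1 districts give a target of 2.5.
print(geometric_target(4, 1))
```

With 30% of the vote the party can win at most 59 of the 100 districts and may lose all of them, so the midpoint is close to 30, in line with the X% claim.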
The protocol for districting proposed in [6] (and described subsequently) allows
each party the opportunity to achieve an outcome that is close to their own absolute
geometric target. Moreover, it allows each party the opportunity to achieve an
FAIR DIVISION AND REDISTRICTING 23
outcome that is close to a geometric target with respect to any voting model V and
any additive rating system R.
Note that the geometric target with respect to a voting model V and a rating
system R is a notion that captures the fair division principle of multilateral evalua-
tion: that party preferences should be taken into account. Each party has its own
geometric target, based on its own voting model V and rating system R (derived
from its preferences). This is to be distinguished from any absolute notions of fair-
ness that might be imposed by an external arbiter (including the absolute geometric
target).
We shall soon see that the districting protocol of [6] will, in addition, possess the
other features of a fair division procedure— procedural fairness, and a fairness
guarantee (see Section 2).
6.1. The State. Consider the map in Figure 1, which represents a state con-
sisting of 25 parcels, which we are thinking of as indivisible units (here, they are
rectangles or squares). Each parcel contains the same number of people; thus the
smaller the parcel, the denser the population. Loosely, we can think of this state
as having a city located at the small squares (T,U,V,W,X,Y), with suburban areas
surrounding the city, and the remaining areas rural.
Suppose our goal is to divide this state into 5 districts, each containing exactly
5 parcels.
[Figure 1. A state consisting of 25 parcels, labeled A through Y.]
Suppose that the voting outcome (Vout ) of the ensuing election is given by
Figure 2, which shows in each parcel the percentage of votes that party A receives.
[Figure 2. The percentage of votes that party A receives in each parcel.]
The ability to use the redistricting process to gain political advantage relies
on the ability to predict some features of Vout . In general, of course, Vout is not
known precisely at the time districts are being re-drawn. However, the combination
of data from previous elections and opinion polling increasingly gives a more and
more accurate model of how votes will be distributed. For the purposes of this
example we will assume that the working voting model V of both parties coincides
with Vout in Figure 2. In this example, party A has a slim statewide majority,
receiving 50.12% of the total vote.
For this particular example, we will assume that the only thing the two parties
care about is maximizing the number of districts they can win; thus their preferences
are diametrically opposed with each having rating system Rwin . We emphasize that
this is an assumption we make for this example but that the protocol is designed
to work under much more general preferences—the additive ratings system Rsum ,
discussed earlier.
Notice that even though parties A and B each have approximately half the
voters over the state, if either party is given complete control of the district-drawing
process, they can draw districts so that they are the majority in 4 of the 5 districts.
See Figure 3. In this example, the absolute geometric target for either party is
$\frac{1+4}{2} = 2.5$ districts.
Figure 3. The left diagram shows a division in which A can win
4 districts. The right diagram shows a division in which B can win
4 districts. Districts that A wins are shaded.
6.2. The Protocol. There are three core steps to the redistricting protocol
along with an augmenting fourth step. After outlining them in general, we will
work through each step in the above example.
• Split Sequence Generation. This step is performed by the independent
agent I. The agent generates a sequence of so-called k-splits: a k-split
is a division of the state into two pieces (piece 1 and piece 2) such that
the population within piece 1 totals the number of people in k districts.
The independent agent I generates a split sequence: a 1-split, a 2-split, a
3-split, etc. with each split building on the previous so that piece 1 of the
j-split contains piece 1 of the (j − 1)-split for all j.
• Preference. For each of the k-splits, the two parties are each asked which
of the following options they would prefer:
(1) to have party A divide piece 1 of the split into k districts and have
party B divide piece 2 into n−k districts (where n is the total number
of districts).
(2) to have party B divide piece 1 of the split into k districts and have
party A divide piece 2 into n − k districts.
Each party has the option of saying that they are indifferent to the
two choices. (This additional option is a modification of the original
protocol in [6].)
• Resolution. If there exists an i-split such that parties A and B both
prefer the same option in the preference step above then create a map using
that option. If there exists an i-split such that one party is indifferent,
then create a map using the option selected by the party that was not
indifferent. If there exists an i-split such that both parties are indifferent,
then create a map by randomly choosing one of the options for that i-split.
If none of the above scenarios occur it means that the parties have
opposite preferences for each i. Find the first i0 , 1 ≤ i0 ≤ n − 2 for which
party A prefers option (2) for i = i0 and switches preferences to option
(1) when i = i0 + 1. (This scenario is guaranteed to occur at least once
since party A prefers option (2) when i = 1 and prefers option (1) when
i = n − 1.) Randomly choose to divide the state from the following four
prescriptions:
i. use option (1) for the i0-split,
ii. use option (2) for the i0-split,
iii. use option (1) for the (i0 + 1)-split,
iv. use option (2) for the (i0 + 1)-split.
• Augmentation. We perform the above 3 steps for a number of different
split sequences to produce a number of divisions of the state. We then
use the ranking protocol of Section 5 to choose among these maps, i.e.
each party ranks the divisions (from best to worst) according to their own
preferences, and the division whose worst ranking (among both parties)
is highest is the one that is chosen. If there are two such splits, select one
of them at random.
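The Resolution step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of [6]: the function name `resolve`, the encoding of answers as 1, 2, or `None` (indifferent), and the rule of taking the first qualifying split when several qualify are all hypothetical choices, since the text does not say which split to use in that case.

```python
import random

def resolve(prefs_a, prefs_b, rng=random):
    """Return (k, option) for one split sequence.

    prefs_a[k-1] and prefs_b[k-1] are each party's answer for the k-split:
    1 for option (1), 2 for option (2), None for "indifferent".
    Assumption: when several splits qualify, the first one is used.
    """
    pairs = list(zip(prefs_a, prefs_b))
    # Both parties prefer the same option for some split.
    for k, (a, b) in enumerate(pairs, start=1):
        if a is not None and a == b:
            return k, a
    # Exactly one party is indifferent: use the other party's choice.
    for k, (a, b) in enumerate(pairs, start=1):
        if (a is None) != (b is None):
            return k, a if a is not None else b
    # Both parties are indifferent: pick an option at random.
    for k, (a, b) in enumerate(pairs, start=1):
        if a is None and b is None:
            return k, rng.choice([1, 2])
    # Opposite preferences everywhere: find the first i0 at which party A
    # switches from option (2) to option (1), then pick one of four prescriptions.
    for i0 in range(1, len(prefs_a)):
        if prefs_a[i0 - 1] == 2 and prefs_a[i0] == 1:
            return rng.choice([(i0, 1), (i0, 2), (i0 + 1, 1), (i0 + 1, 2)])
    raise ValueError("party A never switches from option (2) to option (1)")
```

For a state with n = 5 districts there are four splits, so each preference list has four entries; opposite preferences such as `[2, 2, 1, 1]` for A and `[1, 1, 2, 2]` for B lead to a random choice among the four prescriptions around i0 = 2.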
6.3. The protocol in action. We now show how the protocol works for the
example described in Figure 2.
Split Sequence Generation step. Suppose the Split Generation step yields the
split sequence in Figure 4. In each diagram piece 1 will be the left piece and piece
2 will be the right piece.
Preference step. For the 1-split, we exhibit in Figure 5 a possible way each
party can optimize its own interests in each of the two options. In option (1), A
divides piece 1 and B divides piece 2. In option (2), B divides piece 1 and A divides
piece 2.
Figure 5. Two options for the 1-split. The first diagram is option
(1), in which A divides the left piece and B divides the right piece.
The second diagram is option (2), in which B divides the left piece
and A divides the right piece.
Thus party A would prefer option (2) (since it would win 4 out of 5 districts)
and party B would prefer option (1) (since it would win 4 out of 5 districts). This
is not surprising since there is no opportunity to gerrymander the left piece.
We then consider the same question for the 2-split. Figure 6 shows one possible
way each party can optimize its own interests in each of the two options. Here,
party A will still prefer option (2) and party B would still prefer option (1).
Figure 6. Two options for the 2-split. The first diagram is option
(1), in which A divides the left piece and B divides the right piece.
The second diagram is option (2), in which B divides the left piece
and A divides the right piece.
For the 3-split, Figure 7 shows one possible way each party can optimize its
own interests in each of the two options. Notice that now the parties' preferences
have changed: party A now prefers (1) and party B prefers (2).
Figure 7. Two options for the 3-split. The first diagram is option
(1), in which A divides the left piece and B divides the right piece.
The second diagram is option (2), in which B divides the left piece
and A divides the right piece.
Finally, for the 4-split, Figure 8 shows one possible way each party can optimize
its own interests in each of the two options. Again, party A prefers (1) and party
B prefers (2).
Figure 8. Two options for the 4-split. The first diagram is option
(1), in which A divides the left piece and B divides the right piece.
The second diagram is option (2), in which B divides the left piece
and A divides the right piece.
We summarize the results from party A’s point of view in the following table:
With all the preferences now stated, we are ready to move to the final step of
the protocol.
Resolution Step. Since the two parties prefer different options in each of the
four splits, we find the point at which party A’s preference switches; here it occurs
between the 2-split and the 3-split. Thus i0 = 2 and we randomly choose between
the four prescriptions corresponding to the options listed in the second and third
rows of Table 1.
i. option (1) for the 2-split with the result: party A wins 2 districts, party
B wins 3.
ii. option (2) for the 2-split with the result: party A wins 3 districts, party
B wins 2.
iii. option (1) for the 3-split with the result: party A wins 3 districts, party
B wins 2.
iv. option (2) for the 3-split with the result: party A wins 2 districts, party
B wins 3.
Notice that these results have party A winning either 40% or 60% of the dis-
tricts, the two closest achievable percentages to both the percentage of votes for
party A (50.12%) and the percentage of districts given by the absolute geometric
target (2.5/5 = 50%). This is the result that the protocol is designed to produce;
it is argued in [6] that a rigorous result establishing a good choice property of the
protocol combined with the way i0 is chosen will result in this kind of behavior for
most choices of split sequences. We discuss this in the next section.
Augmentation Step. For our example, we run the same protocol for the follow-
ing four additional split sequences:
These have the following outcomes from party A’s perspective, listed in Table 2.
Unlike the first split sequence from Figure 4 that we explored in detail, the
result of each of these split sequences is that the parties will be indifferent to one
split. Here are possible maps from the results of the protocol for each of these split
sequences:
[Figures: one possible resulting map for each of the four additional split sequences.]
In the ranking protocol, both parties would rank the 5 different outcomes from
best to worst. Depending on the random choice of prescription for the first split
sequence (Figure 4), party A wins 3 districts in 3 or 4 of the outcomes (Horizontal
Split sequence, both Diagonal Split sequences, and possibly the first split sequence)
and wins 2 districts in 1 or 2 of them (Vertical Split sequence, and possibly the
first split sequence). Since the ranking protocol selects an outcome that is among
the top three for both parties, in this case the final outcome will be party A
winning 3 districts and party B winning 2. We remark that this resolution is as
close as one can get to both the absolute geometric target (2.5) and the proportion
of constituent voters (50.12%).
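As a rough illustration of the ranking rule used in the Augmentation step (choose the map whose worse rank, across both parties, is best), here is a minimal Python sketch. The function name `choose_by_ranking` and the map labels are hypothetical, and the sketch assumes each party supplies a strict ranking with 1 as its best position; how ties within a party's own ranking are handled is not specified in this section.

```python
import random

def choose_by_ranking(rank_a, rank_b, rng=random):
    """Pick the map whose worse rank (over both parties) is as good as possible.

    rank_a[m] and rank_b[m] give the position (1 = best) that each party
    assigns to map m. Ties for the best worst-rank are broken at random.
    """
    worst = {m: max(rank_a[m], rank_b[m]) for m in rank_a}
    best_worst = min(worst.values())
    candidates = sorted(m for m, w in worst.items() if w == best_worst)
    return rng.choice(candidates)

# Hypothetical rankings of five maps by two parties with opposed interests.
rank_a = {"m1": 1, "m2": 2, "m3": 3, "m4": 4, "m5": 5}
rank_b = {"m1": 5, "m2": 4, "m3": 3, "m4": 2, "m5": 1}
chosen = choose_by_ranking(rank_a, rank_b)
```

With five maps and exactly opposed rankings, the only map that is in the top three for both parties is the one each ranks third, which is what the sketch selects.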
6.4. Fairness qualities of the protocol. Having explained the protocol we
turn to a discussion of why the protocol is fair. We will analyze the protocol from
the point of view of party A; the identical analysis can be made for party B. We
address the following two questions:
• If the map is created from a choice party A preferred or was indifferent
to, will it be fair for party A?
• What if the randomness in the algorithm results in a choice party A did
not prefer?
The first question is answered in the affirmative by establishing that the proto-
col has the good choice property [6]. Approximately, the good choice property says
that if party A is using a voting model V and has an additive rating system R, there
will be a choice for party A that achieves an outcome that is at least as good as a
number close to the geometric target for V and R (see Section 4 for definitions).
Precisely, for a given k-split define the party’s k-split geometric target for V and R
to be the average rating of the best and worst outcomes over all viable divisions
that include the dividing line of the k-split. Then the good choice property is:
Theorem 6.1 (Good Choice Property [6]). For any voting model V and rating
system R, one of the choices given in the protocol achieves an outcome that is at
least as good as the party’s k-split geometric target for V and R.
For our example above party A has been acting according to its interests with
V = Vout and R = Rwin . The good choice property follows from the following
observation that is perhaps best seen pictorially (see Figure 17): given a particular
k-split, the average of the number of districts won by party A under options (1)
and (2) is equal to the average of the number of districts that party A wins if it
had complete control (which would result in the best outcome for party A) and if
it had no control (which would result in an outcome no worse than the worst for
party A).
Thus at least one of the two options is better than the average outcome between
the best and worst scenario for party A, which is precisely the definition of the k-
split geometric target when V = Vout and R = Rwin . It should be clear that the
same argument holds regardless of choice of V and additive rating system R.
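The averaging observation can be written out in a few lines. The symbols $a_1, a_2, b_1, b_2$ below are hypothetical notation introduced only for this sketch, and the sketch assumes (as the additive rating system suggests) that the outcome in each piece depends only on who draws that piece.

```latex
% Hypothetical notation (not the authors'): for a fixed k-split under R_win,
%   a_1, a_2 = districts A wins in piece 1, piece 2 when A draws that piece,
%   b_1, b_2 = districts A wins in piece 1, piece 2 when B draws that piece.
\begin{align*}
\text{option (1)} &= a_1 + b_2, \qquad \text{option (2)} = b_1 + a_2, \\
\tfrac{1}{2}\bigl[(a_1 + b_2) + (b_1 + a_2)\bigr]
  &= \tfrac{1}{2}\bigl[\underbrace{(a_1 + a_2)}_{\text{A draws both pieces}}
   + \underbrace{(b_1 + b_2)}_{\text{B draws both pieces}}\bigr], \\
\max\{a_1 + b_2,\; b_1 + a_2\}
  &\ge \tfrac{1}{2}\bigl[(a_1 + a_2) + (b_1 + b_2)\bigr].
\end{align*}
```

The last line holds because the larger of two numbers is at least their average, so at least one option reaches the midpoint between the full-control and no-control outcomes.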
The astute reader will notice that the k-split geometric target for V and R can
differ from the geometric target for V and R; in other words, insisting that the
division includes the boundary given by the k-split can penalize one party. Two
observations suggest that, should this happen, the penalty will not be large. First,
the choice of split is made by an independent (neutral) third party and thus should
be no more biased against one party than a random choice. Second, in the case
where there are no geometric constraints (see [6]), the absolute geometric target
and the k-split geometric target for Vout and Rwin can differ by at most $\frac{1}{2}$. In
our example, we see this difference between the k-split geometric target and the
absolute geometric target: for party A, in the 3-split in the Vertical Split sequence, and
for party B, in the 3-splits of the final three split sequences of the protocol. In
each of these cases, however, the difference from the absolute geometric target is
as small as it could be: $\frac{1}{2}$. It is reasonable to assume that most splits will either
not particularly favor either party, or favor a party by a small amount. It is then
the augmentation step of the protocol that ensures that a rare “bad” split for a
particular party will not come into play (since the affected party would put such a
split towards the bottom of their rankings).
We see, therefore, that the good choice property, when coupled with the augmented
protocol, implies that party A should be satisfied if the division is created
by an option that it chose.
We now turn to the second question: how party A will fare if the randomness
in the algorithm results in a choice they did not prefer. The randomness is
implemented only if, for each i-split, the two parties have opposite preferences (for
instance, in the first split sequence described in Figure 4). We suppose the random
prescription in the Resolution Step (see Section 6.2) is one not preferred by party
A, for instance prescription (i.) in the Resolution step (i.e. option (1) for i = i0).
In our example, this would correspond to i0 = 2 in the first split sequence. Party
A, however, preferred option (2): to divide piece 2 and have party B divide piece
1. Notice, however, that party A would prefer to divide up piece 1 in the (i0 + 1)-split,
and this piece 1 differs from piece 1 of the i0-split only by a small region
with a population equal to the size of a single district. (Similarly, piece 2 in the i0-
and (i0 + 1)-splits differ only by this same small region.) Because party A prefers
option (2) for the i0-split and option (1) for the (i0 + 1)-split (and because piece 1 of
these two splits do not differ by very much), it is reasonable to expect that party
A’s preference for option (2) over option (1) for the i0-split is mild. (In our example,
this is indeed the case, as party A achieves an outcome of winning 2 districts,
which is of minimal negative deviation from the absolute geometric target of 2.5.)
If indeed this is the case, then party A’s discontent with the division would only
be mild, as we have shown (by the good choice property) that party A would have
been satisfied with the slightly better option of (2).
Even though the first pieces of the two splits differ by a small amount, one can
construct scenarios where that small amount makes a big difference. However, recall
that the creation of the splits was done by an independent party and therefore one
would expect this type of scenario to be rare. Again, choosing to use the augmented
protocol would ensure that this rare scenario would not occur in the division chosen.
7. Conclusion
Replacing current redistricting procedures with the protocol presented here
surely presents substantial political obstacles. It has been observed numerous times
(e.g., see [5]) that any proposed change should be structured to take effect far
enough in the future so that it could not be interpreted as a power grab by one
party. However, as noted in the last paragraph of the introduction, some of the
ideas presented here could be incorporated into current processes without requiring
a complete overhaul of the redistricting process.
In this article, we have used a detailed example to explore the redistricting pro-
tocol of [6]. We have shown how this procedure retains the usual constraints that
may be desirable to impose on a redistricting solution, while incorporating some of
the best features of a fair division procedure: multilateral evaluation, procedural
fairness, and fairness guarantees. Procedural fairness is apparent in the protocol,
the geometric targets incorporate multilateral evaluation, and the ability to ensure
outcomes near geometric targets provides the fairness guarantee. The result is a so-
lution that accounts for both parties having different interests, involves a resolution
process and an interactive protocol to elicit preferences, and provides mathematical
confidence that the outcome will be fair.
References
[1] N. Apollonio, R. I. Becker, I. Lari, F. Ricca, and B. Simeone, The Sunfish against the Octopus:
opposing compactness to gerrymandering, in Mathematics and Democracy. Recent advances
in Voting Systems and Collective Choice, Studies in Choice and Welfare (2006), 19–41.
[2] Steven J. Brams, Michael A. Jones, and Christian Klamler, N -person cake-cutting: there
may be no perfect division, Amer. Math. Monthly 120 (2013), no. 1, 35–47, DOI
10.4169/amer.math.monthly.120.01.035. MR3007365
[3] Steven J. Brams and Alan D. Taylor, Fair division, Cambridge University Press, Cambridge,
1996. From cake-cutting to dispute resolution. MR1381896 (97c:90007)
[4] S. J. Brams and A. D. Taylor, The Win-Win Solution: Guaranteeing Fair Shares to Every-
body, New York: W. W. Norton, Inc., 1999.
[5] S. Hirsch and T. E. Mann, “For Election Reform, a Heartening Defeat”, New York Times,
Nov. 11, 2005, also at http://www.nytimes.com/2005/11/11/opinion/11mann.html.
[6] Z. Landau, O. Reid, and I. Yershov, A fair division solution to the problem of redistricting,
Soc. Choice Welf. 32 (2009), no. 3, 479–492, DOI 10.1007/s00355-008-0336-6. MR2472256
(2011e:91123)
[7] H. Moulin, Fair Division and Collective Welfare, MIT Press, 2003.
[8] Office of the Clerk, U.S. House of Representatives website:
http://clerk.house.gov/members/electionInfo/elections.html
[9] Elisha Peterson and Francis Edward Su, Four-Person Envy-Free Chore Division, Math. Mag.
75 (2002), no. 2, 117–122. MR1573596
[10] J. Rawls, A Theory of Justice, Harvard University Press, 1971.
[11] Redistricting Law 2000. Denver, CO: National Conference of State Legislatures, 1999.
Available at http://www.senate.leg.state.mn.us/departments/scr/redist/red2000/red-tc.htm
[12] Jack Robertson and William Webb, Cake-cutting algorithms, A K Peters Ltd., Natick, MA,
1998. Be fair if you can. MR1643406 (99j:90025)
[13] B. I. Spector, Analytical Support to Negotiations: An Empirical Assessment, Group Decision
and Negotiation 6 (1997), 421–436.
[14] H. Steinhaus, The problem of fair division, Econometrica 16 (1948), 101–104.
[15] Francis Edward Su, Rental harmony: Sperner’s lemma in fair division, Amer. Math. Monthly
106 (1999), no. 10, 930–942, DOI 10.2307/2589747. MR1732499 (2000k:91027)
1. Introduction
Mahendra Prasad [17] recently proposed an extension of the Condorcet jury
theorem (CJT) to voting on multiple proposals, wherein each proposal has a prob-
ability of being right, and each voter believes that every proposal is either right
or wrong.1 He argues that the proposal most likely to be right will be chosen by
approval voting (AV). His paper includes both an overview of the classical politi-
cal theory literature relating to normative social choice—especially the writings of
Condorcet and Rousseau—and a survey of modern social choice theory that extends
the CJT to multiple proposals.
In this paper, we assume that there are multiple proposals on a ballot; as in
a referendum with several propositions that voters can support or oppose, more
than one proposal can be approved. Although our world is black and white—a
proposal is either right or wrong, and a voter’s judgment about it is either correct
or incorrect—we embed it in a probabilistic framework, wherein each proposal has
a probability of being right, and each voter has a probability of correctly judging its
38 STEVEN J. BRAMS AND D. MARC KILGOUR
state.2 A proposal’s state, and a voter’s judgment about it, are assumed, initially,
to be independent.
The paper proceeds as follows. In section 2, we prove that AV in expectation
chooses those proposals most likely to be right if and only if the average probability
that a voter is correct about the state of a proposal is greater than $\frac{1}{2}$ (Theorem
1).
In section 3, we assume that the probability that a voter is correct depends on a
proposal’s state—whether it is right or wrong. We then show that AV chooses those
proposals most likely to be right if and only if the sum of the average probabilities
that a voter is correct about right and wrong proposals is greater than 1 (Theorem
2), which is a refinement of Theorem 1.
In section 4, we prove a negative result: AV does not always choose the proposal
most likely to be right when the probability that a voter is correct depends on the
proposal (Theorem 3). But in section 5 we show that if the average probability
that a voter is correct, and the probability that a proposal is right, are functionally
related in a certain way, the proposals that receive the most votes are most likely
to be right (Theorem 4), echoing Theorems 1 and 2.
In section 6, we ask the following question: If all voters follow a leader who has
an above-average probability of correctly judging whether a proposal is right, is their
aggregated judgment better than when they vote independently? It turns out that
it is—in the sense that AV better distinguishes right from wrong proposals—if the
probability that proposals are right is never less than $\frac{1}{2}$ (Theorem 5). Surprisingly,
however, voters who make independent judgments may have a greater probability of
selecting the right proposals than following a leader, showing that different measures
of the “rightness” of decisions can diverge (Theorem 6).
In section 7, we discuss applications of our results to different kinds of elections,
pointing out that the deliberations of committees—including the one that debated
US options in the 1962 Cuban missile crisis (EXCOM)—probably best approximate
the use of AV. AV is also applicable to referendums with multiple propositions,
wherein voters may approve of more than one.
In section 8, we relate our results to the CJT. The CJT concerns a single
proposal and states that if (i) each voter has the same probability, greater than
$\frac{1}{2}$, of being correct, and (ii) voters’ judgments of correctness are independent, then
the probability that a majority of voters is correct approaches 1 as the number of
jurors approaches infinity.3
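The convergence claimed by the CJT is easy to explore numerically. The following Python sketch computes the probability that a strict majority of n independent voters is correct when each is correct with probability q; the function name `majority_correct` and the sample values are illustrative assumptions, and n is taken to be odd to avoid ties.

```python
from math import comb

def majority_correct(n, q):
    """Probability that more than half of n independent voters are correct,
    each being correct with probability q (n assumed odd)."""
    need = n // 2 + 1  # smallest number of correct voters forming a majority
    return sum(comb(n, k) * q**k * (1 - q)**(n - k) for k in range(need, n + 1))

# With q = 0.6, three voters already do better than a single voter
# (about 0.648 versus 0.6), and the probability keeps rising with n.
values = [majority_correct(n, 0.6) for n in (1, 3, 11, 101)]
```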
Unlike the CJT, Theorems 1-6 do not posit a quota, such as a simple majority,
but instead answer the question of which, among multiple proposals, are most likely
to be right. We show under what conditions a proposal’s AV total can be interpreted
as a measure of its probability of being right. We also consider the possibility of
strategic voting.
In section 9, we summarize our results for juries that must weigh multiple
charges or counts, legislatures that must decide among multiple bills or amendments
to bills, and elections with multiple candidates. In these very different settings, the
most approved choices tend to be those with the highest probabilities of being right.
2 In contradistinction to Prasad [17], who assumes that proposals are either right or wrong
with certainty, we assume that their rightness is probabilistic, making Prasad’s deterministic
assumption a special case.
3 For background and references on the CJT, see “Condorcet Jury Theorem,” [8]; additional
These two scenarios illustrate how, in a criminal trial, a juror might be faced
with more than two choices. Our subsequent analysis applies to voting under both
scenarios, making it immaterial whether the p(i)’s sum to 1 or not.
There are two ways that voter j can decide to vote for proposal i: either (i)
proposal i is right, and voter j judges it correctly, which has probability p(i)q(j);
or (ii) proposal i is wrong, and voter j judges it incorrectly, which has probability
[1 − p(i)][1 − q(j)]. We wish to calculate the expected number of approval votes for
proposal i, AV(i), and the expected number of approval votes per voter,
$av(i) = \frac{AV(i)}{n}$.
The theorem that follows depends on the average probability that a voter is
correct about a proposal, $\tilde{q} = \frac{1}{n}\sum_{j=1}^{n} q(j)$. Note that, for now, this probability is
assumed to be the same for all proposals. As we show next, if this average is high
enough, then proposals that are more likely to be right are guaranteed to receive,
in expectation, more approval votes.
Theorem 2.1. For any two proposals, $i_1$ and $i_2$, the statement that
$av(i_1) > av(i_2)$ if and only if $p(i_1) > p(i_2)$
is true if and only if $\tilde{q} > \frac{1}{2}$.
Proof. As noted above, the probability that voter j votes for proposal i is
p(i)q(j) + [1 − p(i)][1 − q(j)]. It follows that the expected number of approval
votes by voter j for proposal i is also p(i)q(j) + [1 − p(i)][1 − q(j)]. Summing this
expectation over all voters yields the expected number of approval votes received
by proposal i:
(2.1) $AV(i) = \sum_{j=1}^{n} \{p(i)q(j) + [1 - p(i)][1 - q(j)]\}$.
Multiplying out the summand and rearranging the terms of (2.1) yields
(2.2) $AV(i) = n - \sum_{j=1}^{n} q(j) + p(i)\left[2\sum_{j=1}^{n} q(j) - n\right]$.
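As a quick numerical check that (2.1) and (2.2) agree, one can evaluate both forms for a small electorate. This is only a sketch: the function names and the list of voter competences `qs` are hypothetical values chosen for illustration.

```python
def expected_votes_direct(p, qs):
    """Equation (2.1): sum over voters of p*q + (1 - p)*(1 - q)."""
    return sum(p * q + (1 - p) * (1 - q) for q in qs)

def expected_votes_rearranged(p, qs):
    """Equation (2.2): n - sum(q) + p * (2*sum(q) - n)."""
    n, total = len(qs), sum(qs)
    return n - total + p * (2 * total - n)

qs = [0.9, 0.6, 0.4, 0.7]  # hypothetical voter competences, mean 0.65
gaps = [abs(expected_votes_direct(p, qs) - expected_votes_rearranged(p, qs))
        for p in (0.2, 0.5, 0.8)]
```

Because the coefficient of p(i) in (2.2) is 2 * sum(qs) - n, the expected vote total rises with p(i) exactly when the mean competence exceeds one half, which is the case for the sample values above.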
p(i1 ) > p(i2 ) and, as Theorem 2.1 guarantees, av(i1 ) > av(i2 ), there is some proba-
bility that an unlikely event occurs and, for example, the actual vote for i2 exceeds
the actual vote for i1 . However, by the law of large numbers, the probability of such
a reversal approaches zero as the number of voters, n, becomes large. Moreover,
this reversal probability diminishes as the gap between av(i1 ) and av(i2 ) increases,
helping to ensure that the proposal most likely to be right is chosen by AV.
3. State Dependence
Assume next that the probability that voter j is correct about a proposal is not
a constant, q(j), but, instead, depends on whether the proposal is right or wrong:
• If proposal i is right, then voter j will judge it correctly with probability
$q_r(j)$;
• If proposal i is wrong, then voter j will judge it correctly with probability
$q_w(j)$.
Define $\tilde{q}_r = \frac{1}{n}\sum_{j=1}^{n} q_r(j)$ and $\tilde{q}_w = \frac{1}{n}\sum_{j=1}^{n} q_w(j)$ to be the average probabilities
that a voter makes a correct judgment about, respectively, right and wrong pro-
posals. This assumption of state dependence produces an extension of Theorem
2.1:
Theorem 3.1. For any two proposals, $i_1$ and $i_2$, the statement that
$av(i_1) > av(i_2)$ if and only if $p(i_1) > p(i_2)$
is true if and only if $\tilde{q}_r + \tilde{q}_w > 1$.
Proof. We rewrite (2.1), replacing q(j) by $q_r(j)$ if the proposal, i, being evaluated
by voter j is right, and by $q_w(j)$ if this proposal is wrong:
$AV(i) = \sum_{j=1}^{n} \{p(i)q_r(j) + [1 - p(i)][1 - q_w(j)]\}$.
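One way to see where the condition of Theorem 3.1 comes from is to collect the terms of the sum above. The following is a sketch of that algebra, which may differ in form from the authors' own displayed equations.

```latex
\begin{align*}
AV(i) &= p(i)\sum_{j=1}^{n} q_r(j) + [1 - p(i)]\Bigl[n - \sum_{j=1}^{n} q_w(j)\Bigr] \\
      &= n\,(1 - \tilde{q}_w) + n\,p(i)\,(\tilde{q}_r + \tilde{q}_w - 1), \\
av(i) &= \frac{AV(i)}{n} = (1 - \tilde{q}_w) + p(i)\,(\tilde{q}_r + \tilde{q}_w - 1).
\end{align*}
```

Since $av(i)$ is linear in $p(i)$ with slope $\tilde{q}_r + \tilde{q}_w - 1$, it increases with $p(i)$ exactly when $\tilde{q}_r + \tilde{q}_w > 1$.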
4. Proposal Dependence
In section 2, we assumed that the probability that a voter correctly judges
a proposal depends only on the voter and not on the proposal. In section 3, we
assumed that this probability can depend on the proposal, but only insofar as the
true state is right (r) or wrong (w). Thereby we replaced q(j) with two probabilities,
$q_r(j)$ and $q_w(j)$, that were in general different. If, for example, voter j is better at
correctly judging proposals that are right than those that are wrong, $q_r(j) > q_w(j)$.
In this section, we assume that a voter’s ability to judge a proposal may be
different for every proposal, even those in the same state of rightness or wrongness.
Specifically, we assume that the probability that voter j’s judgment is correct about
proposal i is q(i, j), a function of i as well as j.
Let $Q(i) = \sum_{j=1}^{n} q(i, j)$. Then $q(i) = \frac{1}{n} Q(i)$ is the average probability that a
voter is correct about proposal i. Now—contra Theorems 2.1 and 3.1—we obtain
a negative result about the correctness of the voters’ judgments about competing
proposals.
Theorem 4.1. If voters’ probabilities of correct judgment are proposal dependent,
then, even if $q(i) > \frac{1}{2}$ for all values of i, it is possible for two proposals, $i_1$
and $i_2$, to satisfy $av(i_1) < av(i_2)$ and $p(i_1) > p(i_2)$.
Proof. In (2.1), we replace q(j) with q(i, j) to obtain
$AV(i) = \sum_{j=1}^{n} \{p(i)q(i, j) + [1 - p(i)][1 - q(i, j)]\}$.
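A concrete pair of proposals shows how the reversal in Theorem 4.1 can arise. The numbers below are hypothetical and are not the authors' example; the sketch uses the fact that averaging the sum above over voters gives av(i) = p(i)q(i) + [1 − p(i)][1 − q(i)], because the summand is linear in q(i, j).

```python
def av(p, q_avg):
    """Expected approval votes per voter for a proposal that is right with
    probability p, when the average voter judges it correctly with
    probability q_avg."""
    return p * q_avg + (1 - p) * (1 - q_avg)

# Hypothetical values: proposal 1 is more likely to be right, but voters
# judge proposal 2 far more accurately. Both averages exceed one half.
p1, q1 = 0.60, 0.55
p2, q2 = 0.55, 0.90

av1 = av(p1, q1)  # about 0.51
av2 = av(p2, q2)  # about 0.54
```

Here proposal 1 has the higher probability of being right, yet proposal 2 receives more approval votes in expectation.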
4 While differentiation of (2.3) and (3.2) with respect to p(i) would cause these terms to
vanish, this is not the case for q(i) in (4.2) unless it is known that there is no functional relation
between q(i) and p(i).
WHEN DOES APPROVAL VOTING MAKE THE “RIGHT CHOICES”? 43
Of course, if the average probability that a voter is correct does not depend
on the proposal, that is, if q(i) = q for all i, then Theorem 2.1 applies. When this
is the case, as we showed in section 2, there is a positive association between the
approval votes for a proposal and the probability that it is right. Are there other
kinds of dependence in which this positive association holds?
5 Because p(i) and q(i) are assumed equal and, therefore, do not depend on i (their values
with the same value of p also have the same value of q. Moreover, a small change in the value of
p must correspond to a small change in the value of q.
It is easy to link Theorem 5.1 with Theorem 2.1: When q(p) = q is constant,
$f(p) = (2p - 1)(2q - 1)$ is an increasing function of p if and only if $q > \frac{1}{2}$, in which case
the subinterval is [0, 1]. In the case of q(p) = p discussed earlier, $f(p) = (2p - 1)^2$,
and it is easy to verify that the conditions of Theorem 5.1 are satisfied for the
subinterval $[\frac{1}{2}, 1]$, that is, if and only if $p \ge \frac{1}{2}$. A parallel example is q(p) = 1 − p,
in which case $f(p) = -(2p - 1)^2$, so the subinterval is $[0, \frac{1}{2}]$.
Many other examples could be constructed. The most realistic, we think, are
those in which q(p) is monotonically increasing in p, but not necessarily linearly. For
example, q(p) may increase slowly near $p = \frac{1}{2}$, but then rapidly as p approaches 1, if
the proposals most likely to be right are much more likely to be judged correctly. To
aid in the construction of such examples, we note that Theorem 5.1 is equivalent to
the conditions that $\frac{dq}{dp} > \frac{1-2q}{2p-1}$ whenever $p > \frac{1}{2}$; $q > \frac{1}{2}$ when $p = \frac{1}{2}$; and
$\frac{dq}{dp} < \frac{1-2q}{2p-1}$ whenever $p < \frac{1}{2}$.
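These three conditions can be recovered by differentiating $f(p) = (2p - 1)(2q(p) - 1)$. The sketch below assumes that the condition of Theorem 5.1 is that $f$ be increasing on the subinterval, as the examples above suggest, and reads "increasing" as $f'(p) > 0$.

```latex
\begin{align*}
f'(p) &= 2\,\bigl(2q(p) - 1\bigr) + 2\,(2p - 1)\,\frac{dq}{dp}, \\
f'(p) > 0 &\iff (2p - 1)\,\frac{dq}{dp} > 1 - 2q.
\end{align*}
% Dividing by (2p - 1) preserves the inequality when p > 1/2 and reverses it
% when p < 1/2; at p = 1/2 the derivative term vanishes, leaving q > 1/2.
```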
To summarize, when voters’ probabilities of being correct depend on the pro-
posal being considered, AV does not necessarily single out the proposals most likely
to be right (Theorem 4.1). However, if the average voter’s probability of being cor-
rect is a differentiable function of the probability of the proposal’s being right, then
Theorem 5.1 provides a condition on this function that ensures that the expected
number of approval votes of a proposal reflects the probability that that proposal
is right.
6. Follow-the-Leader
For convenience, we henceforth assume that a voter’s judgment is equally
good—on any proposal, whether it is right or wrong—rendering Theorem 2.1 ap-
plicable. In expectation, therefore, the proposal that receives the most approval
votes is the one with the greatest probability of being right if and only if $q > \tfrac{1}{2}$.
We next ask whether voters might improve the chance that a proposal most
likely to be right is selected if they all follow the advice of some leader, j = L. We
denote by avL (i) the average number of approval votes received by proposal i when
all voters follow L.
One might expect that follow-the-leader would be an especially good strategy
for selecting the proposal most likely to be right when q(L) > q, or L has an
above-average probability of judging proposals correctly. The next theorem shows
that this is indeed true—follow-the-leader surpasses the independent judgments of
voters in distinguishing candidates with the greatest probabilities of being right,
based on their AV totals. However, this result is complicated by an issue that we
will discuss shortly.
Theorem 6.1. If $q(L) > \tfrac{1}{2}$, then for two proposals $i_1$ and $i_2$, $av_L(i_1) > av_L(i_2)$
if and only if $p(i_1) > p(i_2)$. Moreover, $av_L(i_1) - av_L(i_2) > av(i_1) - av(i_2)$ if and
only if q(L) > q.
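To illustrate Theorem 6.1, the sketch below computes expected approval counts under the assumption that a voter with judgment probability q approves a proposal with probability pq + (1 − p)(1 − q), that is, when the proposal is right and judged correctly or wrong and judged incorrectly. The function names are hypothetical; the numbers (p = 0.9 and 0.8, a leader with q(L) = 0.8, a follower with q = 0.6) are those of the two-voter example used in the proof of Theorem 6.2.

```python
def approval_prob(p, q):
    """Probability that a voter with judgment probability q approves a
    proposal that is right with probability p (assumed model)."""
    return p * q + (1 - p) * (1 - q)


def av_independent(p, qs):
    """Expected approval votes when each voter j judges with own q(j)."""
    return sum(approval_prob(p, q) for q in qs)


def av_leader(p, q_leader, n):
    """Expected approval votes when all n voters copy the leader."""
    return n * approval_prob(p, q_leader)


qs = [0.8, 0.6]          # leader L and follower F
p1, p2 = 0.9, 0.8        # probabilities that proposals 1 and 2 are right

gap_leader = av_leader(p1, 0.8, len(qs)) - av_leader(p2, 0.8, len(qs))
gap_independent = av_independent(p1, qs) - av_independent(p2, qs)
print(round(gap_leader, 3), round(gap_independent, 3))
```

Under this assumed model each gap equals n(2q − 1)(p(i1) − p(i2)), with q replaced by q(L) or by the average q, which is why the gap is wider under follow-the-leader exactly when q(L) exceeds the average.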
WHEN DOES APPROVAL VOTING MAKE THE “RIGHT CHOICES”? 45
the probability that independent judgments favor proposal 1 is 0.331. Hence, the
probability that follow-the-leader gives the correct decision is actually less than
that under independent judgments.
How can this be? Using the foregoing example, we formulate this result next.
Theorem 6.2. The proposal that is most likely to be right can be chosen less
frequently under follow-the-leader than under independent judgments.
Proof. Under follow-the-leader, proposal 1 beats proposal 2 if and only if L
approves of proposal 1 and does not approve of proposal 2, a judgment we code as
(1, 0). In fact, if L approves of proposal 1 and disapproves of proposal 2, so will F;
hence, proposal 1 will defeat proposal 2 by 2-0.
The leader, L, approves of proposal 1 when
(1) L judges proposal 1 correctly (with probability 0.8), and proposal 1 is
right (with probability 0.9), which has a joint probability of 0.72.
(2) L judges proposal 1 incorrectly (with probability 0.2), and proposal 1 is
wrong (with probability 0.1), which has a joint probability of 0.02.
These probabilities sum to 0.74. Similarly, L disapproves of proposal 2 when
(1) L judges proposal 2 correctly (with probability 0.8), and proposal 2 is
wrong (with probability 0.2), which has a joint probability of 0.16.
(2) L judges proposal 2 incorrectly (with probability 0.2), and proposal 2 is
right (with probability 0.8), which has a joint probability of 0.16.
These probabilities sum to 0.32.
It follows that the probability that L’s judgment is (1,0) is the product, 0.74 ×
0.32 ≈ 0.237. Of course, in the follow-the-leader model, F follows L, so proposal 1
defeats proposal 2 under follow-the-leader with probability 0.237. By comparison,
L’s judgment is (1,1) with probability 0.503, (0,1) with probability 0.177, and (0,0)
with probability 0.083. In the latter three cases, of course, proposal 1 does not
defeat proposal 2 under follow-the-leader.
These possibilities exhaust the approval/disapproval choices of L, so their prob-
abilities necessarily sum to 1. Note that the most likely event is (1,1), occurring
more than half the time (0.503), in which L (and F) approve of both proposals and
thereby create a tie between them.
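The four probabilities for L's judgment can be recomputed in a few lines. This is a minimal sketch using the numbers from the proof (q(L) = 0.8, p = 0.9 and 0.8); the function and constant names are illustrative assumptions, and, as in the proof, L's judgments on the two proposals are multiplied as independent events.

```python
from itertools import product

Q_LEADER = 0.8           # L's probability of judging a proposal correctly
P_RIGHT = (0.9, 0.8)     # probabilities that proposals 1 and 2 are right


def approve_prob(p, q):
    """P(approve) = P(right, judged correctly) + P(wrong, judged incorrectly)."""
    return p * q + (1 - p) * (1 - q)


def leader_distribution(q=Q_LEADER, ps=P_RIGHT):
    """Probability of each judgment (a1, a2) of L, with 1 = approve."""
    a1, a2 = (approve_prob(p, q) for p in ps)
    return {
        (x, y): (a1 if x else 1 - a1) * (a2 if y else 1 - a2)
        for x, y in product((1, 0), repeat=2)
    }


dist = leader_distribution()
for judgment, prob in dist.items():
    print(judgment, round(prob, 3))
```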
We next show, suppressing some details, that independent judgments give a
higher probability that proposal 1 will defeat proposal 2 than does follow-the-leader,
even though L is more likely to be right than F (0.8 vs. 0.6) and is therefore a
better-than-average judge. Proposal 1 can defeat proposal 2 in three different ways,
shown in the first column of the following table:
The leader (L) and follower (F) columns code the judgments of L and F that
give rise to the vote combinations to the left. For example, (2,1) occurs if both L
and F approve of proposal 1 but only L approves of proposal 2. We have calculated
the probabilities, in a manner analogous to that for L in the case of (1,0) under
follow-the-leader, in the fourth column for independent judgments. They sum to
0.331, which is 40 percent higher than the probability of 0.237 that we found for
follow-the-leader.7
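The 0.331 figure can be reproduced by enumerating all 16 approve/disapprove combinations of the two voters on the two proposals. The sketch below assumes that each of the four decisions is treated as an independent event with its marginal approval probability (the same way L's two judgments are multiplied under follow-the-leader); this independence assumption and the names used are mine, chosen because they match the reported totals.

```python
from itertools import product

P_RIGHT = (0.9, 0.8)          # proposals 1 and 2
Q_VOTERS = (0.8, 0.6)         # leader L, follower F


def approve_prob(p, q):
    """P(approve) = P(right, judged correctly) + P(wrong, judged incorrectly)."""
    return p * q + (1 - p) * (1 - q)


def outcome_probs(q_voters=Q_VOTERS, ps=P_RIGHT):
    """Return (P(proposal 1 wins), P(tie)) when L and F judge independently.

    Assumption: the four approve/disapprove decisions (two voters, two
    proposals) are treated as independent events.
    """
    # Order of decisions: L on 1, F on 1, L on 2, F on 2.
    probs = [approve_prob(p, q) for p in ps for q in q_voters]
    win = tie = 0.0
    for votes in product((0, 1), repeat=4):
        weight = 1.0
        for vote, a in zip(votes, probs):
            weight *= a if vote else 1 - a
        total1, total2 = votes[0] + votes[1], votes[2] + votes[3]
        if total1 > total2:
            win += weight
        elif total1 == total2:
            tie += weight
    return win, tie


win, tie = outcome_probs()
print(round(win, 3), round(tie, 2))
```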
7 The divergence between the probability that a proposal is right and its expected approval
vote is mirrored in other realms. For example, the probability of a net gain from a sequence of
bets, on the one hand, and the expected amount from that gain on the other, may be at odds.
Davis ([9], pp. 23-29) gives the example of a seemingly profitable bet in which you win twice
Because in our example there are only two proposals and two voters—each
with a high probability of correctly judging the proposals—there is a substantial
probability of ties: 59 percent with follow-the-leader, 40 percent with independent
judgments. Although we focused on the situation in which the proposal more likely
to be right (proposal 1) receives strictly more support than the other proposal
(proposal 2)—making proposal 1 the “winner”—in situations such as referendums
with multiple propositions on the ballot, more than one proposition might win (e.g.,
with majority approval).
If the number of voters is large, ties become highly unlikely with indepen-
dent judgments; with follow-the-leader, of course, they never occur, because all the
voters vote alike, regardless of how many there are. While follow-the-leader and
independent judgments will give a similar, if not the same, ranking of proposals,
there is an important difference in how they determine rankings.
In comparing two proposals, follow-the-leader makes an error (judging a right
proposal to be incorrect, judging a wrong proposal to be correct) whenever the
leader makes an error; this error rate does not decrease as the number of voters
increases. On the other hand, the error rate does decrease as the number of voters
increases in the case of independent judgments, which in general yield a number of
approval votes very close to the expected number if the number of voters is large.
As long as there is some difference in the expected number of approval votes that
different proposals receive, a large enough electorate will reliably distinguish better
from worse proposals under both follow-the-leader and independent judgments.
Whereas follow-the-leader is superior at drawing this distinction, its dependence
on all-or-nothing votes gives it a higher error probability in selecting the right
proposal(s). As with the Condorcet Jury Theorem (CJT)—to which we will later
compare our results—having a large number of voters tends to produce the right
outcome, however votes are aggregated or whatever the decision rule is for choosing
a winner.
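The contrast in error rates can be sketched for a single proposal. The code below assumes, for illustration only, that every independent voter judges correctly with the same probability q, that n is odd, and that the group decision follows the majority; under follow-the-leader the group errs exactly when the leader does. These simplifications and the function names are not part of the original model.

```python
from math import comb


def majority_error(n, q):
    """P(majority of n independent voters is wrong), n odd, common q."""
    need = n // 2 + 1  # votes needed for a correct majority
    p_correct = sum(comb(n, k) * q**k * (1 - q) ** (n - k)
                    for k in range(need, n + 1))
    return 1 - p_correct


def leader_error(q_leader):
    """Under follow-the-leader the group errs exactly when the leader errs."""
    return 1 - q_leader


for n in (1, 11, 101):
    print(n, round(majority_error(n, 0.6), 4), round(leader_error(0.8), 4))
```

With these sample values, a single independent voter is less reliable than the leader, but a sufficiently large group of such voters is more reliable, while the leader's error rate stays fixed regardless of n.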
your bet with probability 1/2 and lose just your bet with probability 1/2. If you start with $100 and
always bet 75 percent of your capital, after only two bets your expected winnings are $189.06,
but 75 percent of the time you end up with less than $100. To maximize the rate of growth
of your capital, it turns out, you should bet a fraction chosen not to maximize your expected
winnings, but rather their logarithm, which is known as Kelly betting. This is a so-called Markov
betting system, because the outcome depends only on your present bankroll and your probability
of winning ([10], pp. 60-62).
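The arithmetic of this betting example can be replayed directly. The sketch below assumes that a win adds twice the stake to the capital (so a 75 percent stake multiplies capital by 2.5) and a loss forfeits the stake (multiplying capital by 0.25); the function names are illustrative. It also evaluates the expected log growth on a grid of betting fractions, which is the quantity a Kelly bettor would maximize.

```python
from itertools import product
from math import log

START, WIN_MULT, P_WIN = 100.0, 2.0, 0.5   # a win pays twice the stake


def outcomes(fraction, rounds=2):
    """List (probability, final capital) over all win/loss sequences."""
    results = []
    for seq in product((True, False), repeat=rounds):
        capital, prob = START, 1.0
        for won in seq:
            stake = fraction * capital
            capital += WIN_MULT * stake if won else -stake
            prob *= P_WIN if won else 1 - P_WIN
        results.append((prob, capital))
    return results


def expected_log_growth(fraction):
    """Expected log of the per-bet capital multiplier."""
    return (P_WIN * log(1 + WIN_MULT * fraction)
            + (1 - P_WIN) * log(1 - fraction))


res = outcomes(0.75)
expected = sum(p * c for p, c in res)
below_start = sum(p for p, c in res if c < START)
best = max((k / 100 for k in range(100)), key=expected_log_growth)
print(round(expected, 2), below_start, best)
```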
7. Applications to Politics
At the beginning of their deliberations in a case, jurors are often divided
on a verdict. But, typically, they move toward a decision—unanimous or near-
unanimous, as specified by the rules that govern the case—making hung juries
relatively rare. On average, about 10 percent of all cases result in hung juries [13].
During their deliberations, jurors will often be persuaded by the juror who offers
the most persuasive arguments, who we assume has an above-average probability
of being correct. Assume that this juror is L, and for definiteness assume that
q(L) = max{q(j) : j = 1, 2, . . . , n}—that is, q(L) is the maximum value of q(j).
If all jurors follow L, we showed in section 6 that, provided $q(L) > q > \tfrac{1}{2}$, the
proposals with the greatest p’s—the ones most likely to be right—will in expectation
garner the most approval votes. Nonetheless, the probability that a proposal with
a greater value of p is selected may actually increase if the jurors exercise their own
independent judgments.
This argument against following the lead of L is made by opponents of “group-
think” [14], in which independent thinking is suppressed in favor of achieving a
group consensus, often leading to poor decisions. On the other hand, if we assume
that the average juror is genuinely persuaded by L, where $q(L) > q > \tfrac{1}{2}$, independent
thinking will not be suppressed but, instead, be replaced by the superior
thinking of L, based on the more persuasive arguments L offers compared with
those offered by other jurors.
Of course, if L’s arguments persuade jurors to support proposals that are more
likely to be wrong than right (i.e., $p(i) < \tfrac{1}{2}$), then follow-the-leader will have a
perverse effect. But this will not be true if $q(L) > q > \tfrac{1}{2}$, in which case follow-the-leader
will draw a sharper distinction than independent judgments between better
and worse proposals, though independent judgments may maximize the probability
that the better proposal will be chosen.
Our model is applicable to groups other than juries. As a case in point, consider
the deliberations of EXCOM, the executive committee of high-level government and
other officials who debated options that the United States might choose during the
Cuban missile crisis of October 1962 ([2], pp. 226-240, and references therein).
Although EXCOM members at the outset leaned toward an air strike against the
Soviet missiles in Cuba, most of its members were persuaded in the end to rec-
ommend to President John Kennedy the less aggressive action of a naval blockade
(called, euphemistically, a “quarantine” at the time) and only consider more ag-
gressive action if the blockade failed to induce the Soviets to withdraw their missiles
from Cuba.
In the deliberations of EXCOM, Robert Kennedy, the attorney general and
brother of President Kennedy, seems to have fulfilled the role of L. He warned that
an air strike would be seen as “a Pearl Harbor in reverse, and it would blacken the
name of the United States in the pages of history” ([18], p. 684). To be sure, the
fact that Robert Kennedy and his supporters were successful in persuading other
EXCOM members to support a blockade cannot be taken as conclusive evidence
that follow-the-leader will always succeed, but it does illustrate one instance in
which persuasion seems to have abated a major political-military crisis, leading to
its peaceful resolution.
In democracies, political parties and their candidates put forward proposals
to solve problems and advance their positions; suppose that associated with each
proposal is a probability of its being right, or at least providing some remedy. Are
the proposals selected (including the status quo) the ones most likely to be right?
Our model is inapplicable to legislatures and other voting bodies wherein pro-
posals come up one at a time and then are voted up or down. Because voting is
sequential in these bodies, voters cannot approve or disapprove, simultaneously, of
multiple proposals. In such settings, the ordering of proposals (e.g., amendments to
a bill) that are voted on can, for strategic reasons, critically affect the support they
receive, so their votes do not provide an accurate gauge of their degree of sincere
support.
In elections in which there are multiple candidates on a ballot, usually a choice
of only one candidate is possible. Even if the voter is permitted to rank the can-
didates, this ranking does not say where the voter would draw the line between
approved and disapproved candidates, though systems have been proposed that
would allow this ([1], ch. 3; [7]). Besides juries that consider multiple charges,
or committees like EXCOM that deliberate over multiple strategies, referendums
with multiple propositions on the ballot come closest to fitting the AV model. The
propositions can be considered proposals, and voters can approve of more than one.
Usually a simple majority determines which propositions pass. If, however, two
or more propositions contradict each other, and each gets a majority of votes—as
can happen—the usual rule is that the proposition with the most votes is enacted.
Because this is the proposition most likely to be right according to our model, this
rule is consistent with passage of those (noncontradictory) propositions most likely
to be right.8
In both jury/committee settings and referendums, voters typically follow differ-
ent leaders, who may espouse different positions. The question our analysis raises is
whether the leader who persuades the most voters to approve of his or her favored
proposal helps the proposal most likely to be right.
Because there is not usually a single L but, instead, multiple leaders who take
different positions on proposals, one must be careful how to define “right.” Pre-
viously, we defined p(i) to be the probability that proposal i is right (or good or
just).
But suppose that there are two leaders, one of whom supports proposal i and
the other of whom opposes it. Assume that all voters support the positions of one
of the two leaders. Then if we interpret p(i) to be the probability that the supporter
of proposal i is right, and 1 − p(i) to be the probability the opponent is right, then
AV will choose the proposal with the higher probability of being right.
While this interpretation of our model certainly applies to multiple propositions
in a referendum9 —in which one can approve or not approve of each—how does it
apply to elections with multiple candidates? We suggest that a useful way to
think about candidates who take positions on multiple proposals is as composites
of positions. Under AV, the voter who approves of one or more candidates is saying,
8 For analyses of different ways of aggregating votes in referendums with multiple propositions,
quorum, abstention has no effect, but if a minimum percentage (e.g., 50) of the electorate must
participate to allow for the passage of a proposition, then if this minimum is not achieved, it
seems reasonable to interpret nonenactment as the right choice, even if the proposition receives
majority support of those who participated. This is because the failure to achieve a quorum can
be deemed as insufficient support to make a choice binding on the electorate.
in effect, that he or she approves of their composite positions—at least more so than
the composite positions of other candidates that fail to receive his or her approval.
In this interpretation, the p(i)’s are associated with each candidate i, who
represents a composite of positions on what we earlier called proposals—the issues
of the day in an election. But are the candidates who receive the most approval
the ones whose composites of positions are the ones most likely to be right?
In the context of elections, “appealing” might be a better word to use than
“right,” because there is usually no right or wrong position, or composite of posi-
tions, as such (unlike the guilt or innocence of a defendant in a criminal trial). But
if we associate the appeal or popularity of a candidate with his or her being the
right choice, then AV will make the right choice in elections.
To be sure, the “people’s choice” in such elections is not what many political
philosophers, at least since Plato, would consider the right choice. But if the popular
will—even if it does not always mirror the ideal of Rousseau’s general will—is the
cornerstone of democracy, then it is appropriate to consider it synonymous with
the right choice in elections.10
but with the qualifying phrase that all voters, even if they have conflicting interests, are “fully
informed.” But if voters can be seduced (as opposed to persuaded by logic and reason) by populist
or demagogic appeals, then it is dubious to equate appeal with rightness.
11 Nitzan [16] (pp. 205-207) shows under rather general conditions when simple-majority
rule gives a higher probability that a proposal is judged correctly than the “expert rule,” which
is follow-the-leader when the leader is the voter with the greatest probability of being correct.
12 An “extended” CJT ensures that if an average juror has a probability of being correct that
is greater than $\tfrac{1}{2}$, the probability that a jury will make the right decision approaches a value that
is a function of e, and is strictly less than 1 [12].
13 This is true without regard to the values of the p(i)’s, which may happen, for example, if
a company will surely fail if it does nothing. However, if there is some less-than-even chance of
success if it takes some risky action, then the most approved action, even if it will probably fail,
is still better than doing nothing.
vote. To select the proposal most likely to be right, a voter must be able to
identify the degree of rightness of each proposal i, as given by p(i). But we
assume that the
p(i)’s are unknown to the voters; in the absence of this knowledge, the aggregation
of plurality votes need not single out the proposal most likely to be right, even if q
is high.
To get plurality voting to choose the proposal most likely to be right, vot-
ers’ judgments about proposals would need to be conditioned on each proposal’s
rightness. But even for approval votes, as we showed in Theorem 4.1, this creates
problems. Only when the voters’ average probability of judging a proposal cor-
rectly is functionally related (in an appropriate manner) to the probability that the
proposal is right (Theorem 5.1) can the approval votes of proposals reflect their
probabilities of being right.
Our model assumes that voters respond to a signal—based on q(j) or perhaps
q(L)—that they receive on each proposal i; this proposal has a probability, p(i), of
being right. Might voters do better responding strategically rather than sincerely?
To inquire about strategy presumes that voters have preferences over outcomes,
which the q(j)’s in our model do not assume. However, if one makes this assumption
(see [11] and references therein), jurors can do better by conditioning their decisions
on their probabilities of being pivotal, which will depend on both the decision rule
and how other jurors vote. Thus, for example, if a verdict requires unanimity
in a criminal trial, then a juror will be pivotal if and only if all the other jurors
vote to convict or acquit, making his or her vote decisive either in convicting or in
acquitting the defendant.
But when there are more than a few voters, a voter’s pivotalness becomes
less meaningful as a basis for making a choice. Indeed, the voter’s probability
of being decisive becomes negligible as the number of voters becomes larger and
larger. Moreover, under AV, the question is less one of making the right choice on
a single proposal and more one of where to draw the line between acceptable and
unacceptable proposals, as analyzed in [3] and [4].
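How quickly pivotalness fades can be illustrated under a deliberately simple assumption that is not part of the present model: n voters (n odd) decide one proposal by simple majority, and each of the other n − 1 voters independently votes yes with probability r. A voter is then pivotal exactly when the others split evenly. The function name and the sample values of n and r are hypothetical.

```python
from math import exp, lgamma, log


def pivotal_prob(n, r):
    """P(the other n - 1 voters split evenly), n odd, 0 < r < 1.

    Uses log-gamma so that large n does not overflow.
    """
    m = (n - 1) // 2
    log_binom = lgamma(2 * m + 1) - 2 * lgamma(m + 1)
    return exp(log_binom + m * log(r) + m * log(1 - r))


for n in (3, 11, 101, 1001):
    print(n, pivotal_prob(n, 0.5), pivotal_prob(n, 0.6))
```

Even in the most favorable case, r = 1/2, the probability shrinks steadily with n, and it collapses much faster once r moves away from 1/2.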
In the present model, voters seem well advised to make their own best judg-
ments about proposals, either according to q(j) or by following a leader according
to q(L). To deviate from these signals, voters—or the leaders whom they follow—
would need to have information, which we do not assume, that there is at least
the potential to produce more right choices by ignoring or countermanding their
signals. Unless the strategic environment provides voters with the opportunity to
obtain this information, it seems reasonable to assume that they will be sincere.
True, if voters follow different leaders, then the strategic situation changes—a
competitive election is no longer just a search for right choices. For example, a
leader may advise a voter not to vote for a candidate for whom the voter receives
a favorable signal, lest this candidate beat a candidate preferred by the leader. On
the other hand, if multiple proposals can be approved, as in a referendum, and
supporting one proposal does not affect the choice of another, strategic voting is
not an issue.
9. Conclusions
We have shown that the most approved proposals will be those with the greatest
probability of being right if and only if the average probability that the judgment
of a voter is correct exceeds $\tfrac{1}{2}$ (Theorem 2.1). This necessary and sufficient condition
allows some voters to have probabilities of being correct that are less than $\tfrac{1}{2}$,
provided they are counterbalanced by voters who raise the average above $\tfrac{1}{2}$.
Although Theorem 2.1 and the subsequent theorems bear some similarity to
the CJT, their differences are substantial. First, except for Theorems 4.1 and
5.1, the theorems assume that proposals have probabilities of being right that are
independent of the judgments of voters. Second, there is not a single proposal but
multiple proposals, all of which may have varying degrees of rightness.
While the most approved proposals under AV will be the ones most likely to be
right in most circumstances, this may not be true under plurality voting. The reason
is that a voter, not knowing the p(i)’s, must cast his or her single vote on the basis
of his or her q(j) alone, which does not distinguish among proposals. Under AV,
however, all voters vote with some positive probability for all the proposals; except
for Theorem 4.1, our theorems ensure that the expected number of approval votes
is greater for the proposals more likely to be right.
More specifically, this is true not only if proposal probabilities and voter prob-
abilities are based on independent events but also if the probability that a voter
makes a correct judgment about a proposal depends on its state (i.e., whether it is
right or wrong, as shown in Theorem 3.1). While this is not generally true if voter
probabilities depend upon the proposal being considered (Theorem 4.1), approval
votes track the rightness of proposals if the average probability that a voter is cor-
rect, and the probability that a proposal is right, are functionally related in certain
ways (Theorem 5.1).
So does follow-the-leader, if the leader has an above-average probability of being
correct, which sometimes—but not always—may be preferable to voters’ making
independent judgments (Theorem 6.1). This may be one reason why defendants,
who think their case is strong, sometimes prefer that their case be heard by a judge
with a high q(j) rather than by a jury with a lower q. However, when the number of voters
is small, as in a committee, the independent judgments of its members may more
often lead to the right decision (Theorem 6.2), illustrating the divergence between
the probability that a collective choice is right and its expected approval vote.
AV is most applicable to situations in which there are multiple alternatives
that voters must choose among, such as criminal charges in a trial, proposals in
a committee, or candidates in an election, all of which have some probability of
being right (i.e., p(i) > 0). We have shown that AV is well suited to finding the
best—the most likely to be right, good, or just—among them, although strategic
considerations may interfere if there are multiple leaders contesting elections.
We conclude on a note of caution. Our results on selecting the proposals most
likely to be right—except for calculating the probability that a 2-person commit-
tee makes the right decision in section 6—are based on the expected approval of
these proposals, which will not always be realized in practice. Especially if the elec-
torate is small, random variability may occasionally imply that the most approved
proposals are not the ones most likely to be right. As the electorate increases
in size, however, the correctness of choices becomes more and more certain under
AV—without the need, à la the CJT, to assume that every voter is better than a
random coin toss.
References
[1] Steven J. Brams, Mathematics and democracy: Designing better voting and fair-division
procedures, Princeton University Press, Princeton, NJ, 2008. MR2382290 (2009b:91003)
[2] Steven J. Brams, Game theory and the humanities: Bridging two worlds, MIT Press, Cam-
bridge, MA, 2011. MR2789313 (2012i:91085)
[3] Brams, Steven J. and Peter C. Fishburn, “Approval Voting,” American Political Science
Review 72 (1978), 831–847.
[4] Steven J. Brams and Peter C. Fishburn, Approval voting, 2nd ed., Springer, New York, 2007.
MR2301539 (2007k:91081)
[5] Brams, Steven J., D. Marc Kilgour, and William S. Zwicker, “Voting on Referenda: The
Separability Problem and Possible Solutions,” Electoral Studies 16 (1997), 359–377.
[6] Brams, Steven J., D. Marc Kilgour, and William S. Zwicker, “The Paradox of Multiple Elec-
tions,” Social Choice and Welfare 15 (1998), 211–236.
[7] Brams, Steven J. and R. Remzi Sanver, “Voting systems that combine approval and prefer-
ence”, In Steven J. Brams, William V. Gehrlein, and Fred S. Roberts (eds.), The Mathematics
of Preference, Choice, and Order: Essays in Honor of Peter C. Fishburn, Berlin: Springer
(2009), 215-237, DOI 10.1007/978-3-540-79128-7_12. MR2648304 (2011i:91006)
[8] Wikipedia, Condorcet Jury Theorem (2011), http://en.wikipedia.org/wiki/Condorcet's_jury_theorem
[9] Davis, Morton D., The Math of Money: Making Mathematical Sense of Your Personal Fi-
nances, New York: Springer (2001).
[10] Richard A. Epstein, The theory of gambling and statistical logic, 2nd ed., Elsevier/Academic
Press, Amsterdam, 2009. MR2549459 (2011b:91083)
[11] Feddersen, Timothy and Wolfgang Pesendorfer, “Elections, information aggregation, and
strategic voting”, Proc. Natl. Acad. Sci. USA 96 (1999), no. 19, 10572–10574 (electronic),
DOI 10.1073/pnas.96.19.10572. MR1712531
[12] Grofman, Bernard and Guillermo Owen, Review Essay: Condorcet Models, Avenues for Fu-
ture Research, In Bernard Grofman and Guillermo Owen (eds.), Information Pooling and
Group Decision Making: Proceedings of the Second University of California, Irvine, Confer-
ence on Political Economy. Greenwich, CT: JAI Press (1986), 93–1922.
[13] Hannaford-Agor, Paula L., Valerie P. Hans, Nicole L. Mott, and G. Thomas Munsterman,
Are Hung Juries a Problem?, Washington, DC: National Institute of Justice (2002).
[14] Janis, Irving L., Victims of Groupthink: A Psychological Study of Foreign-Policy Decisions
and Fiascoes, Boston: Houghton-Mifflin (1972).
[15] Miller, Nicholas R., Information, Electorates, and Democracy: Some Extensions and Inter-
pretations of the Condorcet Jury Theorem, In Bernard Grofman and Guillermo Owen (eds.),
Information Pooling and Group Decision Making: Proceedings of the Second University of
California, Irvine, Conference on Political Economy. Greenwich, CT: JAI Press (1986), 173–
192.
[16] Nitzan, Schmuel, Collective Preference and Choice, Cambridge, UK: Cambridge University
Press (2010).
[17] Prasad, Mahendra, “A Multiple Alternatives Extension of Condorcet Jury Theorem Using
Approval Voting”, Department of Political Science, University of California, Berkeley, (2011).
[18] Sorensen, Theodore C., Kennedy, New York: Harper and Row (1965).
1. Introduction
Consider a group of voters who must collectively decide the truth values of
some finite set of propositions. If the propositions were logically independent, then
their truth values could be decided separately through majority vote. However,
if the propositions are logically interconnected, then majority vote will often lead
to a logically inconsistent outcome [Con85, Gui52, KS86, LP02, NP10]. This
observation has motivated two lines of inquiry: first, to understand how pervasive
and severe this problem of collective logical inconsistency can be, and second, to
find some way to aggregate the views of the voters in a logically consistent manner.
These two lines of inquiry form the subject of judgement aggregation [LP09, LP10,
Mon12].
We will refer to a system of logically interconnected propositions as an aggrega-
tion space, and an assignment of truth values to the propositions as a view.1 Each
voter has a logically consistent view; the problem is to obtain a consistent collective
view. As we have noted, the “majority view” may be inconsistent. One way to re-
solve this problem is to look for a view which is “as majoritarian as possible,” while
still respecting the constraint of logical consistency. In other words, if we cannot
56 K. NEHRING AND M. PIVATO
have a logically consistent view which agrees with the majority on all propositions,
then we seek a logically consistent view which agrees with the majority in a max-
imal subset of propositions. The set of all views with this property is called the
Condorcet set.
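A minimal sketch of this definition follows, assuming that "maximal" means maximal with respect to set inclusion and that the number of voters is odd, so that no issue is tied. The test case is the space of preference orders over three alternatives, with propositions "a over b", "b over c", and "c over a"; the function names and the three-voter profile (the classic Condorcet cycle) are illustrative assumptions.

```python
from itertools import product


def majority_view(profile):
    """Issue-by-issue strict majority over a list of 0/1 views (odd count)."""
    n = len(profile)
    return tuple(int(2 * sum(col) > n) for col in zip(*profile))


def condorcet_set(space, profile):
    """Views in `space` whose agreement with the majority is inclusion-maximal."""
    maj = majority_view(profile)

    def agreement(view):
        return frozenset(k for k, (v, m) in enumerate(zip(view, maj)) if v == m)

    agreements = {view: agreement(view) for view in space}
    return {v for v, a in agreements.items()
            if not any(a < b for b in agreements.values())}


# Propositions: "a over b", "b over c", "c over a"; cycles are inadmissible.
X = [v for v in product((0, 1), repeat=3) if v not in {(0, 0, 0), (1, 1, 1)}]
profile = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]   # a>b>c, b>c>a, c>a>b

print(majority_view(profile))                 # the majority view itself
print(sorted(condorcet_set(X, profile)))
```

In this profile the majority view is the inadmissible cycle, and the views returned are those that give up agreement with the majority on exactly one proposition.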
Another way to resolve the problem of majoritarian inconsistency is for the
voters to apply majority vote to the different propositions one at a time, but with
the recognition that earlier decisions act as logical constraints on later decisions.
This can be seen as an abstract description of how many actually existing policy
frameworks have evolved over time through a process of accretion. Such sequential
majority voting is logically consistent by construction. But depending upon the
order in which the propositions are decided, the group may converge on different
outcomes. In other words, this process may be subject to path-dependence.
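Path-dependence can be seen in a small sketch of sequential majority voting. The code assumes an odd number of voters and the following rule, which is my reading of the description above: issues are decided in a given order, and each issue takes its majority value unless no admissible view is compatible with that value and the earlier decisions, in which case it takes the opposite value. The space is that of preference orders over three alternatives (propositions "a over b", "b over c", "c over a"); names and the profile are illustrative.

```python
from itertools import permutations, product


def sequential_majority(space, profile, order):
    """Decide issues in `order`; each issue takes its majority value unless
    no admissible view extends the earlier decisions with that value."""
    n = len(profile)
    decided = {}
    for k in order:
        maj = int(2 * sum(view[k] for view in profile) > n)
        for value in (maj, 1 - maj):
            trial = {**decided, k: value}
            if any(all(v[i] == x for i, x in trial.items()) for v in space):
                decided = trial
                break
    return tuple(decided[k] for k in sorted(decided))


X = [v for v in product((0, 1), repeat=3) if v not in {(0, 0, 0), (1, 1, 1)}]
profile = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]   # a>b>c, b>c>a, c>a>b

outcomes = {order: sequential_majority(X, profile, order)
            for order in permutations(range(3))}
for order, view in outcomes.items():
    print(order, view)
```

For this profile, the proposition decided last is always the one overruled, so different orders produce different consistent outcomes.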
In two previous papers, jointly with Clemens Puppe, the authors have inves-
tigated the structure of the Condorcet set and the nature and extent of path-
dependence [NPP14, NPP13]. These two issues are closely related, because it
turns out that the Condorcet set is precisely the set of views which can be reached
through sequential majority voting. The paper [NPP14] showed that many aggre-
gation spaces exhibit a particularly thoroughgoing form of path-dependence, called
global indeterminacy; this means that the truth value of every proposition depends
on the path taken.
The present paper extends this exploration. Section 2 introduces notation and
terminology. Section 3 shows that the global indeterminacy identified in [NPP14]
is not an improbable or “fluke” occurrence; it arises naturally in aggregation spaces
which are large enough or have enough internal symmetries. Furthermore, for many
important aggregation spaces, it is in fact quite a robust phenomenon, which can
be generated even with a very small population of voters. Section 4 explores the
most extreme possible form of path-dependence, full indeterminacy. This means
that any possible view can arise as the outcome of some path. Section 5 focuses
on a particular necessary condition for such full indeterminacy. Full indeterminacy
is relatively rare. But Section 6 shows that many classes of aggregation spaces are
“almost” fully indeterminate, in the sense that they asymptotically approach full
indeterminacy as the number of propositions becomes large. All proofs are in the
Appendix.
2. Preliminaries
Let K be a finite set of propositions or issues. An element $x \in \{0, 1\}^K$ is called
a view, and interpreted as an assignment of a truth value of ‘true’ (1) or ‘false’ (0)
to each proposition in K.2 Not all views are feasible, because there will be logical
constraints between the propositions (determined by the structure of the underlying
decision problem faced by society). Let $X \subseteq \{0, 1\}^K$ be the set of ‘admissible’ or
consistent views; we call X an aggregation space.
Example 2.1. (a) (Preferences) Let A = {a, b, c} be a set of three alterna-
tives. We can represent the space of all preference orders over A with the following
aggregation space. Let K := {“a ≻ b”, “b ≻ c”, “c ≻ a”}, and let X = {(1, 1, 0),
(1, 0, 1), (0, 1, 1), (0, 0, 1), (0, 1, 0), (1, 0, 0)}. Here, for example, (1, 1, 0) is the view
that says a ≻ b and b ≻ c, but not c ≻ a; in other words, the preference order
a ≻ b ≻ c. The other five elements of X correspond to the other five preference
orders over A.
2 What we call a ‘view’ is conceptually equivalent to what other papers called a ‘judgment
(b) (Committees) Suppose we must select a committee of exactly two people,
and the candidates for this committee are Alice, Bob, Chiara, Daoud, and Elise.
Let K = {A, B, C, D, E}, and let X = {(1, 1, 0, 0, 0), (1, 0, 1, 0, 0), (1, 0, 0, 1, 0),
(1, 0, 0, 0, 1), (0, 1, 1, 0, 0), (0, 1, 0, 1, 0), (0, 1, 0, 0, 1), (0, 0, 1, 1, 0), (0, 0, 1, 0, 1),
(0, 0, 0, 1, 1)}. Here, for example, the view (1, 0, 1, 0, 0) represents the committee
consisting of Alice and Chiara.
(c) (Choice along a line) Suppose we must select some value x ∈ {0, 1, 2, 3, 4}.
Let K = {“x ≥ 1”, “x ≥ 2”, “x ≥ 3”, “x ≥ 4”}, and let X = {(0, 0, 0, 0), (1, 0, 0, 0),
(1, 1, 0, 0), (1, 1, 1, 0), (1, 1, 1, 1)}. Here, for example, the view (1, 1, 0, 0) asserts that
x ≥ 1 and x ≥ 2, but x ≱ 3 and x ≱ 4; in other words, x = 2.
(d) (Truth functions) Let p, q, and r be three logical propositions, and let
s = p & q & r. Suppose we must assign truth-values to these four propositions in
a logically consistent way. Thus, K = {p, q, r, s} and X = {(0, 0, 0, 0), (0, 0, 1, 0),
(0, 1, 0, 0), (0, 1, 1, 0), (1, 0, 0, 0), (1, 0, 1, 0), (1, 1, 0, 0), (1, 1, 1, 1)}. ♦
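As a computational aside, the four spaces of Example 2.1 can be enumerated directly. The following Python sketch is illustrative only: the names `preference_view`, `X_pref`, `X_com`, `X_line` and `X_truth` are hypothetical and do not appear in the text, and each view is encoded as a 0/1 tuple in the coordinate order used above.

```python
from itertools import combinations, permutations, product

# (a) Preferences over A = {a, b, c}; coordinates are ("a>b", "b>c", "c>a").
def preference_view(order):
    pos = {alt: i for i, alt in enumerate(order)}   # smaller index = more preferred
    pairs = [("a", "b"), ("b", "c"), ("c", "a")]
    return tuple(int(pos[p] < pos[q]) for p, q in pairs)

X_pref = {preference_view(o) for o in permutations("abc")}

# (b) Committees of exactly two out of five candidates.
X_com = {tuple(int(k in pair) for k in range(5))
         for pair in combinations(range(5), 2)}

# (c) Choice of a value x in {0,...,4}; coordinates are ("x>=1", ..., "x>=4").
X_line = {tuple(int(x >= m) for m in range(1, 5)) for x in range(5)}

# (d) Truth function s = p & q & r; coordinates are (p, q, r, s).
X_truth = {(p, q, r, p & q & r) for p, q, r in product((0, 1), repeat=3)}
```

The two cyclic triples (1, 1, 1) and (0, 0, 0) never arise in `X_pref`, which is the logical constraint (transitivity) at work.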
paper is to investigate the multiplicity of solutions in the Condorcet set of a generic profile; thus,
we exclude the ‘nongeneric’ multiplicities which arise when μk = 1/2 for some k ∈ K; i.e. we
confine our attention to profiles in Δ∗ (X ). (If the set of voters is large (respectively odd), then a
profile in Δ(X ) \ Δ∗ (X ) is highly unlikely (resp. impossible) anyway.)
58 K. NEHRING AND M. PIVATO
propositions encode the orderings between pairs of alternatives), this principle was
first advocated by Condorcet [Con85]. Thus, we will say that an element x ∈ X
is Condorcet admissible if there does not exist any y ∈ X such that M(x, μ) ⊊
M(y, μ). Let Cond (X , μ) ⊆ X be the set of Condorcet admissible elements; we
call this the Condorcet set. For instance, if X and μ are as in Example 2.2 above,
so that Maj(μ) = (1, 1, 1), then Cond (X , μ) = {(1, 1, 0), (1, 0, 1), (0, 1, 1)}, which
corresponds to the three preference orders a ≻ b ≻ c, c ≻ a ≻ b, and b ≻ c ≻ a.
For any x, y, z ∈ {0, 1}K , say that y is between x and z if, for all k ∈ K,
(xk = zk = 0) =⇒ (yk = 0) and (xk = zk = 1) =⇒ (yk = 1). Furthermore, y is said
to be properly between x and z if, in addition, x ≠ y ≠ z. (For example, (1, 1, 0) is
properly between (0, 1, 0) and (1, 1, 1).) For any x ∈ X and z ∈ {0, 1}K , write x z
if there exists no y ∈ X which is properly between x and z. (For example, if X
is the preference aggregation space from Example 2.1(a), then (1, 1, 0) (1, 1, 1),
whereas (0, 1, 0)
(1, 1, 1).) The following simple observation is Lemma 1.5 of
[NPP14].
Lemma 2.3. (a) If Maj(μ) ∈ X , then Cond (X , μ) = {Maj(μ)}.
(b) Otherwise, Cond (X , μ) = {x ∈ X ; x Maj(μ)}. In this case,
|Cond (X , μ) | ≥ 3.
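Lemma 2.3 gives a direct recipe for computing the Condorcet set from the majority view. The sketch below is a minimal illustration under that reading; the function names `between` and `condorcet_set` are hypothetical, and "properly between" is taken to mean that y differs from both endpoints.

```python
def between(y, x, z):
    """y is between x and z: y agrees with x and z wherever those two agree."""
    return all(yk == xk for yk, xk, zk in zip(y, x, z) if xk == zk)

def condorcet_set(X, maj):
    """Condorcet set computed from the majority view `maj`, following Lemma 2.3."""
    if maj in X:                                   # case (a)
        return {maj}
    return {x for x in X                           # case (b)
            if not any(y != x and y != maj and between(y, x, maj) for y in X)}

# Preference space of Example 2.1(a) with majority view (1, 1, 1).
X_pref = {(1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 0, 1), (0, 1, 0), (1, 0, 0)}
cond = condorcet_set(X_pref, (1, 1, 1))
```

For this input, `cond` consists of the three views named in the text, and the view (0, 1, 0) is excluded because (1, 1, 0) lies properly between it and (1, 1, 1).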
For many interesting aggregation spaces, the Condorcet set will be multivalued
for some profiles. To make this precise, we must introduce some notation. Let
J ⊆ K and consider an element w ∈ {0, 1}J , which corresponds to a subset of
judgements on the issues in J . The set J is the support of w, denoted supp (w).
We define |w| := |J | to be the size of w. For any I ⊆ J define wI := (wi )i∈I , an
element of {0, 1}I . For any v ∈ {0, 1}I , we say v is a fragment of w (and write
v ⊑ w) if wI = v, that is, if vi = wi for all i ∈ I. We say w itself is a forbidden
fragment for X if, for all x ∈ X , we have xJ ≠ w. Finally, w is a critical fragment
if it is a minimal forbidden fragment —that is, w is forbidden, and there exists no
proper subfragment v < w such that v is forbidden.4
Example 2.4. (a) Let X and K be as in Example 2.1(a). Then the only
two forbidden fragments are (1, 1, 1) and (0, 0, 0) (corresponding to the intransitive
binary relations a ≻ b ≻ c ≻ a and a ≺ b ≺ c ≺ a). These fragments are both
critical.
(b) Let X and K be as in Example 2.1(b). Then a fragment is forbidden if
it either has a “1” in more than two coordinates or a “0” in more than three
coordinates. The critical fragments are those which either have a “1” in exactly
three coordinates or a “0” in exactly four coordinates —that is, all permutations
of the fragments (1, 1, 1, ∗, ∗) and (0, 0, 0, 0, ∗) (where “∗” denotes an unspecified
entry). ♦
Let W (X ) be the set of critical fragments for X , and let κ(X ) := max{|w|;
w ∈ W (X )}. We say that X is a median space if κ(X ) = 2. This means that all
logical interrelations are confined to simple implications: for some j, k ∈ K and all
x ∈ X , xj = 0 implies that xk = 0, or xj = 0 implies that xk = 1.
4 Critical fragments have been introduced into social choice theory in [NP07, NP10] under
the name ‘critical families’. They are called ‘minimal infeasible partial evaluations (MIPEs)’ in
[DH10] .
HOW INDETERMINATE IS SEQUENTIAL MAJORITY VOTING? 59
Example 2.5. (a) Let X and K be as in Example 2.1(c). Then the critical
fragments are (0, 1, ∗, ∗), (0, ∗, 1, ∗), (0, ∗, ∗, 1), (∗, 0, 1, ∗), (∗, 0, ∗, 1), and (∗, ∗, 0, 1),
because each of these corresponds to the (contradictory) assertion that x is both
greater and less than some value. (For example, (0, ∗, 1, ∗) asserts that x < 1 but
x ≥ 3, which is impossible.) All these critical fragments have size 2, so this space
is a median space.
(b) Let X and K be as in Example 2.1(d). Then (1, 1, 1, 0) is a critical fragment
of size 4. Thus, this space is not a median space. Likewise, the spaces in Examples
2.1(a,b), are not median spaces, since they each contain critical fragments of size
3, as explained in Example 2.4. ♦
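The critical fragments in Examples 2.4 and 2.5 can be recovered by brute force for small spaces. The following sketch is only an illustration: fragments are represented as dictionaries mapping a coordinate index (starting at 0) to a truth value, and the names `is_forbidden`, `critical_fragments` and `kappa` are hypothetical.

```python
from itertools import combinations, product

def is_forbidden(w, X):
    """A fragment w = {coordinate: value} is forbidden if no view in X extends it."""
    return not any(all(x[k] == v for k, v in w.items()) for x in X)

def critical_fragments(X, K):
    """All minimal forbidden fragments of X inside {0,1}^K (exhaustive search)."""
    crit = []
    for size in range(1, K + 1):
        for J in combinations(range(K), size):
            for vals in product((0, 1), repeat=size):
                w = dict(zip(J, vals))
                if not is_forbidden(w, X):
                    continue
                # Any extension of a forbidden fragment is forbidden, so it is
                # enough to test the subfragments that drop a single coordinate.
                if all(not is_forbidden({k: v for k, v in w.items() if k != j}, X)
                       for j in J):
                    crit.append(w)
    return crit

def kappa(X, K):
    """Size of the largest critical fragment (0 if there is none)."""
    return max((len(w) for w in critical_fragments(X, K)), default=0)

X_pref = {(1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 0, 1), (0, 1, 0), (1, 0, 0)}
X_com = {x for x in product((0, 1), repeat=5) if sum(x) == 2}
X_line = {tuple(int(v >= m) for m in range(1, 5)) for v in range(5)}
X_truth = {(p, q, r, p & q & r) for p, q, r in product((0, 1), repeat=3)}
```

A space is a median space exactly when `kappa` returns 2; of the four spaces above, only `X_line` passes this test.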
5 For the beautiful theory of median spaces from the perspective of combinatorial geometry,
top cycle as the outcome of non-strategic voting in simple binary tree agendas [Mil77].
such that no other ordering agrees with the μ-majority on a larger set of pairwise
comparisons.
For any x ∈ {0, 1}K , let $\prec^x$ be the (possibly intransitive) binary relation on N
defined by x. Moreover, for any $\mu \in \Delta^*(X_N^{pr})$, let $\prec^\mu$ be the binary relation defined
by Maj(μ), the so-called majority tournament. An element $c \in X_N^{pr}$ is called a
directed Hamiltonian chain of Maj(μ) if all nearest-neighbour orderings in $\prec^c$ agree
with the orderings specified by $\prec^\mu$. In other words, if we represent $\prec^c$ as a linear
directed graph C and represent $\prec^\mu$ as a complete directed graph D in the obvious
way, then C is a (directed) subgraph of D.
Let $\preceq^x_*$ be the transitive closure of $\prec^x$, augmented by all pairs (n, n) for n ∈ N ;
then $\preceq^x_*$ is a weak order (i.e. it is complete, reflexive and transitive). Let $\approx^x_*$ be the
symmetric part of $\preceq^x_*$. Then $\approx^x_*$ is an equivalence relation, and the $\approx^x_*$-equivalence
classes of N are linearly ordered by the asymmetric part $\prec^x_*$ of $\preceq^x_*$. The maximal
$\approx^x_*$-equivalence class of N is called the top cycle of the tournament defined by
x. The next result completely characterizes the Condorcet set in the setting of
preference aggregation.
Proposition 2.6. Let $\mu \in \Delta(X_N^{pr})$.
(a) $\mathrm{Cond}(X_N^{pr}, \mu) = \{x \in X_N^{pr} \,;\, \prec^x \text{ is a Hamiltonian chain in } \prec^\mu\}$.
(b) For all n, m ∈ N , we have $n \prec^\mu_* m$ if and only if $n \prec^x m$ for all
$x \in \mathrm{Cond}(X_N^{pr}, \mu)$.
8 See [Lis04] and [DL07] for earlier investigations of path-dependence in majoritarian judg-
ment aggregation.
[Figure 1: the majority tournament on the alternatives a, b, c, d.]
As a consequence, the top cycle of the profile μ consists of all (and only) those
alternatives which are ranked first by some preference order in $\mathrm{Cond}(X_N^{pr}, \mu)$.
For example, consider the 4-permutahedron with alternatives a, b, c, d. Suppose
that one third of the population endorses each of the preference orderings a ≻ b ≻
c ≻ d, b ≻ c ≻ d ≻ a and c ≻ d ≻ a ≻ b. For the corresponding majority
tournament we have $c \succ^\mu a$, $d \succ^\mu a$, $a \succ^\mu b$, $b \succ^\mu c$, $b \succ^\mu d$, and $c \succ^\mu d$ (see Figure 1).
By Proposition 2.6(a), the Condorcet set consists of the following five orderings:
a ≻ b ≻ c ≻ d, b ≻ c ≻ d ≻ a, c ≻ d ≻ a ≻ b, d ≻ a ≻ b ≻ c, c ≻ a ≻ b ≻ d. (In this
case, the top cycle is {a, b, c, d}.)
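The five orderings above can be checked mechanically. The sketch below, with hypothetical names `orders`, `majority_beats`, `beats`, `chains` and `top`, builds the majority tournament from the three voter groups and lists its directed Hamiltonian chains, which by Proposition 2.6(a) form the Condorcet set.

```python
from itertools import permutations

# Each string lists one group's ranking from best to worst; the groups are equal in size.
orders = ["abcd", "bcda", "cdab"]

def majority_beats(p, q):
    """True if a strict majority of the groups ranks p above q."""
    return 2 * sum(o.index(p) < o.index(q) for o in orders) > len(orders)

# The majority tournament as a set of directed edges (winner, loser).
beats = {(p, q) for p, q in permutations("abcd", 2) if majority_beats(p, q)}

# Directed Hamiltonian chains: every adjacent pair must agree with the tournament.
chains = ["".join(c) for c in permutations("abcd")
          if all((c[i], c[i + 1]) in beats for i in range(3))]

# Alternatives ranked first by some chain; per the text these form the top cycle.
top = {c[0] for c in chains}
```

Since every alternative heads at least one chain, the top cycle is all of {a, b, c, d}, as stated.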
3. Global indeterminacy
Let X ⊂ {0, 1}K be an aggregation space. We say that a profile μ ∈ Δ∗ (X )
is globally indeterminate if, for any k ∈ K, there exist x, y ∈ Cond (X , μ) such
that xk ≠ yk . Thus, for every proposition in K, either truth value can arise from
sequential majority voting for a suitably chosen path. The space X is globally
indeterminate if Δ∗ (X ) contains some globally indeterminate profiles.
For a simple example, consider preference aggregation. Let N be a set of
social alternatives, and let $\mu \in \Delta^*(X_N^{pr})$ be a profile of preferences, such that every
element of N is in the top cycle (for example, the profile shown in Figure 1). Then
Proposition 2.6(b) implies that μ is globally indeterminate: for any a, b ∈ N , there
exists some preference order in $\mathrm{Cond}(X_N^{pr}, \mu)$ which prefers a to b, and there exists
another preference order in $\mathrm{Cond}(X_N^{pr}, \mu)$ which prefers b to a.
Let $W_3(X)$ be the set of critical fragments for X of size 3 or more. Let x ∈
{0, 1}K ; we say that x is critical for X if there exists a collection $\{w_1, \ldots, w_N\} \subseteq W_3(X)$
such that $w_n \sqsubseteq x$ for all n ∈ [1...N ] and $K = \bigcup_{n=1}^{N} \mathrm{supp}(w_n)$; in other
words, x can be completely “covered” by critical fragments of size 3 or more. Let
Crit(X ) := {x ∈ {0, 1}K ; x is critical for X }. Theorem 2.2 of [NPP14] is the
following simple combinatorial characterization of global indeterminacy: a profile
is globally indeterminate if and only if the majority view at this profile is critical
for X .
Theorem 3.1. Let X ⊆ {0, 1}K . For any μ ∈ Δ∗ (X ),
(a) μ is globally indeterminate ⇐⇒ Maj(μ) ∈ Crit(X ) .
(b) Let Maj(X ) := {Maj(μ); μ ∈ Δ∗ (X )}. Then
X is globally indeterminate ⇐⇒ Maj(X ) ∩ Crit(X ) ≠ ∅ .
Example 3.2. Let X be the “committee” space from Example 2.1(b). The
critical fragments for this space are identified in Example 2.4(b). From these, it
is easy to see that Crit(X ) = {(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)}. Let μ ∈ Δ∗ (X ) be the
profile where each of the ten views in X is supported by one tenth of the voters.
Then Maj(μ) = (0, 0, 0, 0, 0). Thus, μ is globally indeterminate.
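Example 3.2 is small enough to verify by exhaustive search. The sketch below is an illustration under the definitions given so far (critical fragments, Crit(X), and the Condorcet set as in Lemma 2.3(b)); all function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

K = 5
X = [tuple(int(k in pair) for k in range(K)) for pair in combinations(range(K), 2)]

def forbidden(w):
    return not any(all(x[k] == v for k, v in w.items()) for x in X)

def critical_fragments():
    out = []
    for size in range(1, K + 1):
        for J in combinations(range(K), size):
            for vals in product((0, 1), repeat=size):
                w = dict(zip(J, vals))
                if forbidden(w) and all(
                        not forbidden({k: v for k, v in w.items() if k != j})
                        for j in J):
                    out.append(w)
    return out

W3 = [w for w in critical_fragments() if len(w) >= 3]

def is_critical_view(x):
    """x is critical if the fragments in W3 that agree with x cover every coordinate."""
    covered = set()
    for w in W3:
        if all(x[k] == v for k, v in w.items()):
            covered |= set(w)
    return covered == set(range(K))

crit = {x for x in product((0, 1), repeat=K) if is_critical_view(x)}

# Uniform profile: each of the ten views carries weight 1/10.
support = [sum(Fraction(1, len(X)) for x in X if x[k] == 1) for k in range(K)]
maj = tuple(int(s > Fraction(1, 2)) for s in support)

def between(y, x, z):
    return all(yk == xk for yk, xk, zk in zip(y, x, z) if xk == zk)

# Condorcet set via Lemma 2.3(b); this branch applies because maj is not in X.
cond = {x for x in X if not any(y not in (x, maj) and between(y, x, maj) for y in X)}
globally_indeterminate = all(len({x[k] for x in cond}) == 2 for k in range(K))
```

Both sides of Theorem 3.1(a) come out true for this profile: the majority view lies in `crit`, and every coordinate takes both values somewhere in `cond`.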
3.1. Global indeterminacy due to symmetry. Global indeterminacy can
arise from the symmetries of the aggregation space. Let γ : K−→K be a permuta-
tion. Define the bijection γ∗ : {0, 1}K −→{0, 1}K by
(3.1) γ∗ (x)k := xγ(k) , for all x ∈ X and k ∈ K.
Let X ⊆ {0, 1}K be an aggregation space. We say that γ is a symmetry of X if
γ∗ [X ] = X . For example, every permutation of K is a symmetry of the committee
space from Example 2.1(b). For another example: let $X_N^{pr}$ be as in Section 2, let τ :
N −→N be any permutation of N , and define γ : K−→K by γ(n, m) = (τ (n), τ (m))
for all n, m ∈ N ; then γ is a symmetry of $X_N^{pr}$ (and every symmetry of $X_N^{pr}$ arises
in this fashion).
Let ΓX be the group of all symmetries of X . For any k ∈ K, the ΓX -orbit of k
is the set {γ(k); γ ∈ ΓX }. The set K is a disjoint union of ΓX orbits —call them
K1 , K2 , . . . , KN . (If X is the committee space from Example 2.1(b), or $X = X_N^{pr}$,
then all of K is a single ΓX -orbit.) Let Kn := |Kn | for all n ∈ [1 . . . N ].
An element z ∈ {0, 1}K is ΓX -fixed if γ(z) = z for all γ ∈ ΓX . This is the case
if and only if z is constant on each of K1 , K2 , ..., KN . (Thus, there are exactly $2^N$
such ΓX -fixed points in {0, 1}K .) Theorem 3.1 has the following consequence:
Proposition 3.3. Let X ⊆ {0, 1}K . Suppose there exists a ΓX -fixed point
z ∈ {0, 1}K with the following properties:
(a) For all n ∈ [1 . . . N ], the fragment zKn is forbidden to X .
(b) There exists some x ∈ X such that #{k ∈ Kn ; xk = zk } > |Kn |/2 for
all n ∈ [1 . . . N ].
Then X is globally indeterminate.
For any x ∈ {0, 1}K , let ‖x‖ := #{k ∈ K; xk = 1}. If all of K is a single
ΓX -orbit (i.e. ΓX acts transitively on K), then the only ΓX -fixed points in {0, 1}K
are 0 := (0, 0, . . . , 0) and 1 := (1, 1, . . . , 1). Thus, the conditions of Proposition 3.3
reduce to:
(i) Either 0 ∉ X and there is some x ∈ X with ‖x‖ < |K|/2;
(ii) or 1 ∉ X and there is some x ∈ X with ‖x‖ > |K|/2.
Example 3.4. (Committee Selection) Let 0 ≤ I ≤ J ≤ K, and define
$X^{com}_{I,J;K} := \{x \in \{0,1\}^K \,;\, I \le \|x\| \le J\}$. Interpretationally, K is a set of K ‘candidates’, and
$X^{com}_{I,J;K}$ is the set of all ‘committees’ comprised of at least I and at most J of these
candidates. For instance, the space in Example 2.1(b) was $X^{com}_{2,2;5}$.
If $X = X^{com}_{I,J;K}$, then it is easy to see that ΓX contains all permutations of K, so
ΓX acts transitively on K. Thus, if either (i) 0 < I < K/2 or (ii) K/2 < J < K, then
Proposition 3.3 implies that $X^{com}_{I,J;K}$ is globally indeterminate.9 On the other hand,
if I = 0 and J = 1, or if I = K − 1 and J = K, then $X^{com}_{I,J;K}$ is majority determinate.
In between these extremes, if either I = 0 and 2 ≤ J < K/2, or K/2 < I ≤ K − 2
9 This fact was established in [NPP14] through a different argument.
but 1 ∉ Maj($X^{com}_{0,2;5}$).) This case belongs to an intermediate category, which we
might call partially determinate. ♦
Example 3.8. (a) Let $C^{line}_K$ be the linear convexity from Example 3.6(a).
Clearly, {k} ∈ $C^{line}_K$ for all k ∈ K (because {k} is an interval). Thus, the convexity
space $X^{line}_K$ is globally indeterminate.
(b) Suppose T is a taxonomic hierarchy on K, as in Example 3.6(b). If every
singleton set is a taxon, then the corresponding convexity space XT is globally
indeterminate. ♦
Many other McGarvey spaces are also globally indeterminate, such as the
permutahedron $X_N^{pr}$ defined at the end of Section 2. (See [NPP14] for more examples.)
This suggests a strong connection between the McGarvey property and global in-
determinacy. Indeed, we do not know of a single, ‘naturally occurring’ aggregation
space that is McGarvey but not globally indeterminate. One might thus conjecture
that any non-degenerate McGarvey space X is globally indeterminate. But the
next example falsifies this conjecture.
Example 3.9. Let X be the subset of {0, 1}5 defined by the two critical frag-
ments w1 = (1, 1, 0, 0, ∗) and w2 = (∗, 0, 0, 1, 1).14 By construction, we have
Crit(X ) = ∅ (since w1 and w2 disagree in the second and fourth coordinate).
Moreover, X has 28 elements which is more than 3/4 of 32 = |{0, 1}5 |; hence X
is McGarvey, by Proposition 2.4(a) of [NP11]. But by Corollary 3.5, X is not
globally indeterminate. Note also that X contains $1_k$ for all k; this shows that the
hypothesis of convexity (i.e. closedness under intersections) cannot be dropped in
Proposition 3.7 above.
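The counting claims in Example 3.9 are easy to confirm by enumeration. The sketch below is illustrative: coordinates are indexed from 0, the names are hypothetical, and the views $1_k$ are assumed to be the indicator vectors of the singletons {k} (their definition in §3.2 is not reproduced here). The code also relies on the example's statement that w1 and w2 are the only critical fragments of X.

```python
from itertools import product

K = 5
w1 = {0: 1, 1: 1, 2: 0, 3: 0}      # (1, 1, 0, 0, *)
w2 = {1: 0, 2: 0, 3: 1, 4: 1}      # (*, 0, 0, 1, 1)

def contains(x, w):
    """True if the fragment w agrees with the view x on all of supp(w)."""
    return all(x[k] == v for k, v in w.items())

# X consists of all views that avoid both forbidden fragments.
X = [x for x in product((0, 1), repeat=K)
     if not contains(x, w1) and not contains(x, w2)]

def is_critical_view(x):
    """x is critical if the critical fragments it contains cover all coordinates."""
    covered = set()
    for w in (w1, w2):
        if contains(x, w):
            covered |= set(w)
    return covered == set(range(K))

crit = [x for x in product((0, 1), repeat=K) if is_critical_view(x)]

# Assumed reading of 1_k: a single 1 in coordinate k, zeros elsewhere.
unit_views = [tuple(int(j == k) for j in range(K)) for k in range(K)]
```

No view can contain both fragments, since they disagree in two coordinates, and neither fragment alone covers all five coordinates; hence `crit` is empty.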
3.3. Susceptibility for indeterminacy. Global indeterminacy concerns the
existence of a profile such that, in each issue, both answers are possible via a
suitably chosen decision path. One may doubt the relevance of this concept and
the corresponding analysis, since existence results might only describe ‘worst cases’
that are very special and unlikely to happen. But the earlier example of global
indeterminacy in preference aggregation already suggests that global indeterminacy
is far from being special and unlikely, in the sense that it is obtained for a large set
of profiles. We will therefore now investigate how ‘likely’ such profiles are.
We will take the complexity of a profile as a proxy for its unlikelihood. We
will assess this complexity in terms of a very simple but instructive measure: the
number of voters needed to construct the profile. Formally, for any N ∈ ℕ, let
\[ \Delta^*_N(X) := \{\mu \in \Delta^*(X) \,;\, \forall x \in X, \ \mu(x) = n/N \text{ for some } n \in [0 \ldots N]\}. \]
From a social choice perspective, $\Delta^*_N(X)$ is the set of profiles which can be generated
by a population of exactly N voters. From a geometric perspective, $\Delta^*_N(X)$
can be visualized as a discrete ‘mesh’ of density 1/N embedded in the set Δ(X ). Let
$\Delta^*_{ind}(X) := \{\mu \in \Delta^*(X) \,;\, \mu \text{ is globally indeterminate}\}$. We define the susceptibility
for indeterminacy of X to be
\[ \eta(X) := \min \{N \in \mathbb{N} \,;\, \Delta^*_N(X) \cap \Delta^*_{ind}(X) \neq \emptyset\} \]
(with η(X ) = ∞ if X is not globally indeterminate). From a social choice per-
spective, η(X ) is the minimum number of voters needed to construct a globally
indeterminate profile. From a geometric perspective, η(X ) places an upper bound
on the “thickness” of Δ∗ind (X ): if η(X ) > N , then Δ∗ind (X ) cannot contain a sphere
14 Here, as usual, the “∗” indicates an unspecified coordinate; in other words, supp (w1 ) =
{1, 2, 3, 4}, and supp (w2 ) = {2, 3, 4, 5}.
of radius greater than N , where := 1 − K 1
. Thus, η(X ) measures the suscepti-
bility of X to global indeterminacy: if η(X ) is small, then X is very susceptible.
Evidently, for any X ⊆ {0, 1}K , we have η(X ) ≥ 3. The next four results show
that several common aggregation spaces are very susceptible to global indetermi-
nacy, since they exhibit the minimal value η(X ) = 3.
Proposition 3.10. If X has a critical fragment of size K, then η(X ) = 3.
For example, if X is the truth function aggregation space from Example 2.1(d),
then η(X ) = 3, because of the critical fragment of size 4 shown in Example 2.5(b).
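For very small spaces, η(X) can be found by exhaustive search over N-voter profiles. The following sketch is a brute-force illustration only (it does not scale), with hypothetical names; it skips profiles with a tie in some coordinate, since these lie outside Δ∗(X), and computes the Condorcet set as in Lemma 2.3.

```python
from itertools import combinations_with_replacement, product

def between(y, x, z):
    return all(yk == xk for yk, xk, zk in zip(y, x, z) if xk == zk)

def condorcet_set(X, maj):
    if maj in X:
        return {maj}
    return {x for x in X
            if not any(y not in (x, maj) and between(y, x, maj) for y in X)}

def susceptibility(X, max_voters=6):
    """Smallest N with a globally indeterminate N-voter profile (None if none found)."""
    K = len(next(iter(X)))
    for N in range(1, max_voters + 1):
        for ballots in combinations_with_replacement(sorted(X), N):
            votes = [sum(b[k] for b in ballots) for k in range(K)]
            if any(2 * v == N for v in votes):
                continue                    # tie in some coordinate
            maj = tuple(int(2 * v > N) for v in votes)
            cond = condorcet_set(X, maj)
            if all(len({x[k] for x in cond}) == 2 for k in range(K)):
                return N
    return None

X_truth = {(p, q, r, p & q & r) for p, q, r in product((0, 1), repeat=3)}
X_pref = {(1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 0, 1), (0, 1, 0), (1, 0, 0)}
```

For the truth-function space, one witness is the three-voter profile (1, 1, 0, 0), (1, 0, 1, 0), (0, 1, 1, 0), whose majority view is the critical fragment (1, 1, 1, 0).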
Now let us consider the classical problem of Arrovian preference aggregation,
as represented by judgement aggregation on the permutahedron $X_N^{pr}$, which was
defined at the end of Section 2.
Proposition 3.11. If |N | ≥ 3, then $\eta(X_N^{pr}) = 3$.
For a third example, consider the problem of aggregating equivalence relations,
as formalized by [FR86]. Let N be a finite set, and let K be a subset of N × N
containing exactly one of the pairs (n, m) or (m, n) for each n ≠ m ∈ N . Thus, an
element of {0, 1}K represents a symmetric, reflexive binary relation (i.e. undirected
graph) on N . Let $X_N^{eq} \subset \{0,1\}^K$ be the set of all equivalence relations on N . Thus,
judgement aggregation on $X_N^{eq}$ represents the problem of constructing a collective
classification of the elements of N . Each voter has in mind her own classification
(represented by some equivalence relation on N ); our problem is to aggregate these
into some collective classification.
Proposition 3.12. If N ≥ 3, then $\eta(X_N^{eq}) = 3$.
We now turn to the problem of resource allocation. Fix M, D ∈ N, and consider
the D-dimensional ‘discrete cube’ $[0...M]^D$. Each element $x \in [0...M]^D$ can be
represented by a point $\Phi(x) := \tilde{x} \in \{0,1\}^{D \times M}$ defined as follows:
(3.2) for all (d, m) ∈ [1...D] × [1...M ], $\tilde{x}_{(d,m)} := \begin{cases} 1 & \text{if } x_d \ge m; \\ 0 & \text{if } x_d < m. \end{cases}$
This defines an injection $\Phi : [0...M]^D \to \{0,1\}^{D \times M}$. Any subset $P \subset [0...M]^D$
can thereby be represented as a subset $X := \Phi(P) \subset \{0,1\}^{D \times M}$. Judgement
aggregation over X thus represents social choice over a D-dimensional ‘policy space’,
where each voter’s position represents her ideal point in P, the set of feasible
policies. This framework is especially useful for resource allocation problems, as we
now illustrate. Let
(3.3) $\triangle^D_M := \{x \in [0...M]^D \,;\, \sum_{d=1}^{D} x_d = M\}$, and $X^{\Delta}_{M,D} := \Phi[\triangle^D_M] \subset \{0,1\}^{D \times M}$.
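The encoding (3.2) and the discrete simplex (3.3) translate directly into code. The sketch below is illustrative; the names `phi` and `discrete_simplex` are hypothetical, and the 0/1 coordinates are listed in the order (1,1), ..., (1,M), (2,1), ..., (D,M).

```python
from itertools import product

def phi(x, M):
    """Encode x in [0..M]^D as a 0/1 tuple: coordinate (d, m) is 1 iff x_d >= m."""
    return tuple(int(xd >= m) for xd in x for m in range(1, M + 1))

def discrete_simplex(M, D):
    """All x in [0..M]^D whose coordinates sum to M."""
    return [x for x in product(range(M + 1), repeat=D) if sum(x) == M]

M, D = 3, 3
X_simplex = {phi(x, M) for x in discrete_simplex(M, D)}
```

For instance, with M = 3 the allocation (2, 0, 1) is encoded as (1, 1, 0, 0, 0, 0, 1, 0, 0). Because the encoding is injective, `X_simplex` has as many elements as the simplex itself.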
Geometrically, $\triangle^D_M$ is a ‘discrete simplex’; points in $\triangle^D_M$ represent all ways of
We have seen that globally indeterminate profiles are ‘easy’ to construct for
$X_N^{pr}$, $X_N^{eq}$, and $X^{\Delta}_{M,D}$. But the committee space $X^{com}_{I,J;K}$ from Example 3.4 exhibits
a more complex pattern. Consider, for instance, the space $X^{com}_{4,6;10}$, i.e. the space
of all committees that contain at least 4 and at most 6 members of a set of 10
candidates. Define 0 := (0, 0, . . . , 0) and 1 := (1, 1, . . . , 1). Theorem 3.1(a) says
that a profile $\mu \in \Delta^*(X^{com}_{4,6;10})$ is globally indeterminate if and only if either Maj(μ) =
0 or Maj(μ) = 1 (see Lemma A.6 in the Appendix). Without loss of generality,
by symmetry, suppose the former, i.e. $\mu_k < 1/2$ for all k ∈ [1 . . . K]. Since each
feasible view endorses at least 4 candidates, we have $\sum_k \mu_k \ge 4$. Denoting by
k∗ the candidate with maximal popular support, we thus obtain $4/10 \le \mu_{k^*} < 1/2$.
Satisfaction of this inequality requires at least five agents, since N = 5 is the smallest
population size for which some fraction n/N lies in [4/10, 1/2) (namely 2/5); together
with Proposition 3.14(b) below we thus obtain $\eta(X^{com}_{4,6;10}) = 5$.
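The arithmetic behind the lower bound can be checked in a few lines. The sketch below searches for the smallest population size admitting a support level in [4/10, 1/2), and then builds one five-voter profile with majority view 0. That profile is a hypothetical construction added for illustration (candidates labelled by the edges of the complete graph on five vertices); it is not taken from the text.

```python
from fractions import Fraction
from itertools import combinations

def min_voters(lo, hi, limit=50):
    """Smallest N such that some n/N lies in the half-open interval [lo, hi)."""
    for N in range(1, limit + 1):
        if any(lo <= Fraction(n, N) < hi for n in range(N + 1)):
            return N
    return None

N_min = min_voters(Fraction(4, 10), Fraction(1, 2))

# Illustrative profile: the 10 candidates are the edges of the complete graph on
# 5 vertices, and voter v endorses the 4 edges that touch vertex v.
candidates = list(combinations(range(5), 2))
ballots = [tuple(int(v in edge) for edge in candidates) for v in range(5)]
support = [Fraction(sum(b[k] for b in ballots), 5) for k in range(10)]
maj = tuple(int(s > Fraction(1, 2)) for s in support)
```

Each ballot is a committee of size 4, so it is feasible, and each candidate is endorsed by exactly two of the five voters, which puts every coordinate below one half.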
The argument just given can be generalized to give the lower bound on
$\eta(X^{com}_{I,J;K})$ in part (a) of the following result.
4. Full indeterminacy
Global indeterminacy means that, for some profile, any answer can be obtained
on each issue by choosing a suitable decision path. An even stronger form of
indeterminacy, which we shall henceforth refer to as full indeterminacy, occurs if,
for some profile, any logically possible combination of answers across issues can be
obtained via an appropriate decision path, i.e. if the corresponding Condorcet set
contains all possible views.
Formally, a profile μ ∈ Δ∗ (X ) is fully indeterminate if Cond (X , μ) = X . We
say that X is fully indeterminate if there exists some μ ∈ Δ∗ (X ) which is fully
indeterminate.15
15 For all k ∈ K, suppose that there is some x ∈ X with xk = 1, and some x ∈ X with
xk = 0. Then full indeterminacy of X implies global indeterminacy. Thus, under a very mild and
natural hypothesis (which is almost always satisfied in practice), full indeterminacy is a logically
stronger property than global indeterminacy. However, we do not actually need this hypothesis
for any of the results in this paper, so we will not assume it in what follows.
Example 4.1. Fix J ∈ (K/2 ... K], and let $X^{com}_{J,J;K} := \{x \in \{0,1\}^K \,;\, \|x\| = J\}$.
(Thus, $X^{com}_{J,J;K}$ is the set of all ‘committees’ comprised of exactly J out of K
candidates.) Let μ be the uniform distribution on $X^{com}_{J,J;K}$. Then
$\mathrm{Cond}(X^{com}_{J,J;K}, \mu) = X^{com}_{J,J;K}$, hence μ is fully indeterminate. (The proof is straightforward: if there are
exactly J open slots and K viable candidates, and the slots are allocated on a
‘first come, first served’ basis, then the slots will simply be allocated to the first J
candidates.) ♦
Now let D ≥ 3, let M ∈ ℕ, and let $X^{\Delta}_{M,D}$ be as in eqn.(3.3) of Section 3. We
define 0 := (0, 0, . . . , 0) ∈ $\{0,1\}^K$.
Proposition 4.2. Let $\mu \in \Delta^*(X^{\Delta}_{M,D})$. Then μ is fully indeterminate if and
only if Maj(μ) = 0.
For example, for all d ∈ [1...D], let $x^d$ be the element of $X^{\Delta}_{M,D}$ which allocates
all M dollars towards claimant d. (Thus, for all m ∈ [1...M ], we have $x^d_{d,m} = 1$,
while $x^d_{c,m} = 0$ for all c ∈ [1...D] \ {d}.) Let $\mu \in \Delta^*(X^{\Delta}_{M,D})$ be the profile which
allocates weight 1/D to each of $x^1, \ldots, x^D$. Then Maj(μ) = 0, so Proposition 4.2
implies that μ is fully indeterminate.16
For any x ∈ X and z ∈ {0, 1}K , recall that x z if there is no y ∈ X \ {x}
which is between x and z (i.e. such that, for all k ∈ K and c ∈ {0, 1}, (xk = c =
zk ) =⇒ (yk = c)). We say that z ∈ {0, 1}K is a panopticon for X if x z for all
x ∈ X (this implies that z ∉ X , because any element of X is between itself and
every other element of X ). Heuristically, from a panopticon, one can ‘see’ each
element of X without the view being blocked by any other elements. For example,
1 is a panopticon for $X^{com}_{J,J;K}$. Let Pan(X ) be the set of all panoptica for X . The
next result is this section’s key observation:
Proposition 4.3. Let X ⊆ {0, 1}K .
(a) For any μ ∈ Δ∗ (X ), μ is fully indeterminate ⇔ Maj(μ) ∈ Pan(X ) .
(b) X is fully indeterminate ⇐⇒ Maj(X ) ∩ Pan(X ) ≠ ∅ .
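Proposition 4.3(a) can be illustrated on the committee space of Example 4.1 with K = 5 and J = 3. The sketch below uses hypothetical names and tests the panopticon condition exactly as stated above (no y in X other than x lies between x and z); the Condorcet set is computed as in Lemma 2.3(b).

```python
from itertools import combinations, product

def between(y, x, z):
    return all(yk == xk for yk, xk, zk in zip(y, x, z) if xk == zk)

def is_panopticon(z, X):
    """z is a panopticon if, for every x in X, no other y in X is between x and z."""
    return all(not any(y != x and between(y, x, z) for y in X) for x in X)

def panoptica(X, K):
    return {z for z in product((0, 1), repeat=K) if is_panopticon(z, X)}

K, J = 5, 3
X = {tuple(int(k in c) for k in range(K)) for c in combinations(range(K), J)}

# Under the uniform profile each candidate sits on a share J/K of the committees,
# so the majority view is all ones when J > K/2.
maj = tuple(int(2 * J > K) for _ in range(K))
pan = panoptica(X, K)

# Condorcet set via Lemma 2.3(b); maj is not in X here.
cond = {x for x in X if not any(y not in (x, maj) and between(y, x, maj) for y in X)}
```

Here the majority view 1 is a panopticon and the Condorcet set is all of X, in agreement with Example 4.1.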
redistribution, the elements of [1...G] represent the potential recipients of government largesse
(e.g. state governments seeking federal assistance; economic sectors seeking subsidies, etc.). If the
redistribution is decided by a committee (e.g. the Senate), and each potential recipient controls
roughly the same number of committee members (e.g. each state has two senators), then the
resulting profile closely resembles this fully indeterminate profile.
so $\mathrm{Maj}(X^{\Delta}_{M,2}) = X^{\Delta}_{M,2}$. (The argument is very similar to Example 2.5(a).) Thus,
$X^{\Delta}_{M,2}$ is not fully indeterminate. ♦
Recall from §3.2 that a space X is called McGarvey if Maj(X ) = {0, 1}K . If X is
McGarvey, then X is fully indeterminate if and only if Pan(X ) ≠ ∅. However, it is
not clear that the McGarvey property is compatible with Pan(X ) ≠ ∅. Heuristically,
the problem is that, to have Pan(X ) ≠ ∅, the space X must be a relatively ‘small’
subset of {0, 1}K , whereas to be McGarvey, X must be relatively ‘large’. The next
proposition illustrates this conflict. Define 1 := (1, 1, . . . , 1) ∈ {0, 1}K . For any
k ∈ K, recall the definition of the view 1k ∈ {0, 1}K from §3.2.
Proposition 4.5. Suppose X contains 1 and 1k , for all k ∈ K. Then X is
McGarvey, but Pan(X ) = ∅ (so X is not fully indeterminate).
Example 4.6. Recall the space $X_N^{eq}$ from §3.3. Note that $1 \in X_N^{eq}$ (representing
the relation where all elements of N are equivalent). Also, for all n, m ∈ N , we
have $1_{(n,m)} \in X_N^{eq}$ (representing the equivalence relation where n ∼ m and all other
elements are non-equivalent). Thus, Proposition 4.5 implies that $X_N^{eq}$ is McGarvey,
but not fully indeterminate. ♦
Propositions 4.3(b) and 4.5 together suggest that many naturally occurring
aggregation spaces will not be fully indeterminate (although they may still be glob-
ally indeterminate via Theorem 3.1). On the other hand, we will see in §6 that
aggregation spaces will frequently be ‘almost’ fully indeterminate.
Full indeterminacy also arises as a consequence of symmetry in the judgement
aggregation problem. Let X ⊂ {0, 1}K and let ΓX be the group of all symmetries
of X , as defined in §3.1. For any x ∈ {0, 1}K , the ΓX -orbit of x is the set Γ(x) :=
{γ∗ (x); γ ∈ ΓX }. Distinct ΓX -orbits are disjoint. Thus, X is a disjoint union of
ΓX -orbits. We say X is homogeneous if all of X is contained in one ΓX -orbit. (For
example, $X^{com}_{J,J;K}$ is homogeneous; however, $X_N^{pr}$ is not.)
For any k ∈ K, recall that the ΓX -orbit of k is the set {γ(k); γ ∈ ΓX }, and
K is a disjoint union of ΓX orbits —call them K1 , K2 , . . . , KN . For any x ∈ X and
any n ∈ [1 . . . N ], let $\|x\|_{K_n} := \#\{k \in K_n \,;\, x_k = 1\}$. It is easy to see that
$\|x\|_{K_n} = \|\gamma(x)\|_{K_n}$ for any γ ∈ ΓX . Thus, if X is homogeneous, then $\|x\|_{K_n} = \|y\|_{K_n}$ for all
x, y ∈ X .
Proposition 4.7. Suppose X is homogeneous, and for all n ∈ [1 . . . N ], we have
$\|x\|_{K_n} \neq |K_n|/2$ for some (and hence, all) x ∈ X . Then X is fully indeterminate.
Example 4.8. Let L and R be two finite sets, and let K := L × R. Thus,
an element of {0, 1}K can be interpreted as a bipartite graph, where the vertices
are partitioned into the sets L and R. Suppose R := |R| is odd. For all ℓ ∈ L,
let $R_\ell \in [0 \ldots R]$, and let $K_\ell := \{\ell\} \times R \subset K$. Finally, for any x ∈ X , define
$\|x\|_{K_\ell} := \#\{k \in K_\ell \,;\, x_k = 1\}$ (i.e. the number of edges from the vertex ℓ into the
set R in the graph represented by x). Define
$X := \{x \in \{0,1\}^K \,;\, \|x\|_{K_\ell} = R_\ell \text{ for all } \ell \in L\}$.
Any permutation of R induces a permutation of K in the obvious way, and each
of these is a symmetry of X . If the elements $\{R_\ell\}_{\ell \in L}$ are all distinct, then these
permutations are the only symmetries of X . For all ℓ ∈ L, we have $|K_\ell| = R$, while
$\|x\|_{K_\ell} \neq R/2$ for all x ∈ X (because R is odd). Thus, Proposition 4.7 says that X
is fully indeterminate. ♦
5. Generalized Antichains
By Proposition 4.3, a necessary condition for full indeterminacy is that the set
of panoptica is non-empty. It is therefore desirable to find structural conditions on
X which guarantee that Pan(X ) ≠ ∅. For any x, y ∈ {0, 1}K , define x ⊕ y := z ∈
{0, 1}K by zk := (xk + yk ) mod 2, for all k ∈ K. For any x ∈ {0, 1}K , define the
involution Ix : {0, 1}K −→{0, 1}K by
(5.1) Ix (y) := x ⊕ y, for all y ∈ {0, 1}K .
17 Note that such transitivity is neither necessary nor sufficient for X to be homogeneous. In
Example 4.8, X is homogeneous but ΓX does not act transitively on K. In Example 4.10, ΓX acts
transitively on K, but X is not homogeneous.
Thus, Ix simply acts on {0, 1}K by ‘inverting’ certain coordinates and leaving the
rest alone. For any x, y ∈ {0, 1}K , write x ≤ y if xk ≤ yk for all k ∈ K. A subset
X ⊆ {0, 1}K is an antichain if, for all distinct x, y ∈ X , we have x ≰ y. (For example,
$X^{com}_{J,J;K}$ is an antichain.) We say X is a generalized antichain if Iz [X ] is an antichain
for some z ∈ {0, 1}K .
Proposition 5.1. Pan(X ) ≠ ∅ if and only if X is a generalized antichain.
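Proposition 5.1 can be spot-checked on small spaces by testing both sides independently. The sketch below is illustrative, with hypothetical function names; comparability is tested between distinct elements only, and the involution $I_z$ is implemented as a coordinatewise XOR, following (5.1).

```python
from itertools import product

def is_antichain(X):
    """No two distinct elements of X are comparable coordinatewise."""
    return not any(x != y and all(a <= b for a, b in zip(x, y))
                   for x in X for y in X)

def flip(z, X):
    """Image of X under the involution I_z(y) = z XOR y."""
    return {tuple(a ^ b for a, b in zip(z, y)) for y in X}

def is_generalized_antichain(X, K):
    return any(is_antichain(flip(z, X)) for z in product((0, 1), repeat=K))

def has_panopticon(X, K):
    def between(y, x, z):
        return all(yk == xk for yk, xk, zk in zip(y, x, z) if xk == zk)
    return any(all(not any(y != x and between(y, x, z) for y in X) for x in X)
               for z in product((0, 1), repeat=K))

X_com = {x for x in product((0, 1), repeat=5) if sum(x) == 2}
X_line = {tuple(int(v >= m) for m in range(1, 5)) for v in range(5)}
```

The committee space passes both tests, while the line space fails both: it contains adjacent views, and these remain comparable after any coordinate flips.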
Proposition 5.2. (a) If X ⊆ {0, 1}K is fully indeterminate, then X
is a generalized antichain.
(b) Let K be odd, and let J := ⌈K/2⌉. Then the largest fully indeterminate
subsets of {0, 1}K have cardinality $\binom{K}{J}$.
Note that the converse to Proposition 5.2(a) is false: not every generalized antichain
is fully indeterminate, as Example 4.4 shows. Another counterexample involves
aggregation spaces representing taxonomic hierarchies, the convexity spaces
defined in Example 3.6(b). Given a taxonomic hierarchy T on the set K, we define
XT := {1T ; T ∈ T} ⊂ {0, 1}K . A taxon T ∈ T is minimal if T does not contain
any proper sub-taxa. Note that minimal taxa are not necessarily singletons.
Proposition 5.3. Let T be a taxonomic hierarchy on K.
(a) XT is a generalized antichain if and only if every non-minimal taxon
in T contains at least two non-singleton minimal taxa.
(b) However, XT is never fully indeterminate.
We conclude this section with some necessary and sufficient conditions for
X to be a generalized antichain. Let x, y ∈ {0, 1}K . Recall that dH (x, y) :=
#{k ∈ K ; xk ≠ yk } is the Hamming distance from x to y. We say that x and y
are adjacent if dH (x, y) = 1. The next result says that any sufficiently ‘dispersed’
subset of {0, 1}K will be a generalized antichain.
Proposition 5.4. Let X ⊆ {0, 1}K .
(a) Suppose X is a generalized antichain. If x ∈ X , and y ∈ {0, 1}K is
adjacent to x, then y ∉ X .
(b) If $\sum_{x \neq y \in X} 2^{-d_H(x,y)} < 1$, then X is a generalized antichain.
and perhaps additional constraints.) If X is “large enough” (e.g. satisfies the
hypotheses of Propositions 4.7 or 4.9), then Maj(X ) ∩ Pan(X ) ≠ ∅, so that X is fully
indeterminate.
Proposition 6.2. (a) The sequence $\{X^{line}_K\}_{K=1}^{\infty}$ is asymptotically fully
indeterminate.
(b) The sequence $\{X^{2}_D\}_{D=1}^{\infty}$ is asymptotically fully indeterminate.
Conclusion
The Condorcet set includes all minimally acceptable compromises between ma-
joritarianism and logical consistency. It also provides a compact description of the
possible outcomes of sequential majority rule: the way in which real social deci-
sions often emerge from an uncoordinated sequence of ad hoc judgements unfolding
over time. Unfortunately, the Condorcet set is quite large for almost any nontriv-
ial judgement aggregation problem. In many plausible scenarios, path-dependence
can make the truth value of every proposition susceptible to manipulation. In some
cases, any logically consistent outcome can arise from a suitably chosen path. In
short: history matters.
Several problems remain open. For example, let P be the set of all paths on K.
For any μ ∈ Δ∗ (X ), sequential majority rule defines a function F : P−→X . Let ν
be the uniform probability distribution on P; what is the distribution of F (ν)? If
F (ν) is almost-uniformly distributed on X , this represents an especially acute form
of full indeterminacy. On the other hand, if F (ν) is mostly concentrated on one or
a few views, then this perhaps recommends these views as superior social choices.
We have focused on sequential majority vote, because the majority view has
several significant properties [May52, DL10]. However, we could obtain greater
path-independence by allowing some propositions to remain undecided (e.g. by
using supermajoritarian voting in some coordinates), sacrificing anonymity (e.g.
by using weighted voting rules) or both (e.g. by using a system with vetoes or
oligarchies). In particular, if we use a system of voting rules satisfying the intersec-
tion property, then the outcome is guaranteed to be logically consistent, and hence,
path-independent [NP07, Proposition 3.4]. For example, for all k ∈ K, let Nk be
the size of the largest critical fragment containing k, and let $q_k := \max\{\tfrac{1}{2},\, 1 - \tfrac{1}{N_k}\}$.
Suppose we decide the truth value of k via qk -supermajoritarian voting for each
k ∈ K; then the outcome will be path-independent [NP07, Fact 3.4].
Is there an optimal tradeoff between decisiveness, neutrality, anonymity, and
path-independence? One possibility: simple majorities could make ‘provisional’
rulings on the truth of certain propositions, but these rulings would only be treated
as ‘precedents’ (i.e. binding on later decisions) if they exceeded some supermajority
threshold —otherwise they could be overturned by a later, larger supermajority.
Appendix: Proofs
Proofs from Section 3.
Proof of Proposition 3.3. Let x ∈ X be the element described in hypothesis (b)
of Proposition 3.3. Let δx ∈ Δ∗ (X ) be the profile which assigns mass 1 to x, and
HOW INDETERMINATE IS SEQUENTIAL MAJORITY VOTING? 73
let
$$\mu := \frac{1}{|\Gamma_X|} \sum_{\gamma \in \Gamma_X} \delta_{\gamma(x)}.$$
It is easy to verify that Maj(μ) = z.
Claim 1: z ∈ Crit(X ).
Proof: Let n ∈ [1 . . . N ]. The fragment zKn is forbidden, by hypothesis (a).
So it contains a critical fragment, say wn . We must have |wn | ≥ 3 (because
Maj(μ) = z, but a critical fragment of size two cannot receive majority support
in both coordinates). For all γ ∈ ΓX , the fragment γ(wn ) is also critical
(because γ(X ) = X ), and is also a fragment of z (since γ(z) = z). But Kn is
a ΓX -orbit, so the family {γ(wn ); γ ∈ ΓX } covers all of zKn .
This argument works for each n ∈ [1 . . . N ]. We conclude that z is covered
by critical fragments, as desired. ◇ Claim 1
At this point, Theorem 3.1(a) implies that μ (and thus, X ) is fully indeterminate.
□
The proof of Proposition 3.7 requires some background about convexity structures.
Let C be a convexity structure on K. For any subset J ⊆ K, we define its convex
hull:
$$\mathrm{conv}(J) := \bigcap \{C \in \mathfrak{C} \,;\, J \subseteq C\}.$$
Note that I ⊆ J implies conv(I) ⊆ conv(J ). Thus,
(A1) conv(I) ⊆ conv(J ).
IJ
[Figure with panels (a)–(d) omitted.]
globally indeterminate if and only if this graph is connected, but not complete.18
Proof: See Proposition 3.4(c) of [NPP14]. 2
18 Recall that a graph is complete if every node is connected to every other node by an edge,
Proof of Proposition 3.13. For all d ∈ {1, 2, 3}, let $x^d$ be the element of $X^{\Delta}_{M,D}$
which allocates all M dollars towards claimant d. (Thus, for all m ∈ [1...M ], we
have $x^d_{d,m} = 1$, while $x^d_{c,m} = 0$ for all c ∈ [1...D] \ {d}.) Define $\mu \in \Delta^*_3(X^{\Delta}_{M,D})$ by
$\mu[x^d] = \tfrac{1}{3}$ for d = 1, 2, 3. Then Maj(μ) = 0, so $m_d = 0$ for all d ∈ [1 . . . D], so
so no element $x \in \{0,1\}^K$ different from 0 and 1 can be critical for $X^{\mathrm{com}}_{I,J;K}$. □
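Proof: For any $x \in \{0,1\}^K$, define $x' := (\neg x_k)_{k \in K}$. For any $\mu \in \Delta(\{0,1\}^K)$,
define $\mu'(x) := \mu(x')$ for all $x \in \{0,1\}^K$; then clearly $\mathrm{Maj}(\mu') = \mathrm{Maj}(\mu)'$. In
particular, Maj(μ) = 0 if and only if Maj(μ′) = 1. It is easy to check that
$X^{\mathrm{com}}_{I',J';K} := \{x' \,;\, x \in X^{\mathrm{com}}_{I,J;K}\}$; thus, $\Delta^*_N(X^{\mathrm{com}}_{I',J';K}) := \{\mu' \,;\, \mu \in \Delta^*_N(X^{\mathrm{com}}_{I,J;K})\}$.
$$\begin{aligned}
\eta(X^{\mathrm{com}}_{I,J;K}) &= \min\{N \in \mathbb{N} \,;\, \exists\, \mu \in \Delta^*_N(X^{\mathrm{com}}_{I,J;K}) \text{ with } \mathrm{Maj}(\mu) = 0 \text{ or } 1\} \\
&= \min\{N \in \mathbb{N} \,;\, \exists\, \mu \in \Delta^*_N(X^{\mathrm{com}}_{I,J;K}) \text{ with } \mathrm{Maj}(\mu') = 1 \text{ or } 0\} \\
&= \min\{N \in \mathbb{N} \,;\, \exists\, \mu' \in \Delta^*_N(X^{\mathrm{com}}_{I',J';K}) \text{ with } \mathrm{Maj}(\mu') = 1 \text{ or } 0\} \\
&= \eta(X^{\mathrm{com}}_{I',J';K}),
\end{aligned}$$
as claimed. ◇ Claim 1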
Now, $K - 2I' = 2J - K$ and $2J' - K = K - 2I$. Thus,
$\min\{\frac{K}{K-2I'}, \frac{K}{2J'-K}\} = \min\{\frac{K}{K-2I}, \frac{K}{2J-K}\}$. This, together with Claim 1, means that the inequality
(3.4) holds for $X^{\mathrm{com}}_{I,J;K}$, J and K if and only if it holds for $X^{\mathrm{com}}_{I',J';K}$, J′ and K;
thus, we can substitute one problem for the other. Furthermore, if K − 2I <
2J − K, then K − 2I′ ≥ 2J′ − K. Thus, by exchanging $X^{\mathrm{com}}_{I',J';K}$ for $X^{\mathrm{com}}_{I,J;K}$ if
The proof of (b)[ii] is similar; simply reverse the roles of ‘0’ and ‘1’ in the proof
of (b)[i]. □
Proof of Proposition 4.3. (a) follows immediately from Lemma 2.3(b), and (b)
follows from (a). □
Proof of Proposition 4.5. The space {1k }k∈K ∪ {1} is a convexity space; it
is McGarvey by Lemma A.2. Moreover, any superset of a McGarvey space is
clearly also McGarvey; thus, X is McGarvey. It remains to show that X does
not admit a panopticon. By definition, a panopticon must lie outside X . But 0
cannot be a panopticon for X because any $1_k$ is between 0 and 1. Thus consider
x ∉ X with x ≠ 0. By assumption, x must have at least two ones, say $x_\ell = 1$
and $x_m = 1$ with ℓ ≠ m, and at least one zero, say $x_h = 0$ for h ∉ {ℓ, m}. But
then, the element $1_\ell \in X$ is between the elements x and $1_h \in X$, i.e. x is not a
panopticon. □
At least two of these taxa intersect J . (by contradiction) If none of the minimal
taxa in C intersects J , then C ∩ J = ∅, contradicting Claim 2. Suppose only one
of the minimal taxa inside C intersects J —call it D. Then C ∩ J = D ∩ J , so
containment (A3.1) is weakly satisfied. Meanwhile, D ∩J = (C ∩J )(C\D)
(C ∩J ), so containment (A3.2) is strictly satisfied. Thus, 1D is strictly between
1C and 1J . Contradiction.
At least two of these taxa intersect J . (by contradiction) If none of the
minimal taxa in C intersect J , then C ∩ J = ∅, contradicting Claim 2. Suppose
only one of the minimal taxa in C intersects J —call it D. Then C ∩ J = (D ∩
J ) (C \ D) (D ∩ J ), so the reverse of containment (A3.1) is strictly
satisfied.
Meanwhile, D = C (C \D), so D ∩J = (C ∩J ) (C \ D) ∩ J = C ∩J ,
because (C \ D) ∩ J = ∅. Thus, the reverse of containment (A3.2) is weakly
satisfied. Thus, 1C is strictly between 1D and 1J . Contradiction.
“⇐=” Suppose J satisfies the conditions (a) and (b). To show that 1J is
a panopticon for XT , we must show, for any taxon C ∈ T, that 1C 1J . Let
D ∈ T; we will show that 1D cannot be strictly between 1C and 1J . Since T is
a taxonomic hierarchy, there are only three cases: either D ⊂ C, or C ⊂ D, or D
is disjoint from C.
Case 1. Suppose D ⊂ C. Then C must be non-minimal, so condition (b) says
that C contains some other taxon B besides D, such that B ∩ J
= ∅. This means
that C ∩ J D ∩ J . Thus, containment (A3.1) is falsified.
Case 2. Suppose C ⊂ D. Then D must be non-minimal, so condition (b) says
that D contains some other taxon B besides C, such that B ∩ J
= ∅. This means
that D ∩ J C ∩ J , which means D ∩ J C ∩ J . Thus, containment
(A3.2) is falsified.
Case 3. Suppose D is disjoint from C. Then D ∩ J is disjoint from C ∩ J , so
if C ∩ J
= ∅, then containment (A3.1) is falsified.
If C ∩ J = ∅, then condition (b) implies that C must be minimal. But then
condition (a) implies that every taxon in T intersects J . In particular, D ∩J
=
∅. But D = D ∩ C (because D ⊂ C , because C and D are disjoint). Thus, we
have D ∩ C ∩ J
= ∅. Thus, D
⊇ J ∩ C , so containment (A3.2) is falsified.
In any case, one of the conditions in (A3) fails, so 1D cannot be strictly between
1C and 1J . This holds for all D ∈ T. Thus, 1C 1J .
□
(a) and (b) of Lemma A.8. By construction, every minimal taxon intersects J ;
hence the first alternative of (a) is satisfied.
To see (b), suppose C is non-minimal. By hypothesis, C contains at least two
non-singleton minimal taxa, each of which contains exactly one point in J —thus
each of them intersects both J and J .
Thus Lemma A.8 says that 1J is a panopticon.
(b) We will show that Maj(XT ) contains no panopticon points. Let J ⊆ K.
If 1J ∈ Maj(XT ), then J is a union of taxa of T [NP11, Proposition 6.2(a)].
Every taxon in T is a disjoint union of minimal taxa. Thus, J is a disjoint union
of minimal taxa of T. Thus, if C ∈ T is a minimal taxon, then we have:
(A4) $C \cap J \neq \emptyset \implies C \subseteq J$ and $C \cap J^{\complement} \neq \emptyset \implies C \subseteq J^{\complement}$.
Finally, for any $X \subseteq \{0,1\}^K$, define $Z(X) := \bigcap_{x \neq y \in X} Z(x, y) \subseteq \{0,1\}^K$.
z ∈ Z(X ) ⇐⇒ z ∈
(c) ⇒ Iz (x)
≤ Iz (y), ∀ x, y ∈ X
Z(x, y), ∀ x, y ∈ X ⇐(†)
⇐⇒ Iz (X ) is an antichain . Here, (†) is by part (a). 2
where the last step is because we assume $N < (4/3)^{K/2}$. As K→∞, the probability
of strict inequality increases to 1, and in this case, Proposition 5.4(b) implies that
X is a generalized antichain. □
Proofs from Section 6. We will use the following simple but elegant result
of [Sze43]. For completeness, we include the proof.
Lemma A.10. The expected number of directed Hamiltonian paths which exist
in a randomly generated tournament (where all edges are independent random
variables with both orientations having equal probability) is given by $N!/2^{N-1}$.
Proof: There are N ! directed Hamiltonian chains through N . For any such
chain, and any random tournament, there is a probability $1/2^{N-1}$ that the chain
can be embedded in the tournament (because each of the N − 1 edges of the
chain has probability 1/2 of being compatible with the corresponding edge in the
tournament, and these N − 1 events are independent). By linearity of expectation,
the expected number of embedded chains is $N!/2^{N-1}$. □
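The count in Lemma A.10 can be checked by brute force for small N. The sketch below is only an illustration under the lemma's stated assumptions (every orientation of every edge equally likely, so all tournaments are equiprobable); the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations, product
from math import factorial


def expected_hamiltonian_paths(n):
    """Average number of directed Hamiltonian paths over all 2^(n choose 2) tournaments on n nodes."""
    pairs = list(combinations(range(n), 2))
    total = 0
    for orientation in product((0, 1), repeat=len(pairs)):
        # beats[(a, b)] is True when the edge between a and b points from a to b.
        beats = {}
        for (a, b), bit in zip(pairs, orientation):
            beats[(a, b)] = bool(bit)
            beats[(b, a)] = not bit
        # Count the orderings of the nodes whose consecutive edges all point forward.
        total += sum(
            all(beats[(order[i], order[i + 1])] for i in range(n - 1))
            for order in permutations(range(n))
        )
    return Fraction(total, 2 ** len(pairs))


for n in range(2, 6):
    print(n, expected_hamiltonian_paths(n), Fraction(factorial(n), 2 ** (n - 1)))
```

Exhaustive enumeration is feasible only for very small N, since the number of tournaments grows as 2^(N(N−1)/2); the point is merely to see the two columns agree.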
Maj(μ) = x; then Proposition 2.6(a) says that $|\mathrm{Cond}(X^{\mathrm{pr}}_N, \mu)| \geq \frac{N!}{2^{N-1}}$. Thus,
Here (∗) is because Stirling’s formula says that $N! \approx \sqrt{2\pi N}\,(N/e)^N$.
(b) Fix c ∈ (0, 1). Let L := cN and let R := N − L. Let x ∈ {0, 1}K
represent the complete bipartite graph which has L ‘left’ vertices and R ‘right’
vertices, where every left vertex is linked to every right vertex (but there are no
links between any two left vertices or any two right vertices). The space $X^{\mathrm{eq}}_N$
is McGarvey [NP11, Example 3.9(a)], so there exists some $\mu \in \Delta^*(X^{\mathrm{eq}}_N)$ such
y(r,r ) .
(5) Next, fix some $r_0 \in R_M$. For each ℓ ∈ [M . . . L], the path ζ visits
the coordinate $(r_0, v_\ell)$. Again, the majority prevails, so
$F^{\zeta}_{(r_0, v_\ell)}(\mu) = x_{(r_0, v_\ell)} = 1 = y_{(r_0, v_\ell)}$ (encoding the equivalence $r_0 \sim v_\ell$).
Claim 3: $|X^{\mathrm{eq}}_N| \leq N^N$.
J
is McGarvey by Lemma A.2. For any j ≤ k ∈ K, we have 1 [j...k]
1 if and
86 K. NEHRING AND M. PIVATO
Thus, $X^{\mathrm{line}}_K$ is asymptotically fully indeterminate.
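(b) Let D ∈ N, let $K = 2^D$, and let $\phi : K \longrightarrow \{0,1\}^D$ be some bijection. Let
$\mathfrak{C}^{\square}_D$ be the hypercube convexity on $\{0,1\}^D$, and define $\Phi : \mathcal{P}(\{0,1\}^D) \longrightarrow \{0,1\}^K$
by $\Phi(C) := 1_{\phi^{-1}(C)}$ for all $C \subseteq \{0,1\}^D$. Then define $X^{\square}_D := \Phi[\mathfrak{C}^{\square}_D] \subset \{0,1\}^K$.
Finally, let $E := \{x \in \{0,1\}^D \,;\, x \text{ is even}\}$, and define $e := \Phi[E] \in \{0,1\}^K$.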
bijective correspondence between these subcubes and the set of {0, 1}-valued
functions whose domain is any subset of [1...D]. Any such function can be
represented in a unique way by an element of {0, 1, ∗}D in the obvious way.
Thus, $|X^{\square}_D| = |\{0,1,*\}^D| = 3^D$. ◇ Claim 2
Now let $Y := \{x \in X^{\square}_D \,;\, x$ represents a subcube of cardinality 1 or 2 in $\{0,1\}^D\}$.
Claim 3: $|Y| = (1 + \tfrac{D}{2})\, 2^D$.
Proof: Clearly, $\{0,1\}^D$ has exactly $2^D$ subcubes of cardinality 1 (i.e. vertices).
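Both counts can be confirmed by direct enumeration. The sketch below encodes each nonempty subcube as a pattern in {0,1,∗}^D, as in the proof of Claim 2, and counts the patterns with at most one free coordinate; the function names are hypothetical.

```python
from itertools import product


def subcube_patterns(d):
    """All patterns in {0,1,*}^d; each names a nonempty subcube of {0,1}^d ('*' = free coordinate)."""
    return list(product("01*", repeat=d))


def count_small(d):
    """Number of subcubes with 1 or 2 vertices, i.e. patterns with at most one '*'."""
    return sum(1 for p in subcube_patterns(d) if p.count("*") <= 1)


for d in range(1, 7):
    total = len(subcube_patterns(d))
    # (1 + d/2) * 2^d is written as (2 + d) * 2^(d - 1) to stay in integer arithmetic.
    print(d, total, 3 ** d, count_small(d), (2 + d) * 2 ** (d - 1))
```

The subcubes of cardinality 2 are the edges of the hypercube, of which there are D · 2^(D−1); adding the 2^D vertices gives the formula in Claim 3.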
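Now, $e \in \mathrm{Maj}(X^{\square}_D)$ because $X^{\square}_D$ is McGarvey by Lemma A.2. Thus
$$h(X^{\square}_D) \;\geq\; \frac{\log_2 |X^{\square}_D(e)|}{\log_2 |X^{\square}_D|} \;\underset{(*)}{\geq}\; \frac{D \cdot \log_2(3) - 1}{\log_2 |3^D|} \;=\; \frac{D \cdot \log_2(3) - 1}{D \cdot \log_2(3)} \;\xrightarrow[D \to \infty]{}\; 1,$$
where (∗) is by eqn.(A5) and Claim 2. Thus, $X^{\square}_D$ is asymptotically fully indeterminate. □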
References
[BH83] Hans-J. Bandelt and Jarmila Hedlı́ková, Median algebras, Discrete Math. 45 (1983),
no. 1, 1–30, DOI 10.1016/0012-365X(83)90173-5. MR700848 (84h:06015)
[BJ91] Jean-Pierre Barthélémy and Melvin F. Janowitz. A formal theory of consensus. SIAM
Journal of Discrete Mathematics, 4(3):305–322, 1991.
[Bla48] Duncan S. Black. On the rationale of group decision-making. Journal of Political Econ-
omy, 56:23–34, 1948.
[BM81] Jean-Pierre Barthélémy and Bernard Monjardet, The median procedure in cluster anal-
ysis and social choice theory, Math. Social Sci. 1 (1980/81), no. 3, 235–267, DOI
10.1016/0165-4896(81)90041-X. MR616379 (83c:92087)
[BM88] Jean-Pierre Barthélémy and Bernard Monjardet. The median procedure in data analysis:
new results and open problems. In Classification and related methods of data analysis
(Aachen, 1987), pages 309–316. North-Holland, Amsterdam, 1988.
[Con85] Marquis de Condorcet. Essai sur l’application de l’analyse à la probabilité des décisions
rendues à la pluralité des voix. Paris, 1785.
[CT91] Thomas M. Cover and Joy A. Thomas, Elements of information theory, Wiley Series
in Telecommunications, John Wiley & Sons Inc., New York, 1991. A Wiley-Interscience
Publication. MR1122806 (92g:94001)
[DH10] Elad Dokow and Ron Holzman, Aggregation of binary evaluations, J. Econom. Theory
145 (2010), no. 2, 495–511, DOI 10.1016/j.jet.2007.10.004. MR2886978
[DL07] Franz Dietrich and Christian List, Arrow’s theorem in judgment aggregation, Soc.
Choice Welf. 29 (2007), no. 1, 19–33, DOI 10.1007/s00355-006-0196-x. MR2318335
(2009d:91049)
[DL10] Franz Dietrich and Christian List, Majority voting on restricted domains, J. Econom.
Theory 145 (2010), no. 2, 512–543, DOI 10.1016/j.jet.2010.01.003. MR2886979
[Eng97] Konrad Engel, Sperner theory, Encyclopedia of Mathematics and its Applications,
vol. 65, Cambridge University Press, Cambridge, 1997. MR1429390 (98m:05187)
[FR86] Peter C. Fishburn and Ariel Rubinstein, Aggregation of equivalence relations, J. Classi-
fication 3 (1986), no. 1, 61–65, DOI 10.1007/BF01896812. MR845458 (87h:90008)
[GK78] Curtis Greene and Daniel J. Kleitman, Proof techniques in the theory of finite sets,
Studies in combinatorics, MAA Stud. Math., vol. 17, Math. Assoc. America, Washington,
D.C., 1978, pp. 22–79. MR513002 (80a:05006)
[Gui52] George-Théodule Guilbaud. Les théories de l’intérêt général et le problème logique de
l’agrégation. Econ. Appl., 5(4):501–584, 1952. Complete English translation in: Electronic
Journal of History Probab. Statist. 4 (2008).
[HLB96] Guillaume Hollard and Michel Le Breton, Logrolling and a McGarvey theorem
for separable tournaments, Soc. Choice Welf. 13 (1996), no. 4, 451–455, DOI
10.1007/s003550050043. MR1408184
[KS86] Lewis Kornhauser and Lawrence Sager. Unpacking the court. Yale Law Journal, 96:82–
117, 1986.
[Lis04] Christian List. A model of path-dependence in decisions over multiple propositions.
American Political Science Review, 98(3):495–513, 2004.
[LNP10] Tobias Lindner, Klaus Nehring, and Clemens Puppe. Allocating public goods via the
midpoint rule. (preprint), 2010.
[LP02] Christian List and Philip Pettit. Aggregating sets of judgements: An impossibility result.
Economics and Philosophy, 18:89–110, 2002.
[LP09] The handbook of rational and social choice, Oxford University Press, Oxford, 2009.
An overview of new foundations and applications; Edited by Paul Anand, Prasanta K.
Pattanaik and Clemens Puppe. MR2599282 (2011f:91001)
[LP10] Christian List and Ben Polak, Introduction to judgment aggregation, J. Econom. Theory
145 (2010), no. 2, 441–466, DOI 10.1016/j.jet.2010.02.001. MR2886976
[May52] Kenneth O. May, A set of independent necessary and sufficient conditions for simple
majority decisions, Econometrica 20 (1952), 680–684. MR0050857 (14,392c)
[McG53] David C. McGarvey, A theorem on the construction of voting paradoxes, Econometrica
21 (1953), 608–610. MR0062415 (15,976a)
[Mil77] Nicholas R. Miller. Graph-theoretical approaches to the theory of voting. American
Journal of Political Science, 21:769–803, 1977.
[Mon12] Philippe Mongin, The doctrinal paradox, the discursive dilemma, and logical aggregation
theory, Theory and Decision 73 (2012), no. 3, 315–355, DOI 10.1007/s11238-012-9310-y.
MR2964824
[NP07] Klaus Nehring and Clemens Puppe, The structure of strategy-proof social choice—Part
I: General characterization and possibility results on median spaces, J. Econom. Theory
135 (2007), no. 1, 269–305, DOI 10.1016/j.jet.2006.04.008. MR2888380
[NP10] Klaus Nehring and Clemens Puppe, Abstract Arrowian aggregation, J. Econom. Theory
145 (2010), no. 2, 467–494, DOI 10.1016/j.jet.2010.01.010. MR2886977
[NP11] Klaus Nehring and Marcus Pivato, Incoherent majorities: the McGarvey problem in
judgement aggregation, Discrete Appl. Math. 159 (2011), no. 15, 1488–1507, DOI
10.1016/j.dam.2011.03.014. MR2823907
[NPP13] Klaus Nehring, Marcus Pivato, and Clemens Puppe. Unanimity overruled: Majority
voting and the burden of history. (preprint), 2013.
[NPP14] Klaus Nehring, Marcus Pivato, and Clemens Puppe. The Condorcet set: Majority voting
over interconnected propositions. Journal of Economic Theory (to appear), 2014.
[She09] Saharon Shelah, What majority decisions are possible, Discrete Math. 309 (2009), no. 8,
2349–2364, DOI 10.1016/j.disc.2008.05.010. MR2510361 (2010h:91078)
[Ste59] Richard Stearns, The voting problem, Amer. Math. Monthly 66 (1959), 761–763.
MR0109087 (21 #7799)
[Sze43] Tibor Szele, Kombinatorische Untersuchungen über den gerichteten vollständigen
Graphen (Hungarian, with German summary), Mat. Fiz. Lapok 50 (1943), 223–256.
MR0018404 (8,284i)
[vdV93] Marcel L. J. van de Vel. Theory of convex structures, volume 50 of North-Holland Math-
ematical Library. North-Holland Publishing Co., Amsterdam, 1993.
[Vid99] Laurent Vidu, An extension of a theorem on the aggregation of separable preferences,
Soc. Choice Welf. 16 (1999), no. 1, 159–167, DOI 10.1007/s003550050137. MR1656441
(99k:90053)
Catherine Stenson
Abstract. Many voting systems consist of a set of players who can form
coalitions; a winning coalition is one that can pass a measure. The winning
and losing coalitions can be given by several classes of functions, including
switching functions, threshold functions, weighted voting systems, and simple
games. We define a zonotope Tn related to these functions and describe its
coordinates in terms of a measure of voting power. We also give a voting
interpretation of the coordinates of the derived zonotope T̄n .
1. Introduction
In many different contexts, some collection of players (elected representatives,
company shareholders, states in the U.S. Electoral College, or neurons) may form a
coalition to do something (pass a bill, buy another company, choose a president, or
cause another neuron to fire). Certain coalitions of players are capable of carrying
out this action, and others are not. These coalitions can be described by different types
of functions, including weighted voting systems, simple games, switching functions,
and threshold functions. Here we describe the relationship between these functions
and a geometric object known as a zonotope.
The paper is structured as follows. In this section, we define different types of
voting functions. In Section 2, we describe a hyperplane arrangement An that has
been used in previous research on threshold functions. In this work, we compute
the coordinates of the vertices of the zonotope Tn that is dual to An . We show
that they are determined by the critical instances used to define the Banzhaf power
index. We also explain why the points of Tn that correspond to switching functions
that are not threshold functions are not vertices of Tn . Given any zonotope, Mc-
Mullen [8] showed how to construct its derived zonotope; in Section 3 we apply this
construction to Tn to get the derived zonotope T̄n . We compute the coordinates of
T̄n in two different bases and show that they also have a nice voting interpretation.
We begin with some standard definitions. Our set of players is P = {P1 , P2 , . . . ,
Pn }. A coalition C is a subset of P . Each coalition is either winning, meaning it
can carry out an action such as passing a bill, or losing, meaning it cannot. A
coalition C has a characteristic vector xC ∈ {0, 1}n with xi = 1 if Pi is in C and
xi = 0 if it is not. We can think of xC as a vertex of the n-cube.
If each coalition C is designated as either winning or losing, we have a simple
game provided that the set of winning coalitions satisfies monotonicity. That is, if
2014
c American Mathematical Society
89
[Figure 1: the hyperplane arrangement A1, formed by the lines v0 = 0 and v1 − v0 = 0, with regions I–IV.]
The second geometric object we want to study is the zonotope Tn that is dual
to the hyperplane arrangement An . We begin with some preliminary definitions.
Let X and Y be sets of points in Rd . The Minkowski sum of X and Y is X + Y =
{x + y : x ∈ X and y ∈ Y }. A zonotope is the Minkowski sum of line segments;
it is a special type of polytope. We use [a, b] to denote the line segment between
points a and b.
If a hyperplane arrangement has hyperplanes Hi given by zi · x = 0, 1 ≤ i ≤ m,
the dual zonotope is the polytope Z = [−z1 , z1 ] + [−z2 , z2 ] + · · · + [−zm , zm ]. In
particular, we define
$$T_n = \sum_{C \subseteq P} [-z_C, z_C]$$
to be the zonotope dual to the threshold function arrangement An .
In general, the regions of a hyperplane arrangement are in one-to-one corre-
spondence with the vertices of the dual zonotope. (See, for example, [16, Sec. 6B]
or [17, Sec. 7.3].) In particular, let εi = 1 if a region R is on the positive side of Hi
and let εi = −1 if R is on the negative side of Hi . Then the region R corresponds to
the vertex ε1 z1 +ε2 z2 +· · ·+εm zm of Z. This correspondence between d-dimensional
regions and 0-dimensional faces is part of the duality between the arrangement and
the zonotope. In the case of An , a region R is on the positive (negative) side of HC
when the coalition C is a winning (losing) coalition for the corresponding threshold
function f . The corresponding vertex of Tn is $\sum_C z_C - \sum_B z_B$, where the first sum
is over all winning coalitions C and the second sum is over all losing coalitions B.
Thus, by Proposition 2.1, we have the following corollary:
Corollary 2.3. The vertices of Tn are in one-to-one correspondence with
threshold functions of n variables.
WEIGHTED VOTING, THRESHOLD FUNCTIONS, AND ZONOTOPES 93
[Figure 2: the zonotope T1.]
Example 2.4. Figure 2 shows the zonotope T1 for threshold functions of one
variable, which is a parallelogram. We have z∅ = (−1, 0) and z{P1 } = (−1, 1).
The line segments [−z∅ , z∅ ] and [−z{P1 } , z{P1 } ] are shown in grey. The vertex
(0, 1) = −(−1, 0) + (−1, 1) corresponds to the threshold functions in region I of A1
in Figure 1. Similarly, the vertices (2, −1), (0, −1), and (−2, 1) correspond to the
threshold functions in regions II, III, and IV, respectively.
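The region-to-vertex correspondence can be traced numerically for T1. The sketch below takes the two generators from Example 2.4, samples points x on a small integer grid, reads off the sign vector (ε1, ε2) of each sample, and records the vertex ε1 z1 + ε2 z2. The grid sampling and the function names are assumptions made for illustration; a grid this coarse is only guaranteed to reach every region in tiny examples like this one.

```python
from itertools import product

# Generators from Example 2.4: z_emptyset = (-1, 0) and z_{P1} = (-1, 1).
GENERATORS = [(-1, 0), (-1, 1)]


def sign(t):
    """Return +1, 0 or -1 according to the sign of t."""
    return (t > 0) - (t < 0)


def zonotope_vertices(generators, grid=range(-3, 4)):
    """Collect the vertex sum(eps_i * z_i) for every region hit by a sample point x."""
    dim = len(generators[0])
    vertices = set()
    for x in product(grid, repeat=dim):
        eps = [sign(sum(zi * xi for zi, xi in zip(z, x))) for z in generators]
        if 0 in eps:  # x lies on a hyperplane, so it is not inside any region
            continue
        vertices.add(tuple(sum(e * z[k] for e, z in zip(eps, generators)) for k in range(dim)))
    return vertices


print(sorted(zonotope_vertices(GENERATORS)))
```

The four points found are (0, 1), (2, −1), (0, −1), and (−2, 1), the vertices of the parallelogram listed in Example 2.4.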
Example 2.5. The function f defined in Example 1.1 has yf =(0, 6, 2, 2, 2).
This looks strikingly similar to our previous computation of critical instances for
this threshold function.
Proof. We have
$$y_i + \frac{y_0}{2} = \sum_{\substack{C \\ P_i \in C}} h_f(C) - \frac{1}{2} \sum_{C} h_f(C).$$
In the special case that the number of winning coalitions of f is equal to the
number of losing coalitions of f , as in Example 1.1, y0 = 0 and yi = γi for 1 ≤ i ≤ n.
In particular, this happens with functions for which the complement of a winning
coalition is always a losing coalition and vice versa. For threshold functions with
$q > \sum_i v_i/2$, it is possible to have a pair of complementary coalitions that both
lose, and y0 /2 counts these pairs. Similarly, if $q \leq \sum_i v_i/2$, it is possible to have a
pair of complementary coalitions that both win, and |y0 /2| counts these pairs.
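The relation between the coordinates of y_f and the critical instances can be checked on a small system. The sketch below uses a hypothetical weighted voting system [3; 2, 1, 1, 1] and assumes three conventions that this part of the text suggests but does not restate: a coalition wins when its weight is at least the quota, z_C = (−1, x_C) as in Example 2.4, and γ_i counts the winning coalitions that become losing when P_i leaves. This system happens to give the vector quoted in Example 2.5, although the text here does not confirm that it is the system of Example 1.1.

```python
from itertools import combinations

# Hypothetical weighted voting system [q; v_1, ..., v_n], chosen only for illustration.
QUOTA, WEIGHTS = 3, (2, 1, 1, 1)
N = len(WEIGHTS)
COALITIONS = [frozenset(c) for r in range(N + 1) for c in combinations(range(N), r)]


def h(coalition):
    """+1 for a winning coalition (weight >= quota), -1 for a losing one."""
    return 1 if sum(WEIGHTS[i] for i in coalition) >= QUOTA else -1


def y_vector():
    """y_f = sum over C of h_f(C) * z_C, assuming z_C = (-1, x_C)."""
    y0 = -sum(h(c) for c in COALITIONS)
    return (y0,) + tuple(sum(h(c) for c in COALITIONS if i in c) for i in range(N))


def critical_counts():
    """gamma_i: number of winning coalitions that become losing when player i leaves."""
    return tuple(
        sum(1 for c in COALITIONS if i in c and h(c) == 1 and h(c - {i}) == -1)
        for i in range(N)
    )


y = y_vector()
gamma = critical_counts()
print(y, gamma)
# With as many winning as losing coalitions, y_0 = 0 and y_i equals gamma_i.
print(all(2 * y[i + 1] + y[0] == 2 * gamma[i] for i in range(N)))
```

Comparing 2·y_i + y_0 with 2·γ_i keeps the check in integer arithmetic, which avoids halving an odd number when y_0 is nonzero in other systems.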
Now we return to considering switching functions that are not also threshold
functions; for brevity, we will call them non-threshold switching functions. We have
seen in Corollary 2.3 that threshold functions correspond to vertices of Tn , and we
are about to see that non-threshold switching functions correspond to points of Tn
that are not vertices. We give two proofs of the following corollary of Proposition
2.1, one that is short and standard and one that is new and gives additional insight
into the geometry.
Corollary 2.7. If g is a switching function that is not a threshold function,
then yg is not a vertex of Tn .
Proof 1. This corollary is an immediate consequence of the following standard
result about zonotopes: If w = ε1 z1 + ε2 z2 + · · · + εm zm , where εi = ±1, then w
is a vertex of Z if and only if there is some region of the corresponding hyperplane
arrangement that is on the positive side of Hi when εi = 1 and on the negative side
of Hi when εi = −1. (A more general version of this statement is [2, Proposition
2.2.2].) In the case of Tn , this result plus Proposition 2.1 proves the corollary.
Corollary 2.7 raises some questions not addressed by this first proof. Because
Tn is a zonotope, if yg is a point of Tn but not a vertex of Tn , we should be able
to write yg as a sum of points from the line segments [−zC , zC ] so that for at least
one C, the point is from the interior of the line segment. How do we do this? In
addition, what property of non-threshold switching functions guarantees that they
do not correspond to vertices of Tn ? The goal of the following proof is to answer
these questions.
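$$\sum_{i=1}^{p} \lambda_i x_{A_i} = \sum_{j=1}^{q} \mu_j x_{B_j}$$
where
$$\sum_{i=1}^{p} \lambda_i = 1 = \sum_{j=1}^{q} \mu_j \quad\text{and}\quad \lambda_i > 0 \text{ and } \mu_j > 0 \text{ for all } i \text{ and } j.$$
$$\sum_{i=1}^{p} \lambda_i z_{A_i} = \sum_{j=1}^{q} \mu_j z_{B_j}.$$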
Thus we can write the point yg in Tn corresponding to g as
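$$\begin{aligned}
y_g &= \sum_{i=1}^{p} z_{A_i} + \sum_{j=1}^{q} (-z_{B_j}) + \sum_{C \neq A_i, B_j} h_g(C)\, z_C \\
&= \sum_{i=1}^{p} (\lambda_i + (1 - \lambda_i))\, z_{A_i} + \sum_{j=1}^{q} (-z_{B_j}) + \sum_{C \neq A_i, B_j} h_g(C)\, z_C \\
&= \sum_{i=1}^{p} (1 - \lambda_i)\, z_{A_i} + \sum_{j=1}^{q} (-1 + \mu_j)\, z_{B_j} + \sum_{C \neq A_i, B_j} h_g(C)\, z_C .
\end{aligned}$$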
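A two-player example makes the rewriting concrete. The sketch below uses the exclusive-or function, a hypothetical choice that is not taken from the paper: {P1} and {P2} win, while ∅ and {P1, P2} lose, and the convex hulls of the winning and losing vertices meet at (1/2, 1/2). It assumes z_C = (−1, x_C), following Example 2.4, and checks that y_g can be written with every coefficient strictly between −1 and 1.

```python
from fractions import Fraction

# Hypothetical two-player switching function g (exclusive-or): exactly one player says yes.
WINNING = [(1, 0), (0, 1)]   # characteristic vectors of A_1 = {P1}, A_2 = {P2}
LOSING = [(0, 0), (1, 1)]    # characteristic vectors of B_1 = {}, B_2 = {P1, P2}
HALF = Fraction(1, 2)


def z(x):
    """z_C = (-1, x_C), the convention suggested by Example 2.4."""
    return (-1,) + tuple(x)


def combo(coeffs, points):
    """Linear combination sum(c * p) of equal-length vectors."""
    return tuple(sum(c * p[k] for c, p in zip(coeffs, points)) for k in range(len(points[0])))


# The convex hulls of the winning and losing vertices meet at (1/2, 1/2).
assert combo([HALF, HALF], WINNING) == combo([HALF, HALF], LOSING)

points = [z(x) for x in WINNING + LOSING]
y_g = combo([1, 1, -1, -1], points)                       # the defining +1/-1 combination
y_g_interior = combo([HALF, HALF, -HALF, -HALF], points)  # coefficients 1 - lambda_i and -1 + mu_j
print(y_g, y_g_interior)
```

Both combinations give the same point, so y_g is a sum of points taken from the interiors of the four segments [−z_C, z_C], which is the situation the second proof describes.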
$$(3.1)\qquad a_{B,C} = \begin{cases} -(k-1) & \text{if } C = \emptyset \\ 1 & \text{if } C = \{P_i\} \text{ and } P_i \in B \\ -1 & \text{if } C = B \\ 0 & \text{otherwise.} \end{cases}$$
The coordinates of z̄C are then aB,C , where B runs over all coalitions of P with
at least two members.
Proposition 3.4. We use the basis for the space of linear dependences of the
points zC as given in (3.1). Let f be a switching function. If ∅ and all the one-
player coalitions {Pi } are losing coalitions for f , then the coordinate of the point
ȳf in T̄ n corresponding to a coalition B is 0 if B is a losing coalition for f and −2
if B is a winning coalition for f .
Proof. The coordinate of the point ȳf in T̄ n corresponding to a coalition B
with |B| = k ≥ 2 is
$$-(k-1)\,h_f(\emptyset) + (-1)\,h_f(B) + \sum_{P_i \in B} h_f(\{P_i\}).$$
If ∅ and all the {Pi } are losing coalitions, this is k − 1 − hf (B) − k = −1 − hf (B),
which is 0 if B is a losing coalition and −2 if B is a winning coalition.
The requirement that ∅ and all the one-player coalitions are losing coalitions is
a reasonable one in the context of some real-world voting systems. See, for example,
[14].
Example 3.5. Let n = 3. The rows of the following matrix are the basis
elements aB given by (3.1), where B is the coalition to the left of the row. The
columns of the matrix are the points z̄C , where C is the coalition at the top of the
column.
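$$\begin{array}{l|rrrrrrrr}
 & \emptyset & \{P_1\} & \{P_2\} & \{P_3\} & \{P_1,P_2\} & \{P_1,P_3\} & \{P_2,P_3\} & P \\ \hline
\{P_1,P_2\} & -1 & 1 & 1 & 0 & -1 & 0 & 0 & 0 \\
\{P_1,P_3\} & -1 & 1 & 0 & 1 & 0 & -1 & 0 & 0 \\
\{P_2,P_3\} & -1 & 0 & 1 & 1 & 0 & 0 & -1 & 0 \\
P & -2 & 1 & 1 & 1 & 0 & 0 & 0 & -1
\end{array}$$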
For example, the first row represents the linear dependence −(−1, 0, 0, 0) +
(−1, 1, 0, 0) + (−1, 0, 1, 0) − (−1, 1, 1, 0) = (0, 0, 0, 0). If f is the simple game in
which a coalition is winning if and only if it contains both P1 and P2 , then ȳf =
(−2, 0, 0, −2). The first and fourth coordinates correspond to the winning coalitions
{P1 , P2 } and P = {P1 , P2 , P3 } and the second and third coordinates correspond to
the losing coalitions {P1 , P3 } and {P2 , P3 }.
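The matrix and the point ȳ_f of Example 3.5 can be regenerated from (3.1). The sketch below assumes, as in the proof of Proposition 3.4, that the coordinate of ȳ_f indexed by B is the sum over C of a_{B,C} h_f(C), with h_f(C) = 1 for winning and −1 for losing coalitions; the function and variable names are hypothetical.

```python
from itertools import combinations

N = 3
PLAYERS = range(1, N + 1)
# Column order: empty set, singletons, pairs, then the grand coalition P.
COALITIONS = [frozenset(c) for r in range(N + 1) for c in combinations(PLAYERS, r)]
ROWS = [b for b in COALITIONS if len(b) >= 2]  # coalitions B indexing the coordinates


def a(b, c):
    """Entry a_{B,C} of basis (3.1)."""
    if not c:
        return -(len(b) - 1)
    if len(c) == 1 and c <= b:
        return 1
    if c == b:
        return -1
    return 0


def h(c):
    """+1 if C is winning (contains both P1 and P2), -1 otherwise."""
    return 1 if {1, 2} <= c else -1


matrix = [[a(b, c) for c in COALITIONS] for b in ROWS]
y_bar = tuple(sum(a(b, c) * h(c) for c in COALITIONS) for b in ROWS)

for row in matrix:
    print(row)
print(y_bar)
```

Each coordinate comes out as 0 for a losing B and −2 for a winning B, as Proposition 3.4 predicts when ∅ and all one-player coalitions lose.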
Basis 2. Say a coalition B has k ≥ 2 members. We define the coordinate of aB
in the position corresponding to a coalition C to be
$$(3.2)\qquad a_{B,C} = \begin{cases} -1 & \text{if } C = \emptyset \\ 1 & \text{if } C \subset B \text{ and } |C| = k - 1 \\ -(k-1) & \text{if } C = B \\ 0 & \text{otherwise.} \end{cases}$$
The coordinates of z̄C are then aB,C , where B runs over all coalitions of P with
at least two members.
Proposition 3.6. We use the basis for the space of linear dependences of the
points zC as given in (3.2). Let f be a switching function. If ∅ is a losing coalition
for f and B is a winning coalition for f , then the coordinate of the point ȳf in T̄ n
corresponding to B is 2 − 2 · (the number of players that are critical to B).
Proof. The coordinate of the point ȳf in T̄ n corresponding to a coalition B
with |B| = k ≥ 2 is
$$-h_f(\emptyset) - (k-1)\,h_f(B) + \sum_{P_i \in B} h_f(B \setminus \{P_i\}).$$
If ∅ is a losing coalition and B is a winning coalition, then this sum is
−k + 2 + (the number of Pi not critical to B) − (the number of Pi critical to B)
= −k + 2 + k − 2 (the number of Pi critical to B)
= 2 − 2 (the number of Pi critical to B).
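Proposition 3.6 can be spot-checked on the same three-player game used in Example 3.5 (a coalition wins exactly when it contains P1 and P2). The sketch below is an illustration under two assumptions: the coordinate of ȳ_f indexed by B is the sum over C of a_{B,C} h_f(C) with a_{B,C} from (3.2), and a player is critical to a winning B when removing that player makes B lose. The names are hypothetical.

```python
from itertools import combinations

N = 3
PLAYERS = range(1, N + 1)
COALITIONS = [frozenset(c) for r in range(N + 1) for c in combinations(PLAYERS, r)]


def h(c):
    """+1 if C is winning (contains both P1 and P2), -1 otherwise."""
    return 1 if {1, 2} <= c else -1


def a2(b, c):
    """Entry a_{B,C} of basis (3.2), where k = |B|."""
    k = len(b)
    if not c:
        return -1
    if c < b and len(c) == k - 1:
        return 1
    if c == b:
        return -(k - 1)
    return 0


def critical_players(b):
    """Players whose departure turns the winning coalition b into a losing one."""
    return [p for p in b if h(b) == 1 and h(b - {p}) == -1]


results = {}
for b in COALITIONS:
    if len(b) >= 2 and h(b) == 1:
        coordinate = sum(a2(b, c) * h(c) for c in COALITIONS)
        results[tuple(sorted(b))] = (coordinate, 2 - 2 * len(critical_players(b)))

print(results)
```

For both winning coalitions the computed coordinate agrees with 2 − 2 · (number of critical players); in this game P1 and P2 are critical to each of them and P3 is critical to neither.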
Example 3.7. Let n = 3. The rows of the following matrix are the basis
elements aB given by (3.2), where B is the coalition to the left of the row. The
columns of the matrix are the points z̄C , where C is the coalition at the top of the
column.
threshold functions is an old and challenging one. Results are known for n ≤ 9 [10]
and for special classes of threshold functions. A good reference for early work on
the problem is [9]; two examples of more recent work are [4] and [6].
5. Acknowledgments
Thanks to the anonymous reviewer, whose comments significantly improved the
exposition in this paper. Thanks also to Karl-Dieter Crisman, Michael Jones, and
Michael Orrison for organizing this AMS session and giving me the opportunity to
speak.
References
[1] John F. Banzhaf, Weighted voting doesn’t work: A mathematical analysis, Rutgers Law Re-
view 19 (1965), 317–341.
[2] Anders Björner, Michel Las Vergnas, Bernd Sturmfels, Neil White, and Günter M. Ziegler,
Oriented matroids, 2nd ed., Encyclopedia of Mathematics and its Applications, vol. 46, Cam-
bridge University Press, Cambridge, 1999. MR1744046 (2000j:52016)
[3] Dan S. Felsenthal and Moshé Machover, The measurement of voting power: Theory and
practice, problems and paradoxes, Edward Elgar Publishing Limited, Cheltenham, 1998.
MR1761929 (2001h:91032)
[4] Josep Freixas and Sascha Kurz, The golden number and Fibonacci sequences in the de-
sign of voting structures, European J. Oper. Res. 226 (2013), no. 2, 246–257, DOI
10.1016/j.ejor.2012.10.017. MR3009784
[5] Branko Grünbaum, Convex polytopes, 2nd ed., Graduate Texts in Mathematics, vol. 221,
Springer-Verlag, New York, 2003. Prepared and with a preface by Volker Kaibel, Victor Klee
and Günter M. Ziegler. MR1976856 (2004b:52001)
[6] Sascha Kurz and Nikolas Tautenhahn, On Dedekind’s problem for complete simple games,
Internat. J. Game Theory 42 (2013), no. 2, 411–437, DOI 10.1007/s00182-012-0327-9.
MR3053902
[7] Annick Laruelle and Federico Valenciano, Voting and collective decision-making, Cambridge
University Press, Cambridge, 2008.
[8] P. McMullen, On zonotopes, Trans. Amer. Math. Soc. 159 (1971), 91–109. MR0279689
(43 #5410)
[9] Saburo Muroga, Threshold logic and its applications, Wiley-Interscience [John Wiley & Sons],
New York, 1971. MR0439441 (55 #12334)
[10] The Online Encyclopedia of Integer Sequences, published electronically at http://oeis.org,
Sequence A000609.
[11] L. S. Penrose, The elementary statistics of majority voting, Journal of the Royal Statistical
Society 109 (1946), 53–57.
[12] L. S. Shapley and M. Shubik, A method for evaluating the distribution of power in a committee
system, Amer. Political Sci. Rev. 48 (1954), 787–792.
[13] Alan D. Taylor and William S. Zwicker, Simple games: Desirability relations, trading, pseu-
doweightings, Princeton University Press, Princeton, NJ, 1999. MR1714706 (2000k:91028)
[14] John Tolle, Power distribution in four-player weighted voting systems, Math. Mag. 76 (2003),
no. 1, 33–39, DOI 10.2307/3219130. MR2084115
[15] R. O. Winder, Partitions of N -space by hyperplanes, SIAM J. Appl. Math. 14 (1966), 811–
818. MR0208471 (34 #8281)
[16] Thomas Zaslavsky, Facing up to arrangements: face-count formulas for partitions of space
by hyperplanes, Mem. Amer. Math. Soc. 1 (1975), no. issue 1, 154, vii+102. MR0357135
(50 #9603)
[17] Günter M. Ziegler, Lectures on polytopes, Graduate Texts in Mathematics, vol. 152, Springer-
Verlag, New York, 1995. MR1311028 (96a:52011)
Karl-Dieter Crisman
Abstract. When thinking about choice beyond single winners, social pref-
erence functions are natural to study; these are functions where both input
and output are strict rankings of n items (or possibly ties among several such
rankings). Symmetry is one mathematical way to express fairness, so it makes
sense to study the symmetry of these functions carefully.
Such rankings may be viewed as a permutation of the items; since pair-
wise comparison is also important in voting, a natural combinatorial object
for studying such functions is the permutahedron. This paper analyzes a large
class of social preference functions using the representation theory of the sym-
metry group of the permutahedron. The main result identifies the most sym-
metric possible family in this class, which preserves pairwise information fully;
it is the one-parameter family that connects the Borda Count and the Kemeny
Rule.
1. Introduction
1.1. Why Social Preference Functions? Choice questions are typically
about aggregating individual preferences into a ‘societal’ preference. For example,
with n choices, A1 , A2 , . . . , An , any individual voter’s preference is represented as
a strict transitive ranking such as A1 ≻ A3 ≻ · · · ≻ A2 ; some mathematical rule
then yields an aggregate result. Different types of outcomes, whether singletons or
choice functions, yield different categories of functions.
In some natural situations the actual outcome should be a ranking, or some
related structure. For instance, a group could choose officers (Chair, Secretary, and
Treasurer) from three candidates a nominating committee gives them. The offices
might have a priority order (like for succession), but their priority status is not
the only point. Even more interesting, one might have a list of factories to inspect
for an internal audit. Here, the cyclic order of a visiting schedule likely is more
important than which factory actually is the first one visited in the year.
In any similar case, it is reasonable to assume that the output of the function is
one or more strict rankings, just as in a voting function the output is one or more
candidates. We call such a function a social preference function. The most famous
s.p.f. is probably the Kemeny Rule.
A special class of social preference functions has been receiving some attention
in recent work, especially in the two pairs of articles [23,29] and [34,35]. This class,
called simple ranking scoring functions (SRSFs – see section 2.1), fills exactly the
same role in the class of social preference functions as the usual positional scoring
rules do in the class of voting rules¹.
© 2014 American Mathematical Society
The point is that in many situations where choice matters we may wish to
consider not just the relative rank of the candidates, but where they fit into the
overall order. So one goal of this paper is to introduce an interesting generalization
of several well-known procedures of value in contexts beyond single-winner elections.
1.3. Context for this work. Within the last few years, representation the-
ory has become a tool to reframe and powerfully extend previous classifications.
Orrison and his students [7] have done so in voting theory, while work of Hernández-
Lamoneda, Juárez, and Sánchez-Sánchez [12] gives similar results in cooperative
game theory. These techniques are also used in work of Bargagliotti and Orrison
in nonparametric statistics [3].
In these papers, representations of Sn allow generalization with fewer technical
challenges, with more insight into why the results are true. But there is more to
combinatorics than permutations, and more to fairness than the symmetric group.
Pairwise comparisons between candidates have been a cornerstone of voting theory
analysis since Condorcet, and one may note that a ranking of candidates is not
simply a permutation, but an ordering.
1 We note that [15] and some recent preprints by Pivato and Nehring address an even more
of their functions; in [4], a major assumption is that voters’ preferences of subsets of candidates
obey various (anti-)symmetric partial orders.
3 Party primary systems are not neutral; over the whole election cycle, a candidate in an
uncontested primary has (at least in principle) an advantage in winning the whole thing.
THE BORDA COUNT, THE KEMENY RULE, AND THE PERMUTAHEDRON 103
The permutahedron has the right amount of structure for analyzing social pref-
erence functions while keeping pairwise behavior in mind. Its symmetry group sheds
light on the structure of neutral SRSFs. The ‘extra’ symmetry of the permutahe-
dron corresponds precisely to the well-known concept of reversal symmetry (see
section 2.3). So the third goal of this paper is to prove significant results about
neutral SRSFs utilizing the basic representation theory of the symmetry group of
the permutahedron (summarized in the Appendix).
The most important result in this classification is an explicit characterization of
the most symmetric possible rules in this family – a characterization which connects
the two most important members of it.
Main Theorem. If a neutral SRSF is compatible with pairwise information
and fully preserves this information, then it is a rule along the one-parameter family
of procedures connecting the Borda Count and the Kemeny Rule.
‘Pairwise information’ means information about head-to-head comparisons be-
tween alternatives; see [7], Definition 4.6, and Theorem 5.12 for full details. By
adding one final symmetry, one can characterize the Borda Count among SRSFs
in the same way as is usually done among positional scoring rules, or rules relying
only on pairwise information. Conversely, one can start moving beyond the ‘Borda
versus Condorcet’ ways of thinking and start to explore how much of each behavior
one might want in a choice procedure.
The remainder of the paper addresses the goals as follows:
• Review social choice definitions and introduce the permutahedron
• Motivate machinery with explicit statements and examples for n = 3
• Introduce all remaining needed concepts, and prove theorems for all n
• Look forward to questions opened up by this work, including other discrete
structures of interest in social choice
Figure 1
turns a positional scoring rule into an SRSF. Intuitively, the SRSF score for v with
respect to a ranking r is the sum of points each candidate in ranking r gets for vote
v in the scoring rule, weighted by the position of the candidate in the ranking r.
We can make all this quite concrete with three candidates. For a positional
scoring rule, if v = XYZ and r = ABC, then
\[
\begin{aligned}
s(v, r) &= \sum_{i=1}^{3} (3 - i)\, t(XYZ, r(i)) \\
        &= 2 \cdot t(XYZ, A) + 1 \cdot t(XYZ, B) + 0 \cdot t(XYZ, C) \\
        &= 2\, t(XYZ, A) + t(XYZ, B),
\end{aligned}
\]
so if we have a system with w = (u, w, 0), then this yields

v            ABC       ACB   CAB   CBA   BCA   BAC
s(v, ABC)    2u + w    2u    2w    w     u     2w + u
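The row of weights above can be reproduced mechanically. The following Python sketch implements the sum s(v, r) = Σ (n − i) t(v, r(i)) for a positional rule with weights (u, w, 0). The function names `positional_points` and `srsf_weight` are illustrative choices rather than notation from the paper, and the values u = 2, w = 1 are picked only for the demonstration.

```python
ORDER = ["ABC", "ACB", "CAB", "CBA", "BCA", "BAC"]

def positional_points(weights, v, candidate):
    """t(v, X): points candidate X earns from a voter whose ranking is v."""
    return weights[v.index(candidate)]

def srsf_weight(weights, v, r):
    """s(v, r) = sum over i = 1..n of (n - i) * t(v, r(i))."""
    n = len(r)
    return sum((n - i) * positional_points(weights, v, r[i - 1])
               for i in range(1, n + 1))

u, w = 2, 1  # illustrative values only; any u >= w >= 0 could be used
row = [srsf_weight((u, w, 0), v, "ABC") for v in ORDER]
print(row)  # [5, 4, 2, 1, 2, 4], i.e. 2u+w, 2u, 2w, w, u, 2w+u
```

With u = 2 and w = 1 this is the Borda Count written as an SRSF.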
Figure 2 gives a visualization using a Saari-like triangle, where each separating line
defines the border between rankings with X ≻ Y and vice versa. To use it for other
v′ ≠ v, one would permute the whole triangle with the permutation σ such that
σ(v) = v′. Then computing the SRSF for each r = XY Z can be done visually as
well, by taking the dot product of the profile and the weighting triangle with ABC
on the XY Z spot.
Example 2.5. It is instructive to see what plurality looks like as an s.p.f. Since
t(v, r(i)) = 0 unless r(i) = v(1), in which case we get s(v, r) = n − i, the score for
r is Σ_{v∈L(A)} p(v)s(v, r), which is the sum of n − 1 points for each voter who ranks
r(1) first, n − 2 points for each one who ranks r(2) first, and so forth.
For n = 3, with the profile from Example 2.4, we see that ABC is again the
aggregate preference.
r      Σ_{v∈L(A)} p(v)s(v, r)
ACB    4·2 + 3·0 = 8
ABC    4·2 + 3·1 = 11
BAC    4·1 + 3·2 = 10
BCA    4·0 + 3·2 = 6
Example 2.6. On the other hand, the Borda Count gives BAC as the winning
ranking. Putting u = 2 and w = 1 gives the following SRSF-style scores.
r      Σ_{v∈L(A)} p(v)s(v, r)
ACB    4·4 + 3·1 = 19
ABC    4·5 + 3·2 = 26
BAC    4·4 + 3·4 = 28
BCA    4·2 + 3·5 = 23
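The totals in Examples 2.5 and 2.6 can be recomputed with a short Python sketch. The profile of Example 2.4 is assumed here to consist of 4 voters with ranking ABC and 3 voters with ranking BCA, which is the profile implied by the Borda scores above; the helper names are hypothetical.

```python
def srsf_weight(weights, v, r):
    """s(v, r): position i of r (counted from 1) contributes (n - i) * t(v, r(i))."""
    n = len(r)
    return sum((n - 1 - i) * weights[v.index(c)] for i, c in enumerate(r))

def total_scores(weights, profile, rankings):
    """Score of each ranking r: sum over voter rankings v of p(v) * s(v, r)."""
    return {r: sum(count * srsf_weight(weights, v, r)
                   for v, count in profile.items())
            for r in rankings}

# Assumed profile (inferred from the tables, not quoted from Example 2.4).
profile = {"ABC": 4, "BCA": 3}
rankings = ["ABC", "ACB", "CAB", "CBA", "BCA", "BAC"]

plurality = total_scores((1, 0, 0), profile, rankings)
borda = total_scores((2, 1, 0), profile, rankings)

print(max(plurality, key=plurality.get))  # ABC
print(max(borda, key=borda.get))          # BAC
```

Under this assumption the two rules disagree on the winning ranking, as the examples state.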
Just as the analysis of [7, 18, 19] considers the point totals to be vital informa-
tion in understanding voting function symmetry, we consider the point totals for
SRSFs to be vital to unlocking the structure of social preference functions.
2.3. The Permutahedron and Reversal Symmetry. The input profiles
and output scores of SRSFs are both essentially elements of an n!-dimensional
vector space over Q. Since we identify rankings with permutations, we identify
this space with the group ring QSn , which is the set of all formal Q-sums
Σ_{σ∈Sn} qσ σ.
K     (1, 1, 1, 1, 1, 1)        S^{(3)}
BA    (2, 2, 0, −2, −2, 0)      S_1^{(2,1)}
BB    (0, −2, −2, 0, 2, 2)      S_1^{(2,1)}
RA    (1, 1, −2, 1, 1, −2)      S_2^{(2,1)}
RB    (−2, 1, 1, −2, 1, 1)      S_2^{(2,1)}
C     (1, −1, 1, −1, 1, −1)     S^{(1,1,1)}
We use group elements in the order10 e, (2 3), (1 2 3), (1 3), (1 3 2), (1 2). The
representation-theoretic notation for the subspaces is in the right column; this no-
tation makes it clear the Reversal and Borda components, whose basis elements are
not orthogonal to each other, must be considered as inherently two-dimensional.
(The use of BX and C to indicate profiles should not cause ambiguity with generic
candidate names.) Finally, note that the sum of the entries of each vector (except
the first) is zero; such a vector is called sum-zero, and such profiles are called profile
differentials, inasmuch as they do not represent actual voters.
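The sum-zero and orthogonality remarks are easy to confirm numerically. The Python sketch below takes the six basis vectors from the table and reports which are sum-zero and which pairs fail to be orthogonal; the dictionary keys are just labels for this illustration.

```python
from itertools import combinations

# Entries follow the order ABC, ACB, CAB, CBA, BCA, BAC.
BASIS = {
    "K":  (1, 1, 1, 1, 1, 1),
    "BA": (2, 2, 0, -2, -2, 0),
    "BB": (0, -2, -2, 0, 2, 2),
    "RA": (1, 1, -2, 1, 1, -2),
    "RB": (-2, 1, 1, -2, 1, 1),
    "C":  (1, -1, 1, -1, 1, -1),
}

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

sum_zero = [name for name, vec in BASIS.items() if sum(vec) == 0]
non_orthogonal = [(a, b) for a, b in combinations(BASIS, 2)
                  if dot(BASIS[a], BASIS[b]) != 0]

print(sum_zero)        # ['BA', 'BB', 'RA', 'RB', 'C']
print(non_orthogonal)  # [('BA', 'BB'), ('RA', 'RB')]
```

The only non-orthogonal pairs lie inside the Borda component and inside the Reversal component, which is why each must be treated as two-dimensional.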
Let f be a neutral SRSF. Recall (Subsection 2.2) that f is uniquely defined by
all its s(·, r), which we may consider to be a vector of weights s; we will call the
function fs to indicate this fact. Thus the scores for all rankings r are simply the
dot products σ(s) · p from before, so fs gives a linear transformation from QS3 to
itself. This is given by the scores for each ranking as in the previous section11 .
But fs is not just any linear transformation. Since we pointed out earlier that
s(v, r) = s(σ(v), σ(r)) for any permutation σ ∈ Sn , fs must preserve all group
symmetries; so, fs is what is called an S3 -module homomorphism (see Section 4.1
for more detail). This means SRSFs are subject to the following basic fact:
symmetry defines exactly the subgroup C2 which gives a direct product, since left and right
multiplication will commute.
10 Corresponding to the usual voting theory order ABC, ACB, CAB, CBA, BCA, BAC.
11 So that for KR we would have the matrix
\[
\begin{pmatrix}
3 & 2 & 1 & 0 & 1 & 2 \\
2 & 1 & 0 & 1 & 2 & 3 \\
1 & 0 & 1 & 2 & 3 & 2 \\
0 & 1 & 2 & 3 & 2 & 1 \\
1 & 2 & 3 & 2 & 1 & 0 \\
2 & 3 & 2 & 1 & 0 & 1
\end{pmatrix}.
\]
Example 3.1. Figure 6 shows a few examples of weighting vectors. The first
two are the BC (rescaled) and the KR, which have already been met before.
The third appears to be an amusing nonsense procedure where s = (2, 0, 0, 1, 1, 2).
Here, fs (BA ) = −2RA ; that is, a voter profile which overwhelmingly approves of
rankings with A first ahead of other rankings would have a result overwhelmingly
favoring any ranking with A in second place!
On the other hand, the last procedure (with s = (4, 3, 1, 0, 1, 2)) is a ‘reasonable’
variant on the Borda Count which is trying to imitate plurality a little bit by
deemphasizing Y XZ as an outcome by voters who chose XY Z – perhaps with a
view toward making sure ACB is the outcome more often with profiles like the one
from Example 2.4. (It is not a positional scoring rule.)
Nonetheless, this s has a nonzero dot product with the s in the third procedure,
so for some profiles with a large BA component relative to the others, it will exhibit
much the same bizarre behavior and should also be called into question.
Our main interest is in decomposing the images of SRSFs, but the algebra also
identifies building blocks for sensible procedures. The following table gives basis
vectors for s which send each component (subspace) only to a scalar multiple of
itself and kill everything else.
K (1, 1, 1, 1, 1, 1)
C (1, −1, 1, −1, 1, −1)
B (2, 1, −1, −2, −1, 1)
R (2, −1, −1, 2, −1, −1)
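One way to see the table in action is to apply each weighting vector to each component. The Python sketch below assumes that s lists s(v, ABC) in the order ABC, ACB, CAB, CBA, BCA, BAC and that neutrality gives s(v, r) by relabeling candidates so that r reads ABC; this convention is an assumption chosen to agree with the arithmetic of Example 3.7. The names `relabel` and `apply_srsf` are hypothetical, and BA and RA stand in for the B and R components.

```python
ORDER = ["ABC", "ACB", "CAB", "CBA", "BCA", "BAC"]

def relabel(v, r):
    """Rewrite ranking v in the coordinates in which r reads 'ABC'."""
    names = {cand: "ABC"[i] for i, cand in enumerate(r)}
    return "".join(names[c] for c in v)

def apply_srsf(s, p):
    """Score of every ranking r in ORDER: sum over v of p(v) * s(v, r)."""
    return [sum(p[j] * s[ORDER.index(relabel(v, r))]
                for j, v in enumerate(ORDER))
            for r in ORDER]

WEIGHTS = {
    "K": [1, 1, 1, 1, 1, 1],
    "C": [1, -1, 1, -1, 1, -1],
    "B": [2, 1, -1, -2, -1, 1],
    "R": [2, -1, -1, 2, -1, -1],
}
PROFILES = {
    "K": [1, 1, 1, 1, 1, 1],
    "C": [1, -1, 1, -1, 1, -1],
    "B": [2, 2, 0, -2, -2, 0],   # BA from the earlier table
    "R": [1, 1, -2, 1, 1, -2],   # RA from the earlier table
}

for s_name, s in WEIGHTS.items():
    for p_name, p in PROFILES.items():
        print(s_name, p_name, apply_srsf(s, p))
```

Under these assumptions each weighting vector multiplies its own component by 6 and sends the other three to zero.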
12 The X basis vector of these components must go to a linear combination of the X vectors
(or their orthogonal complements within the B and R modules). This is because these vectors
have symmetry under any σ which switches alternatives Y and Z, while the others do not, and
fs is an S3 -module homomorphism.
The other two dimensions’ worth of s are not reasonable – for instance, the
third procedure in Example 3.1 is a basis for any procedure which behaves like it.
Notice that simply rescaling the Borda Count weighting vector so it is sum-
zero gives the prototype for methods which preserve only the Basic component, as
expected from [18]. In fact, by ignoring the part of fs coming from K above in a
systematic way (because it will always add the same amount to each ranking r),
we can reduce our attention to sum-zero s. Likewise, we only care about relative
scores, which leads to the following definition (as in [7], [17], and elsewhere).
Definition 3.2. We consider two procedures fs and fs′ with sum-zero weighting
vectors s, s′ to be equivalent if s = ks′, which we will indicate by s ∼ s′. We
will say that two neutral SRSFs are essentially different if they are not equivalent
in this way.
Proposition 3.3. The space of essentially different neutral SRSFs for n = 3
is four dimensional.
Informal Proof. There are six total dimensions. Taking the quotient by K
to get a sum-zero weighting vector removes one; unscaling removes another.
In this case, plurality has s ∼ (1, 1, −1, −1, 0, 0). We will consider the ‘reversed’
plurality s ∼ (−1, −1, 1, 1, 0, 0) to be equivalent, since it will literally have a re-
versed outcome which can be derived from plurality – even though it looks quite
different to the voter. (We are intentionally ignoring issues such as unanimity to
get the broadest possible result at this point.)
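The reduction to sum-zero vectors and the equivalence of Definition 3.2 can be sketched in a few lines of Python. The raw plurality weights (2, 2, 0, 0, 1, 1) used below are an assumption: they are the n = 3 analogue of the description of plurality as an SRSF, listed in the order ABC, ACB, CAB, CBA, BCA, BAC. The function names are hypothetical, and k is allowed to be negative because the reversed plurality is treated as equivalent.

```python
from fractions import Fraction

def to_sum_zero(s):
    """Subtract the mean so the entries add to zero (the quotient by K)."""
    mean = Fraction(sum(s), len(s))
    return [Fraction(x) - mean for x in s]

def equivalent(s1, s2):
    """True if the sum-zero forms satisfy s1 = k * s2 for some nonzero k."""
    a, b = to_sum_zero(s1), to_sum_zero(s2)
    pivot = next((i for i, y in enumerate(b) if y != 0), None)
    if pivot is None or a[pivot] == 0:
        return False
    k = a[pivot] / b[pivot]
    return all(x == k * y for x, y in zip(a, b))

plurality_raw = [2, 2, 0, 0, 1, 1]        # assumed raw weights s(v, ABC)
plurality = to_sum_zero(plurality_raw)    # equals (1, 1, -1, -1, 0, 0)

print(equivalent(plurality_raw, [-1, -1, 1, 1, 0, 0]))   # True: reversed plurality
print(equivalent(plurality_raw, [2, 1, -1, -2, -1, 1]))  # False: Borda Count
```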
Many theorists argue that the Condorcet (C profile) component should be ig-
nored (or, what is equivalent, considered a complete tie). The idea is that any pro-
file non-orthogonal to this component runs the risk of giving credence to subspaces
where each candidate appears in each position in the ranking an equal number of
times. If we do ignore it, we lose another dimension:
Proposition 3.4. The space of neutral SRSFs for n = 3 which ignore the
Condorcet component is three dimensional.
In the contexts mentioned in the introduction, insisting on this restriction does
not always make sense. The famous Condorcet example of p with p(ABC) =
p(BCA) = p(CAB) = 1, p(XY Z) = 0 otherwise need not be a paradox in the
committee example; there, it is plausible that the voters would prefer to have a
random tiebreaker among just these three rankings (as opposed to all six), guaran-
teeing at least one of their succession preferences.
Naturally, not every application will demand preserving the Condorcet compo-
nent, and we are not arguing that the Condorcet criterion or Condorcet extensions
are always appropriate. Rather, it seems reasonable that in situations where the
overall ranking matters more than the winner, or where there is potential for the
ranking to influence (or even determine) a cycle of events, it is advantageous to
keep this component13 .
One of the ways in which we can ensure that we do not throw away this infor-
mation is by means of the concept of being compatible with pairwise information.
Definition 4.6 gives a full account, but for n = 3 it is sufficient to remark that such
an fs kills everything from R and K; the intuition is that only B and C preserve the
13 We briefly mention cyclic orders in Example 6.1.
information we get from tallying all the head-to-head pairwise matchups between
candidates. By counting dimensions once again one can compute that, modulo
equivalence:
Proposition 3.5. The space of neutral SRSFs for n = 3 which are compatible
with pairwise information is two dimensional.
Unfortunately, the procedure with s = (2, 0, 0, 1, 1, 2) (recall, where fs (BA ) =
−2RA ) is in this space. So this is not a panacea.
Now we bring in the permutahedron, with its reversal symmetry. All of the
components of the decomposition of QS3 have a natural action of ρ as well (so
they are “P3-modules”). It is not hard to see that (BX)^ρ = −BX while (RX)^ρ =
RX, so although B and R are equivalent under S3, they are not equivalent under
reversal. The implication in the social preference function context is that with
reversal symmetry, BX and RX must not go to each other. Thus we have:
Proposition 3.6. The space of essentially different neutral SRSFs for n = 3
which obey reversal symmetry is two dimensional.
Informal Proof. There are six total dimensions, and as usual we eliminate
two by considering sum-zero essentially different procedures. Ordinarily, a basis
element BX of the basic subspace could be sent to some element of the reversal
space, and vice versa, but if not, then quotienting out eliminates two additional
dimensions.
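The behavior of the components under reversal can be checked directly. In the Python sketch below, reversal is taken to act on a profile by moving the weight of each ranking to the ranking read backwards; this reading of the action of ρ, and the name `reverse_profile`, are assumptions made for the illustration.

```python
ORDER = ["ABC", "ACB", "CAB", "CBA", "BCA", "BAC"]

def reverse_profile(p):
    """Profile whose weight on v is the original weight on v read backwards."""
    return [p[ORDER.index(v[::-1])] for v in ORDER]

BA = [2, 2, 0, -2, -2, 0]
RA = [1, 1, -2, 1, 1, -2]
C = [1, -1, 1, -1, 1, -1]

print(reverse_profile(BA) == [-x for x in BA])  # True: Borda changes sign
print(reverse_profile(RA) == RA)                # True: Reversal is fixed
print(reverse_profile(C) == [-x for x in C])    # True: Condorcet changes sign
```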
The bizarre SRSF s = (2, 0, 0, 1, 1, 2) is not allowed, nor is anything which
shares a nontrivial piece of it. However, SRSFs having reversal symmetry lead to
the same problems one gets from positional scoring procedures which do not ignore
the reversal component. Here is a somewhat subtle example – again, this is not a
positional scoring procedure.
Example 3.7. The weighting vector s = (15, 2, 0, 11, 0, 2) puts appropriately
heavy weight on XY Z and some weight on its neighbors. Consider the profile
p = (9, 6, 6, 3, 0, 6) with these weights; ABC (the first ranking in the profile) will
be given 15 · 9 + 2 · 6 + 0 · 6 + 11 · 3 + 0 · 0 + 2 · 6 = 192 points, while BAC (the last
ranking in the profile) receives 15 · 6 + 2 · 0 + 0 · 3 + 11 · 6 + 0 · 6 + 2 · 9 = 174.
The final score vector is (192, 120, 174, 156, 84, 174); note that ACB loses by
a significant margin, even to CBA, despite the pairwise tally showing A the clear
victor and B tied with C.
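The numbers in Example 3.7 can be verified with the following Python sketch. It assumes that s lists s(v, ABC) in the order ABC, ACB, CAB, CBA, BCA, BAC and that s(v, r) for other r is obtained by relabeling candidates so that r reads ABC, which matches the two sums written out above. The helper names are hypothetical.

```python
ORDER = ["ABC", "ACB", "CAB", "CBA", "BCA", "BAC"]

def relabel(v, r):
    """Rewrite ranking v in the coordinates in which r reads 'ABC'."""
    names = {cand: "ABC"[i] for i, cand in enumerate(r)}
    return "".join(names[c] for c in v)

def apply_srsf(s, p):
    """Score of every ranking r in ORDER: sum over v of p(v) * s(v, r)."""
    return [sum(p[j] * s[ORDER.index(relabel(v, r))]
                for j, v in enumerate(ORDER))
            for r in ORDER]

def pairwise(p, x, y):
    """Number of voters in profile p who rank x above y."""
    return sum(count for v, count in zip(ORDER, p) if v.index(x) < v.index(y))

s = [15, 2, 0, 11, 0, 2]
p = [9, 6, 6, 3, 0, 6]
scores = apply_srsf(s, p)

print(scores)                                        # [192, 120, 174, 156, 84, 174]
print(pairwise(p, "A", "B"), pairwise(p, "B", "A"))  # 21 9
print(pairwise(p, "A", "C"), pairwise(p, "C", "A"))  # 21 9
print(pairwise(p, "B", "C"), pairwise(p, "C", "B"))  # 15 15
```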
With counterintuitive results coming no matter what restrictions we place on
the symmetry, what happens if we demand maximum symmetry from a neutral
SRSF?
Proposition 3.8. The space of essentially different neutral SRSFs for n = 3
which obey reversal symmetry and are compatible with pairs is a one-dimensional
family of procedures.
For n = 3, one may think of this as giving the space of fs of the sum-zero
vectors of weights in Figure 7.
Clearly both BC (a = 2) and KR (a = 3) are part of this continuum; since it is
one-dimensional, they define it as well. This is our main result in the case n = 3.
Using a vector of weights from Figure 7 alone can still lead to a nonsense
method; for example, letting a = 0 gives something less than useful. However, it
is very hard to decide what sort of other, non-algebraic, conditions are natural.
One can impose unanimity-style conditions such as a ≥ 1, under which any profile
in which all voters have the same preference yields that preference as a winning
ranking. Nonetheless, even then it is possible for SRSFs in this space to give actual
outcomes (winning preference orders) which are different from both BC and KR.
Example 3.9. The profile p = (1, 2, 5, 0, 0, 0) only has non-zero preference for
half the rankings (ABC, ACB, and CAB). For a given a, this means the scores
for the relevant potential winning rankings will be:
       Computation                 BC (a = 2)   KR (a = 3)   a = 1.5
ABC    +a · 1 + 1 · 2 − 1 · 5      −1           0            −1.5
ACB    +1 · 1 + a · 2 + 1 · 5      10           12           9
CAB    −1 · 1 + 1 · 2 + a · 5      11           16           8.5
CBA    −a · 1 − 1 · 2 + 1 · 5      1            0            1.5
The Kemeny Rule and Borda Count both give CAB as the winning outcome, but
with a = 1.5 the result is ACB.
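The comparison in Example 3.9 can be replayed for any a. The Python sketch below uses the weights (a, 1, −1, −a, −1, 1) read off from the computations in the example, with a = 2 for BC and a = 3 for KR as identified for Figure 7. The relabeling convention and helper names are assumptions made for the illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

ORDER = ["ABC", "ACB", "CAB", "CBA", "BCA", "BAC"]

def relabel(v, r):
    """Rewrite ranking v in the coordinates in which r reads 'ABC'."""
    names = {cand: "ABC"[i] for i, cand in enumerate(r)}
    return "".join(names[c] for c in v)

def apply_srsf(s, p):
    """Score of every ranking r in ORDER: sum over v of p(v) * s(v, r)."""
    return [sum(p[j] * s[ORDER.index(relabel(v, r))]
                for j, v in enumerate(ORDER))
            for r in ORDER]

def family_weights(a):
    """Sum-zero weights s(v, ABC) on the one-parameter family."""
    return [a, 1, -1, -a, -1, 1]

p = [1, 2, 5, 0, 0, 0]
winners = {}
for a in (Fraction(2), Fraction(3), Fraction(3, 2)):
    scores = dict(zip(ORDER, apply_srsf(family_weights(a), p)))
    winners[a] = max(scores, key=scores.get)

print(winners[Fraction(2)], winners[Fraction(3)], winners[Fraction(3, 2)])
# CAB CAB ACB
```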
However, it turns out that if 2 < a < 3, this is not possible – all SRSFs of this
type will have the same outcome as KR or BC (or both). Demonstrating this is a
standard and tedious chase of inequalities to yield contradictions from other cases,
so we omit the proof.
For the reader who enjoys an exercise, here is an unscaled, non-zero-sum (and
hence more intuitive to the layman) example of an SRSF ‘between’ the most familiar
examples.
Example 3.10. The procedure in Figure 8 yields a tie between ABC and ACB
(the BC and KR outcomes, respectively) on the profile p = 6BA + 2BB − 7C −
3RC + 12K.
Hints: which 2 < a < 3 would this correspond to? What contribution will K
make in the final tally? What about C? Can you now reconstruct the relevant
part of the profile and finish the computations as in Example 3.9?
These examples show that there is real depth in the concept of simple ranking
scoring functions. In order to avoid the problem in Example 3.9, we must use
algebra – for instance, we could send the profile differential C to a positive scalar
multiple of itself. To ensure that the outcome is ‘between’ those of the well-known
methods (if this is even a good idea), we must come up with appropriate Pareto-type
conditions.
It is by no means obvious how much weight to give the C component, but once
one bothers with the pairwise information, its effects must be considered. If there is
some dissatisfaction in the community about the Borda Count completely ignoring
this information, there is also dissatisfaction with methods that give it as much
weight as the Kemeny Rule does, frequently exhibiting paradoxes such as the
no-show paradox.
Each potential ‘customer’ of methods on this ‘Borda-Kemeny spectrum’ can decide
for themselves how much or how little to take this information into account; the
algebraic access to these procedures makes this analysis possible.
4. Representations
Let A be a set of n candidates, Sn be the symmetric group on n elements, and
so forth. Our results for n ≥ 3 may be summarized as follows.
• The space of essentially different neutral SRSFs which are compatible with
pairs is ½(n + 1)(n − 2) = ½(n² − n − 2) dimensional (Theorem 5.1).
• If these also have reversal symmetry, we are reduced to ¼(n² − 5) or
¼(n² − 4) dimensions for odd and even n, respectively, which is about half
For example, when n = 4, the Borda component for A has 3, 1, −1, −3 voters for
rankings with A in first through fourth place respectively. That is, BA is a profile
(differential) with 3 voters each for ABCD, ABDC, ACBD, ACDB, ADBC, and
ADCB, but −1 votes each for BCAD, BDAC, CBAD, etc., and so forth. The
profile Alt2,A has −1, 3, −3, 1 for the same places14 ; the analogous symmetric profile
grants −1, 1, 1, −1 votes, respectively, to rankings with A in first through fourth
places.
In all cases, these have the structure that the sum over all candidates of each
of these profiles is zero, so that each component, as a vector space, is (n − 1)-
dimensional. We can start to see what role these play with the following examples.
Example 4.1. Let’s see what happens to the Borda and Sym components
under plurality for n = 4. Recall that if r = XY ZW , plurality is the SRSF with
s = s(·, r) giving 3 weighting points for any ranking with X in first place, 2 for Y
in first, 1 for Z and 0 for W .
What happens to BA under this system? Any ranking r of the form AY ZW will
have 18 total compatible voters in BA (three each of the six possible permutations
with A in first place) giving 3 points each. How many voters will give 2, 1, or 0
points to r? Once we pick Y , Z, or W to be in first place, there are two of each kind
of those voters with A in second, third, and fourth place, respectively, weighted by
14 So −1 votes for ABCD, ABDC, ACBD, ACDB, ADBC, and ADCB, . . .
1, −1, −3 in the profile – giving 2(1 − 1 − 3) ‘total voters’ giving each 2, 1, and 0
points. Subtracting this from 54 yields 36 points for r of the form AY ZW .
In the same manner, any r of the form XAZW has the same 18 voters giving
two points each, and the rest giving 3, 1, and 0 points each, for 12 points per
this type of ranking r; by symmetry, we see that BA will be sent to 12BA by the
plurality function.
Example 4.2. Using the same strategy, the symmetric component SA has −6
(= −1 · 6) voters granting 3 points each to r = AY ZW . Considering again the two
rankings each with Y , Z, or W first and A in the other spots, we have 2(1 + 1 − 1)
voters giving 2, 1, and 0 points, for a total of −12 points. One can compute that
rankings of the form r = XAZW will receive −4 points each, and by symmetry we
see that SA is sent to −4BA by plurality.
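The hand counts in Examples 4.1 and 4.2 can be confirmed by brute force over all 24 rankings. The Python sketch below builds the profiles BA and SA for n = 4 from their per-place values and applies plurality as an SRSF; the function names are hypothetical.

```python
from itertools import permutations

RANKINGS = ["".join(p) for p in permutations("ABCD")]

def place_profile(values, cand="A"):
    """Profile giving values[j] voters to each ranking with cand in place j (0-based)."""
    return {v: values[v.index(cand)] for v in RANKINGS}

def plurality_weight(v, r):
    """s(v, r) for plurality: n - i points when v's top choice is in place i of r."""
    return len(r) - 1 - r.index(v[0])

def apply_plurality(profile):
    """Score of every ranking r under plurality viewed as an SRSF."""
    return {r: sum(profile[v] * plurality_weight(v, r) for v in RANKINGS)
            for r in RANKINGS}

borda_A = place_profile((3, 1, -1, -3))
sym_A = place_profile((-1, 1, 1, -1))
image_of_borda = apply_plurality(borda_A)
image_of_sym = apply_plurality(sym_A)

print(image_of_borda["ABCD"], image_of_borda["BACD"])  # 36 12
print(image_of_sym["ABCD"], image_of_sym["BACD"])      # -12 -4
```

Comparing every entry shows that BA is sent to 12BA and SA to −4BA, in agreement with the two examples.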
But the B, Alt, and Sym components also happen to be irreducible Pn -modules!
It is not hard to see that under reversal symmetry the B and Alt components reverse
sign, while the Sym components are unchanged, so we use the notation S^{(n−1,1),+}
for the isomorphism class of the Sym modules, while the B and Alt components
are called S^{(n−1,1),−}.
What about generalizing the discussion for n = 3 to the components of S^{(n−2,1,1)}?
Most of these vanish under even the weakest symmetry we’ll discuss, so the final
component to describe corresponds to (1, −1, 1, −1, 1, −1) when n = 3.
Definition 4.3. For each pair {X, Y } of candidates, we will define CXY , the
Condorcet component, as follows15 :
Let {XY 1} denote the set of all rankings which begin X ≻ Y, let {XY 2}
denote the set of all rankings which begin with X ≻ ? ≻ Y, and continue up
through {XY (n − 1)}. Then CXY is the profile where, for all cyclic permutations
of the elements in {XY i} (such as ABC, BCA, CAB for three candidates), we
assign n − 2i voters to those rankings.
Notice that {XY i} is simply the reversal of all {XY (n − i)}, so there is redun-
dancy. For n = 3, this does give the usual Condorcet component, while for n = 4
it gives the Condorcet component in the form of the CXY s of Saari, as for i = 2 we
get zero voters. A convenient basis of dimension \binom{n-1}{2} is given by
\[
C_{A_1A_2}, C_{A_1A_3}, \dots, C_{A_1A_{n-1}}, C_{A_2A_3}, \dots, C_{A_2A_{n-1}}, \dots, C_{A_{n-2}A_{n-1}}
\]
where one notes that holding X or Y constant and summing over all candidates in
the other variable gives zero.
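Definition 4.3 can be turned into a small construction. The Python sketch below follows one reading of the definition: for each ranking that starts with X and has Y exactly i places later, every cyclic rotation of that ranking receives n − 2i voters. The function names are hypothetical, and only n = 3 and n = 4 are exercised here.

```python
from itertools import permutations

def rotations(r):
    """All cyclic rotations of the ranking r."""
    return [r[k:] + r[:k] for k in range(len(r))]

def condorcet_component(cands, x, y):
    """C_XY under the reading described above."""
    n = len(cands)
    profile = {"".join(p): 0 for p in permutations(cands)}
    for r in list(profile):
        if r[0] == x:
            i = r.index(y)  # r belongs to {XY i}
            for rot in rotations(r):
                profile[rot] += n - 2 * i
    return profile

ORDER3 = ["ABC", "ACB", "CAB", "CBA", "BCA", "BAC"]
c3 = condorcet_component("ABC", "A", "B")
print([c3[v] for v in ORDER3])  # [1, -1, 1, -1, 1, -1]

c4 = {y: condorcet_component("ABCD", "A", y) for y in "BCD"}
print(all(sum(c4[y][v] for y in "BCD") == 0 for v in c4["B"]))  # True
```

For n = 3 this recovers the usual Condorcet component, and for n = 4 summing over Y with X held fixed gives the zero profile.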
15 This is obviously indebted to Saari’s original Condorcet component and Zwicker’s ‘spin’
component [33] as well as Saari’s CXY in Section 6 of [18]. Indeed, the spaces only differ when
n ≥ 5, which is probably why they are first completely described here. See also Sections 4.4.3 and
4.5 of [20] where they are implicit in a discussion about the ‘old’ Condorcet components.
rotational symmetry; hence, the procedure sends RA = (1, 1, −2, 1, 1, −2) to zero.
Definition 4.8. We say that a neutral SRSF fs fully preserves pairs if, as a
linear transformation, it sends the subspace S^{(n−1,1)} ⊕ S^{(n−2,1,1)} from pairs
compatibility to exactly the same subspace.
The subspace in the definition must be sent to an isomorphic subspace (or zero),
by Schur’s Lemma. However, in general it can be sent to any isomorphic component
– for instance, the basis vector of a Borda component could be sent to some unusual
linear combination of basis vectors of Alternating and Symmetric components. But
such components are themselves full pairwise ties. So if a procedure fully preserves
pairs, then the pairwise information from the Borda component is in some sense
not ‘wasted’ – a key ingredient in the statement of Theorem 5.12.
Example 4.9. There are rules which fully preserve pairs while not in fact
being compatible with pairs, such as any positional scoring rule other than the
Borda Count. Using plurality as in Examples 4.1 and 4.2, the Sym component was not killed;
still, the Borda component is sent to a scalar multiple of itself (and the Condorcet
component to zero).
Example 4.10. Using the same s as in Example 4.7, we can explicitly compute
where B and C are sent. In fact, this SRSF happens to send C to itself, not even to
a scalar multiple; KR doubles the influence of the Condorcet component. Similarly,
the influence of BX is multiplied by seven (by six for BC and by eight for KR).
In general, an s from the general continuum in Figure 7 sends BX to (2a+2)BX
and sends C to (2a − 4)C.
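The two multipliers can be checked for several values of a at once. The Python sketch below assumes the Figure 7 weights have the form (a, 1, −1, −a, −1, 1), as in Example 3.9, and uses the same relabeling convention as the earlier n = 3 computations; the helper names are hypothetical.

```python
from fractions import Fraction

ORDER = ["ABC", "ACB", "CAB", "CBA", "BCA", "BAC"]

def relabel(v, r):
    """Rewrite ranking v in the coordinates in which r reads 'ABC'."""
    names = {cand: "ABC"[i] for i, cand in enumerate(r)}
    return "".join(names[c] for c in v)

def apply_srsf(s, p):
    """Score of every ranking r in ORDER: sum over v of p(v) * s(v, r)."""
    return [sum(p[j] * s[ORDER.index(relabel(v, r))]
                for j, v in enumerate(ORDER))
            for r in ORDER]

def family_weights(a):
    """Assumed form of the sum-zero weights on the one-parameter family."""
    return [a, 1, -1, -a, -1, 1]

BA = [2, 2, 0, -2, -2, 0]
C = [1, -1, 1, -1, 1, -1]

multipliers = {}
for a in (Fraction(2), Fraction(5, 2), Fraction(3)):
    s = family_weights(a)
    # Each component is sent to a scalar multiple of itself.
    assert apply_srsf(s, BA) == [(2 * a + 2) * x for x in BA]
    assert apply_srsf(s, C) == [(2 * a - 4) * x for x in C]
    multipliers[a] = (2 * a + 2, 2 * a - 4)

print(multipliers[Fraction(5, 2)])  # (Fraction(7, 1), Fraction(1, 1))
```

At a = 2.5 the Borda component is multiplied by seven and C is sent to itself; a = 2 and a = 3 give the BC and KR values.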
What about positional scoring rules? We state the SRSF analogue of the
remark before Theorem 4 in [7] (the proof is essentially the same).
Theorem 5.5. The effective space of any SRSF which is a positional scoring
rule (unless its vector of weights is sum-zero) is isomorphic to S^{(n)} ⊕ S^{(n−1,1)}, where
the S^{(n−1,1)} component may be any piece of the whole (n − 1)S^{(n−1,1)} piece of QSn.
In fact, the copy of S^{(n−1,1)} will depend on the structure of s, as pointed out
there. Nonetheless, the SRSF point of view is quite enlightening.
Theorem 5.6. With reversal symmetry obeyed, a positional scoring rule must
have its effective space be S^{(n)} ⊕ S^{(n−1,1),−}.
This certainly makes sense, and this would also directly impact s, as the sums
w(i) + w(n + 1 − i) must be invariant with respect to i in that case, or must equal
zero if S^{(n)} → 0.
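The condition on the positional weights is simple to test. The Python sketch below only checks the stated requirement that w(i) + w(n + 1 − i) does not depend on i; it is not a full test of reversal symmetry for an arbitrary SRSF, and the function names are hypothetical.

```python
def pair_sums(w):
    """w(i) + w(n + 1 - i) for each position i (the list itself is 0-based)."""
    n = len(w)
    return [w[i] + w[n - 1 - i] for i in range(n)]

def pair_sums_constant(w):
    """True if every mirrored pair of weights has the same sum."""
    return len(set(pair_sums(w))) == 1

print(pair_sums_constant([3, 2, 1, 0]))  # True: Borda weights
print(pair_sums_constant([1, 0, 0, 0]))  # False: plurality weights
print(pair_sums([3, 1, -1, -3]))         # [0, 0, 0, 0]: sum-zero Borda weights
```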
In the next few propositions, these techniques reprove that scoring rules
using pairwise information must essentially be the Borda Count – a now-standard
result.
Proposition 5.7. Any profile orthogonal to S^{(n)} ⊕ (n − 1)S^{(n−1,1)} has the
property that it is sum-zero not just as a whole, but also for the subset of rankings
r such that r(j) = X, for any j and any X.
Proof. The structure of the S^{(n−1,1)} components is such that each has a
basis of vectors p such that p(v) is the same for all v with v(j) = X (this follows
immediately from the table in Section 4.2). Any vector which has value 1 for all
v(j) = X and zeros elsewhere is in the subspace given by the sum of the
(one-dimensional) S^{(n)} subspace and the right S^{(n−1,1)} component. A profile orthogonal
to this subspace must necessarily fulfill both requirements of the proposition.
As a result, the score allocated to ranking r from any profile orthogonal to
S^{(n)} ⊕ (n − 1)S^{(n−1,1)} will be
\[
\sum_{k=1}^{n} \left( \sum_{i=1}^{n-1} \left( (n - i) \sum_{v(k) = X} t(v, r(i)) \right) \right)
\]
where the innermost sum is counted with (possibly negative) multiplicity, and hence
must be zero by the proposition. This leads us to the generalization of what we
discovered for plurality with n = 4 in Example 4.1:
Proposition 5.8. The image of any positional scoring rule will be in S^{(n)} ⊕ B.
Just as moving to the algebraic viewpoint gives us cardinal, not just ordinal,
information, this gives us more information than before. Namely, positional scoring
rules are extremely limited in their outcome potential; depending on your point of
view, this might be good or bad. It certainly limits the types of ties one can have,
for instance; it also means a lot of complete pairwise tie information (such as the
Sym components!) is being interpreted dubiously.
Since the intersection of (n − 1)S^{(n−1,1)} and B ⊕ C is B, we now have the
following, which extends Theorem 6 of [7] (itself an extension of various results of
Saari) to the SRSF context.
Corollary 5.9. An SRSF which is both a scoring rule and relies only on
pairwise information has an effective space and image of S^{(n)} ⊕ B. This must be
essentially the same as the Borda Count (or its reversal).
We are now ready to state and prove the main theorem.
5.2. The Borda Count and the Kemeny Rule. Even if a neutral SRSF
has lots of nice properties, there are still weird things that can happen, as the
following two examples demonstrate.
Given the desire to avoid the behavior in the preceding examples, one must
seriously consider the remaining alternatives; this is the essence of the algebraic
point of view of voting theory. Once we have bothered to get a reasonable effective
space of profiles by relying only on pairwise information, we will probably want to
send that effective space to itself. This is the point of combining compatibility with
pairs with the property of fully preserving pairs, and of the main theorem of the
paper.
Theorem 5.12. Suppose a neutral SRSF is compatible with pairs and fully
preserves pairs. Then (up to essential difference) this SRSF is on a one-dimensional
continuum of procedures; this is precisely the continuum of procedures given by the
Borda Count and Kemeny Rule.
If in addition the Condorcet component goes to zero, the rule is the Borda
Count.
For convenience, we take s(r, r) = 1 and s(r, ρ(r)) = −1. For n = 3 the
continuum with parameter t takes the shape¹⁶ (1, t, −t, −1, −t, t), with t = 1/3
being Kemeny and t = 1/2 being Borda.
For n = 4 the continuum is more subtle. Assuming again that s(r, r) = 1 and
s(r, ρ(r)) = −1, for a ranking r which is one neighbor swap away from v (as in the
comment after Definition 2.3), we would have s(v, r) = 2t. For most r at distance
two we would have t points, but s(XYZW, YXWZ) = 4t − 1. We would have
s(XYZW, YWXZ) = s(XYZW, ZXWY) = 0, with the others at distance three
having ±(3t − 1), and those further away having negative points. The Kemeny Rule
is at t = 1/3, the Borda Count at t = 2/5.
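The parameter values quoted above for n = 3 and n = 4 can be checked numerically. The following is a minimal Python sketch, assuming that the Kemeny score counts the pairs on which two rankings agree and that the Borda SRSF score is the sum over positions i of (n − i) t(v, r(i)), with t(v, X) = (n − k)/(n − 1) when X sits in position k of v (the formulas used in the Appendix). The function names and the affine rescaling to s(r, r) = 1 and s(r, ρ(r)) = −1 are illustrative choices, not code from the paper.

```python
from fractions import Fraction
from itertools import combinations, permutations

def kemeny_score(v, r):
    """Number of pairs (a ranked above b in v) that r ranks the same way."""
    return sum(1 for a, b in combinations(v, 2) if r.index(a) < r.index(b))

def borda_score(v, r):
    """Sum over positions i of (n - i) * t(v, r(i)), with Borda points t."""
    n = len(v)
    # t(v, X) = (n - k)/(n - 1) when X is in position k of v (1-indexed).
    t = {x: Fraction(n - k, n - 1) for k, x in enumerate(v, start=1)}
    return sum((n - i) * t[x] for i, x in enumerate(r, start=1))

def normalized(score, v):
    """Center the scores s(v, .) and rescale so that s(v, v) = 1."""
    rankings = list(permutations(v))
    raw = {r: Fraction(score(v, r)) for r in rankings}
    mean = sum(raw.values()) / len(raw)
    top = raw[tuple(v)] - mean
    return {"".join(r): (raw[r] - mean) / top for r in rankings}

# Walk around the hexagon for n = 3, starting at ABC.
hexagon = ["ABC", "ACB", "CAB", "CBA", "BCA", "BAC"]
kr3 = normalized(kemeny_score, "ABC")
bc3 = normalized(borda_score, "ABC")
print("Kemeny:", [str(kr3[r]) for r in hexagon])
print("Borda: ", [str(bc3[r]) for r in hexagon])

# For n = 4, a neighbor swap such as YXZW should score 2t.
kr4 = normalized(kemeny_score, "XYZW")
bc4 = normalized(borda_score, "XYZW")
print("t (Kemeny, n=4):", kr4["YXZW"] / 2, " t (Borda, n=4):", bc4["YXZW"] / 2)
```

Exact rational arithmetic is used so that the comparison with 1/3, 1/2, and 2/5 is not blurred by rounding.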
Notice that this spectrum already has more complexity; for instance, 4t − 1
could be greater than or less than t depending on whether t was greater than
1/3 or not. As a result, it is much more difficult combinatorially to obtain sharp
outcomes; here is a useful one based on a weak Pareto-type condition.
Proposition 5.14. If we assume that the partial order on the permutahedron
is respected by s, then we must have 1/4 ≤ t ≤ 1/2.
Example 5.15. What is particularly interesting about this is that there are
reasonable (from this point of view) methods both ‘between’ KR and BC, but also
on either side of them along the continuum! For instance, if t = 1/4, the partial
order is respected but the CXY Condorcet components are sent to (small scalar
multiples of) −CXY . This is an intriguing possibility if one wanted a procedure
that intentionally controverted the expectations of cyclic profiles slightly.
Those who have studied voting methods from the linear-algebraic or geometric
perspective have usually advocated for real-life use of the Borda Count based on its
intense symmetry – especially since it takes cycles like A ≻ B ≻ C ≻ A and treats
them as complete ties. For instance, [7] was motivated by trying to find analogues
to BC for partial ranking information. For methods intended to provide a winner
or set of winners, this seems reasonable to do, even if it might violate the Condorcet
criterion.
Theorem 5.12 is significant because we now have a broader range of options
for the Condorcet component in symmetric procedures. The Borda Count is the
dividing line between Condorcet components being sent to themselves or their neg-
atives, so one might reject social preference functions ‘beyond’ it (for n = 4, with
t < 2/5) like the one in Example 5.15. But for procedures ‘beyond’ KR, it is not
clear that there is an upper bound on how much influence should be given to the
Condorcet components; one might want to approximate voting on a cyclic order
itself. The end of Section 4 of [35] reports that when n = 8 the KR and BC give
radically different outcomes; so the spectrum gains even more importance.
The spectrum (and these methods) should be useful in considering manipula-
tion. It is ‘classical’ (originally due independently to Gibbard and Satterthwaite;
see [27] for a comprehensive survey) that situations exist in nearly any choice sys-
tem where a voter can cast a vote other than his or her actual preference and
come out with a more preferred result. Geometric-algebraic methods have been of
use in analyzing BC and KR (see for instance [21]); a taste of similar analysis is
demonstrated next. (The proof is exceedingly tedious, but straightforward.)
Let us return to the ideas of Section 1.1. Choice theory is about choice, not
just winners. In a situation of a board of directors, it is entirely reasonable for
voters for ABCD to say that they would prefer BCAD to ADCB as an outcome;
with BCAD at least most of the succession is preserved, whereas with ADCB their
first-choice candidate wins but the rest of it is counter to their preference. We’ve
already mentioned why it might be appropriate to leave a component of a profile
which looks like (1, 0, 1, 0, 1, 0) as a tie between the rankings ABC, BCA, CAB
rather than a tie between candidates A, B, C.
There is one additional possible interpretation of this worth considering. The
permutahedron is an abstract combinatorial object, but it may be embedded
consistently in space in many ways. Zwicker has pointed out (in [35]; see also [24])
that, for both KR and BC, one of the equivalent weighting vectors s comes from
squared distances between its vertices in different embeddings. Might there be a way to
think of some of the other methods along this spectrum as part of the continuum
stretching (for n = 3) the regular hexagon to the permutahedral vertices of the
cube? (And if so, can we find a geometric interpretation of t > 1/2?) Ideally,
this would give a natural connection to the representation theory as well – and the
combinatorial structure has given us the tools.
6. Looking Forward
6.1. Algebra in the Service of Choice. Although it is a highlight, the
Borda-Kemeny Spectrum is only part of the story; the SRSFs discussed are a good
starting point, but not an end in themselves. Where might the types of thinking
in this paper lead us?
Example 6.1. Suppose that the objects of voting really are cyclic orders; that
is, each voter is allowed to select a cycle such as A ≻ B ≻ D ≻ C ≻ A. Natural
contexts for this include:
• Seating preferences around a round table
• Rotating long-term site visit schedules for observation or inspection
• Scheduling space in a 24-hour facility
The combinatorial object which represents these in the same way the
permutahedron represents rankings is called the cyclic-order graph¹⁷. With only three
objects, it is nearly trivial (the only cyclic orders are A ≻ B ≻ C ≻ A and
A ≻ C ≻ B ≻ A, and these are equivalent if we add rotational/reversal symmetry).
However, even with four ‘candidates’, there are six configurations – see Figure 10.
This graph is the skeleton of the octahedron, which has symmetry group S4 ×
C2 (for reasons unrelated to the permutahedron symmetry group). The ‘profile
space’ is only six-dimensional, though, not even close to the size of the order of the
symmetry group. As a result, the irreducible components of the profile space are
quite different.
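For small cases the cyclic-order graph can be enumerated directly. The sketch below is a hedged illustration: it assumes that two cyclic orders are adjacent exactly when one is obtained from the other by swapping two cyclically adjacent entries (our reading of the construction in [28]), and the helper names are hypothetical.

```python
from itertools import permutations

def canonical(cycle):
    """Rotate a cyclic order so that its smallest label comes first."""
    i = cycle.index(min(cycle))
    return tuple(cycle[i:] + cycle[:i])

def cyclic_orders(labels):
    """All cyclic orders on the labels, one representative per rotation class."""
    first, rest = labels[0], labels[1:]
    return sorted({canonical((first,) + p) for p in permutations(rest)})

def reversal(cycle):
    """The same cycle traversed in the opposite direction."""
    return canonical(tuple(reversed(cycle)))

def neighbors(cycle):
    """Cyclic orders reached by swapping two cyclically adjacent entries (assumed adjacency)."""
    n = len(cycle)
    out = set()
    for i in range(n):
        swapped = list(cycle)
        j = (i + 1) % n
        swapped[i], swapped[j] = swapped[j], swapped[i]
        out.add(canonical(tuple(swapped)))
    return out

orders = cyclic_orders("ABCD")
for c in orders:
    print("".join(c), "->", sorted("".join(d) for d in neighbors(c)))
```

Under this adjacency assumption, each of the six cyclic orders on four candidates has four neighbors and is non-adjacent only to its own reversal, which is the vertex pattern of the octahedron.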
17 See [28].
124 KARL-DIETER CRISMAN
One is the trivial component where all orders receive the same number of votes.
There is also a two-dimensional one, roughly equivalent to the Sym component for
SRSFs, which is generated by a profile differential with two votes for ABCDA and
its reversal, and −1 votes for each of the others. The third component is three-
dimensional and is reminiscent of the Borda component – it is generated by profile
differentials with one voter for a cyclic order, and minus one voter for its reversal.
If we focus on keeping only the ‘Borda-like’ component, we discover the space
of possible procedures does not have symmetry around any square (4-cycle) in the
graph. Instead, the weighting vectors look like (a, b, c, −b, −c, −a), where the −a is
at the reversal of the cyclic order for a.
This example exemplifies our point of view in this paper. It is not possible
to justify the more obvious sum-zero vectors (a, 0, 0, 0, 0, −a) without introducing
additional arguments – just like we had to introduce compatibility with pairs to
focus attention on the most interesting procedures. The voting theory informed the
algebra.
At the same time, there is some real voting theory, not just algebra, waiting
to be done! What are natural non-algebraic conditions for ‘nice’ voting systems
on cyclic orders? How do people really choose to sit around a table? And does
person X really care if person Y is at her right or left, as long as they are sitting
close together? (In this context, it does matter – a different graph would ask what
happens if it doesn’t matter.) These are all questions that require input from the
choice community – just as finding new, appropriate questions to ‘ask’ the Borda-
Kemeny spectrum SRSFs will take some time.
6.2. Future Work and Acknowledgments. There are many opportunities
for further work here.
• What are the natural generalizations of Pareto and unanimity in the con-
text of the permutahedron, and what properties would they imply? (This
is not obvious.)
• The continuum of procedures can be different from BC and KR – how
different? To what extent do they share desirable properties – especially
for n ≥ 5 (see [35])?
• What about truncated, tied, or incomplete preferences in this context?
• Cyclic order graphs, let alone representations of their automorphism
groups, have not been studied much beyond [28]; what can we learn about
voting in this context?
• What about voting with respect to the symmetries of some arbitrary graph
on a set of alternatives?
• Can one give an explicit geometric model for the spectrum, as toward the
end of Section 5.2?
6.2.1. Acknowledgments. Before acknowledging humans, I wish to explicitly
point out that mathematical software (I used Sage [26]) was essential to discov-
ering these rather subtle patterns, particularly when it came to more than n = 4
candidates. As Archimedes pointed out¹⁸, it is much easier to prove something
once you know what to prove! Thanks also go to:
• The organizers of a session where a very early version of this work was
presented.
• Bill Zwicker for pointing out the connection to the Kemeny Rule and the
Borda Count, and for many valuable references and discussion.
• Mike Orrison for enthusiastic support and references.
• The Gordon College Faculty Development Committee for the sabbatical
leave during which much of this research took shape.
• Readers of preprints, the referee, and the editor for conscientious com-
ments, structural ideas, and very helpful improvements in exposition.
7. Appendix
7.1. Representations of the Permutahedron. Our main goal for this sec-
tion is stating and proving Theorem 7.1 about precise irreducible decompositions
of QX. We also collate several results about the representation theory of Pn and
Sn , providing proofs where these are not well-known. The book [13] is a canonical
reference, but we echo the notation from the more closely related [8].
It is classical that the irreducible representations of $S_n$ (over a field of
characteristic zero) are classified by partitions $\lambda$ of n, each called $S^\lambda$. For instance, for
n = 4 there are precisely five, labeled $S^{(1,1,1,1)}$, $S^{(2,1,1)}$, $S^{(2,2)}$,
$S^{(3,1)}$, and $S^{(4)}$. The
regular representation $\mathbb{Q}S_n$ decomposes, as an $S_n$-module, as $\bigoplus_\lambda \dim(S^\lambda)\, S^\lambda$.
The regular representation of $S_n$ is given by the action of $S_n$ on the vector
space $\mathbb{Q}X$, where X is the set of permutations of {1, 2, . . . , n}. But considered as
the set of vertices of the permutahedron, there will be a $P_n \cong S_n \times C_2$ action on
X, giving $\mathbb{Q}X$ a $P_n$-module structure as well.
Since $P_n$ has such a nice structure, we know (see e.g. [1], Example 15.2) that
each $S^\lambda$ will be isomorphic (as an $S_n$-module) to two different irreducible
$P_n$-modules, which we will call $S^{\lambda,+}$ and $S^{\lambda,-}$ to indicate how ρ acts on them (namely,
$\rho S^{\lambda,+} = S^{\lambda,+}$ but $\rho S^{\lambda,-} = -S^{\lambda,-}$).
It turns out that most of this decomposition of $\mathbb{Q}X$ lies in the kernel from the
perspective of voting theory. The important pieces are the $S^{(n-1,1)}$ and $S^{(n-2,1,1)}$
components, which are the ones that affect pairwise-respecting procedures and
points-based procedures.
18 With respect to both Democritus and Eudoxus deserving credit for showing that a cone or
pyramid has one-third the volume of the respective cylinder, see e.g. [11] for discussion.
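Irreducible            Number
$S^{(n-1,1),+}$        $\frac{1}{2}\left(n - 1 - \frac{1+(-1)^n}{2}\right)$
$S^{(n-1,1),-}$        $\frac{1}{2}\left(n - 1 + \frac{1+(-1)^n}{2}\right)$
$S^{(n-2,1,1),+}$      $\frac{1}{2}\left(\binom{n-1}{2} - \left\lfloor\frac{n-1}{2}\right\rfloor\right)$
$S^{(n-2,1,1),-}$      $\frac{1}{2}\left(\binom{n-1}{2} + \left\lfloor\frac{n-1}{2}\right\rfloor\right)$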
This result is all the theorems need; these numbers are given for small n in the
following table.
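As a quick illustration, the multiplicities can be evaluated for small n straight from the four formulas. This is a minimal sketch under the assumption that the formulas read as (n − 1 ∓ (1 + (−1)^n)/2)/2 and (C(n − 1, 2) ∓ ⌊(n − 1)/2⌋)/2; the function name and output layout are hypothetical.

```python
from math import comb

def multiplicities(n):
    """Multiplicities of S^(n-1,1),+ , S^(n-1,1),- , S^(n-2,1,1),+ , S^(n-2,1,1),- ."""
    parity = (1 + (-1) ** n) // 2   # 1 if n is even, 0 if n is odd
    half = (n - 1) // 2             # floor((n - 1) / 2)
    pairs = comb(n - 1, 2)          # dimension of S^(n-2,1,1)
    return ((n - 1 - parity) // 2, (n - 1 + parity) // 2,
            (pairs - half) // 2, (pairs + half) // 2)

for n in range(3, 8):
    print(n, multiplicities(n))
```

For each n the two multiplicities of $S^{(n-1,1),\pm}$ sum to n − 1 and those of $S^{(n-2,1,1),\pm}$ sum to $\binom{n-1}{2}$, the dimensions of the corresponding $S_n$-modules, as expected for a decomposition of the regular representation.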
transposes as possible, which is the cycle decomposition type of ρ; since the cycle
decomposition type determines the conjugacy class of a permutation, σ must be in
the conjugacy class of ρ! Otherwise there are no fixed points at all.
To simplify the computation if there are, assume σ = ρ. Then any p which
has $p(i) + p(n + 1 - i) = n + 1$ for all i will work. Once we have chosen p(i) for
$1 \le i \le \lfloor n/2 \rfloor$, that fixes the others. We can choose p(1) to be anything except $\frac{n+1}{2}$
(if that is an integer), which is $2\lfloor n/2 \rfloor$ choices, and which then removes p(n) from
consideration; then p(2) can be any of the remaining $2\lfloor n/2 - 1 \rfloor$ choices, and so
on. Thus the number of fixed points for g = (ρ, ρ) is $2^{\lfloor n/2 \rfloor} \lfloor n/2 \rfloor!$.
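The count of fixed points can be confirmed by brute force for small n. The sketch below simply enumerates all permutations p of {1, . . . , n} and keeps those with p(i) + p(n + 1 − i) = n + 1 for every i; the function name is illustrative.

```python
from itertools import permutations
from math import factorial

def count_fixed(n):
    """Count permutations p of 1..n with p(i) + p(n+1-i) = n+1 for every i."""
    total = 0
    for p in permutations(range(1, n + 1)):
        # p is 0-indexed here, so p[i] pairs with p[n - 1 - i].
        if all(p[i] + p[n - 1 - i] == n + 1 for i in range(n)):
            total += 1
    return total

for n in range(1, 8):
    m = n // 2
    print(n, count_fixed(n), 2 ** m * factorial(m))
```

The two printed columns should agree for each n, matching the closed form $2^{\lfloor n/2 \rfloor} \lfloor n/2 \rfloor!$.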
To summarize, $\pi(e, e) = n!$, $\pi(\rho, \rho) = \lfloor n/2 \rfloor!\, 2^{\lfloor n/2 \rfloor}$, and π(g) = 0 for all other
elements of the group. Let f (g) be the size of the conjugacy class of g. The
conjugacy class of the identity is always just itself, while the conjugacy class of
(ρ, ρ) is the set of all (σ, ρ) where σ has the same cycle type as ρ. The easiest
way to think of such a σ is as a permutation which then has parentheses every two
entries, yielding $\lfloor n/2 \rfloor$ pairs; then we divide by the number of symmetries, of which
there are 2 for each pair, and then divide by the permutations of the pairs.
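Now decomposing the character π with respect to any irreducible character χ
can be done directly:
$$(\pi, \chi) = \frac{1}{2 \cdot n!}\left(\pi(e, e) \cdot f(e, e) \cdot \chi(e, e) + \pi(\rho, \rho) \cdot f(\rho, \rho) \cdot \chi(\rho, \rho) + 0\right)$$
$$= \frac{1}{2 \cdot n!}\left(n! \cdot \chi(e, e) + \lfloor n/2 \rfloor!\, 2^{\lfloor n/2 \rfloor} \cdot \frac{n!}{\lfloor n/2 \rfloor!\, 2^{\lfloor n/2 \rfloor}}\, \chi(\rho, \rho)\right) = \frac{1}{2}\left(\chi(e, e) + \chi(\rho, \rho)\right).$$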
The following two propositions are enough to prove the voting assertions.
Proposition 7.2. If $\chi = \chi_{S^{(n-1,1),-}}$, then $\chi(\rho, \rho) = \frac{1+(-1)^n}{2}$, which is to
say it alternates between 0 and 1 for n odd and even.
Proposition 7.3. If $\chi = \chi_{S^{(n-2,1,1),-}}$, then $\chi(\rho, \rho) = \left\lfloor \frac{n-1}{2} \right\rfloor$, which is to say it
goes through positive integers in order and repeats each value twice, once for n odd
and once for n even.
Before proving these statements, we finish the proof of Theorem 7.1. We
already know that $\chi_{S^{(n-1,1),\pm}}(e, e) = n - 1$ and $\chi_{S^{(n-2,1,1),\pm}}(e, e) = \binom{n-1}{2}$. For the +
components, the theorem is immediate. For the same calculation with the −
components, it suffices to recall that $\chi_{S^{\lambda,-}}(\sigma, \rho) = -\chi_{S^{\lambda,+}}(\sigma, \rho)$.
this with the fact that $-B_{A_n} = \sum_{i \neq n} B_{A_i}$, that means the matrix looks like
$$\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
1 & 0 & 0 & \cdots & 0 & -1 \\
1 & 0 & 0 & \cdots & -1 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
1 & 0 & -1 & \cdots & 0 & 0 \\
1 & -1 & 0 & \cdots & 0 & 0
\end{pmatrix}$$
which will clearly have the correct trace.
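be rewritten as
$$-\sum_{k \neq n+1-j,\, n} C_{A_{n+1-j} A_k} = -\sum_{n+1-j<k<n} C_{A_{n+1-j} A_k} + \sum_{0<k<n+1-j} C_{A_k A_{n+1-j}}.$$
This sum contributes to the trace precisely if there is a $C_{A_1 A_j}$ as one of the terms,
which can only happen if n + 1 − j = 1 and k = j, or if k = 1 and n + 1 − j = j.
The first implies that j = n, which was not one of the original basis elements, but
the second option implies that $j = \frac{n+1}{2}$. So if n is odd we must add one more.
Thus we arrive at a total trace of
$$\mathrm{Tr}(\text{conj. by } \rho) = \begin{cases} \left\lfloor \frac{n}{2} \right\rfloor - 1 = \left\lfloor \frac{n-1}{2} \right\rfloor, & n \text{ even} \\ \left\lfloor \frac{n}{2} \right\rfloor - 1 + 1 = \left\lfloor \frac{n}{2} \right\rfloor = \left\lfloor \frac{n-1}{2} \right\rfloor, & n \text{ odd} \end{cases} = \left\lfloor \frac{n-1}{2} \right\rfloor.$$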
$\{v \mid v(k) = X,\ v(i) = r(\ell),\ i \neq k, n + 1 - k\}$ and
$\{v \mid v(n + 1 - k) = X,\ v(i) = r(\ell),\ i \neq k, n + 1 - k\}$
have the same size. That means that the terms in s(v, r) corresponding to these
sets will cancel, since they correspond to $(n - \ell)\, t(v, r(\ell))$ when $r(\ell) = v(i)$, and
there are equal numbers of these for k and n + 1 − k as long as $i \neq k, n + 1 - k$.
So, for each v such that v(k) = r(j) = X and v(n + 1 − k) = r(i) (where
obviously i ≠ j), the non-canceling part of the term is
(n + 1 − 2k) [(n − j)t(v, r(j)) + (n − i)t(v, r(i))] ,
which is okay since when k = 1 we correctly get 0 as the inside coefficient in s(v, r).
Sum this back up and substitute in the Borda Count values of
$t(v, r(j)) = t(v, X) = \frac{n-k}{n-1}$ and $t(v, r(i)) = \frac{n-(n+1-k)}{n-1} = \frac{k-1}{n-1}$ to get
$$\sum_{k=1}^{n} \left( \sum_{\substack{v(k)=X=r(j), \\ v(n+1-k)=r(i)}} (n + 1 - 2k) \left[ (n - j)\frac{n-k}{n-1} + (n - i)\frac{k-1}{n-1} \right] \right).$$
There are (n − 1)! different v such that v(k) = X, and hence (n − 2)! different
v such that v(k) = X and v(n + 1 − k) = r(i) in the above sum. Then we get
$$\sum_{k=1}^{n} (n + 1 - 2k) \left( (n - 1)!\, (n - j)\frac{n-k}{n-1} + (n - 2)! \sum_{i \neq j} (n - i)\frac{k-1}{n-1} \right).$$
The reader will notice that the sum only depends on j, as one would hope. If
we increase j by one, the difference between two of these scores is
$$(n - 2)! \sum_{k=1}^{n} (n + 1 - 2k) \left( (n - j)(n - k) + \sum_{i \neq j} (n - i)\frac{k-1}{n-1} \right)$$
$$-\ (n - 2)! \sum_{k=1}^{n} (n + 1 - 2k) \left( (n - (j + 1))(n - k) + \sum_{i \neq j+1} (n - i)\frac{k-1}{n-1} \right)$$
simplifies to
$$(n - 2)! \sum_{k=1}^{n} (n + 1 - 2k) \left[ (n - 1)(n - k) + \frac{k-1}{n-1} \left( (n - 1) + (n - n) + 2\sum_{i=2}^{n-1} (n - i) \right) \right]$$
$$= (n - 2)! \sum_{k=1}^{n} (n + 1 - 2k) \left[ (n - 1)(n - k) + \frac{k-1}{n-1} \left( \frac{n(n-1)}{2} + \frac{(n-1)(n-2)}{2} \right) \right]$$
$$= (n - 2)!\,(n - 1)^2 \sum_{k=1}^{n} (n + 1 - 2k) = 0.$$
Proof of Proposition 5.3. For the Kemeny Rule, recall that the SRSF
function is given by $s(v, r) = \sum_{a,b \in A} \delta(v, r, a, b)$, where δ is 1 if a ≻ b by both
rankings v and r, and is 0 otherwise; clearly this relies only on pairwise
information. Let us see where it sends the (new) Condorcet components.
Using the notation above for {XY i} we see that for a given ranking r (with
r ∈ {XY j}) the score for r is
$$\sum_{i=1}^{n-1} (n - 2i) \sum_{v \in \{XY i\}} \sum_{a,b \in A} \delta(v, r, a, b).$$
This depends only on j since the sums are over all v ∈ {XY i}. For each v ∈ {XY i}
such that $a \succ_v b$ and $a \succ_r b$, reversing r will cause δ to go from 1 to 0, but will
send δ from 0 to 1 for the reversal of v; since this reversal is in {XY (n − i)}, the
score for the reversal of r is
$$\sum_{i=1}^{n-1} (n - 2i) \sum_{v \in \{XY (n-i)\}} \sum_{a,b \in A} \delta(v, r, a, b) = \sum_{k=1}^{n-1} (2k - n) \sum_{v \in \{XY k\}} \sum_{a,b \in A} \delta(v, r, a, b)$$
i = 2, X Y . . . and Y . . . X). Hence there are, for {XY i},
$n - 1 - i + \sum_{i=0}^{n-2} i = n - 1 - i + \frac{(n-1)(n-2)}{2}$ possibilities out of n(n − 2) total; the difference is
$$\left( n - 1 - i + \frac{(n-1)(n-2)}{2} \right) - \left( n(n - 2) - \left( n - 1 - i + \frac{(n-1)(n-2)}{2} \right) \right)$$
$$= 2\left( n - 1 - i + \frac{(n-1)(n-2)}{2} \right) - n^2 + 2n = 2n - 2 - 2i + (n - 1)(n - 2) - n^2 + 2n = n - 2i.$$
will be
$$\sum_{i=1}^{n-1} (n - 2i)\left[(n - 2i)(n - 3)!\right] = (n - 3)! \sum_{i=1}^{n-1} (n - 2i)^2 = \frac{n!}{3}.$$
Proof of Proposition 5.13. We use the same decomposition as above for
the Borda Count, over v(k) = X = r(j). As above,
$$\sum_{k=1}^{n} \left( (n + 1 - 2k) \sum_{v(k)=X} \sum_{a,b \in A} \delta(v, r, a, b) \right).$$
This also only depends on j (here, r is a fixed ranking with r(j) = X, but nothing
else known) because of the sum over all v with each v(k) = X. However, it is useful
to focus on a specific r for proving this sends Borda to Borda.
First let us observe what happens to the score when r is reversed. For each v
such that $a \succ_v b$ and $a \succ_r b$, we will get +(n + 1 − 2k), depending on v(k) = X.
But if r is reversed, these go away, and the reversal of v will have δ = 1. This will
exactly give the negative of the original score, because if v(k) = X, then the reversal
v′ has v′(n + 1 − k) = X, which means it will contribute +(n + 1 − 2(n + 1 − k)) =
−n − 1 + 2k = −(n + 1 − 2k) to the score.
We also need to have a fixed change in score when r changes. Let r′ be the
same ranking as r but with r′(j + 1) = X. As with the CXY components, we will
suppose the change is X changing to X. The number of potential places for the
candidate to occur after X in a ranking in the set {v(i) = X} is n − i, and the
number of places before is i − 1, so the difference is n + 1 − 2i. For the remaining
spots there are (n − 2)! possibilities, so there are (n + 1 − 2i)(n − 2)! net rankings
v in {v(i) = X} which will lose δ = 1 (or, if (n + 1 − 2i)(n − 2)! < 0, gain δ = 1).
Thus the difference in the scores given by
$$\sum_{k=1}^{n} \left( (n + 1 - 2k) \sum_{v(k)=X} \sum_{a,b \in A} \delta(v, r, a, b) \right)$$
will be
$$\sum_{k=1}^{n} (n + 1 - 2k)^2 (n - 2)! = (n - 2)! \sum_{k=1}^{n} (n + 1 - 2k)^2 = \frac{(n+1)!}{3}.$$
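The two closed forms used in these proofs, $(n-3)! \sum_{i=1}^{n-1} (n-2i)^2 = n!/3$ and $(n-2)! \sum_{k=1}^{n} (n+1-2k)^2 = (n+1)!/3$, are easy to confirm for small n. A minimal check in Python follows; the function names are illustrative only.

```python
from math import factorial

def pair_sum(n):
    """(n-3)! * sum_{i=1}^{n-1} (n - 2i)^2, from the proof of Proposition 5.3."""
    return factorial(n - 3) * sum((n - 2 * i) ** 2 for i in range(1, n))

def position_sum(n):
    """(n-2)! * sum_{k=1}^{n} (n + 1 - 2k)^2, from the proof of Proposition 5.13."""
    return factorial(n - 2) * sum((n + 1 - 2 * k) ** 2 for k in range(1, n + 1))

for n in range(3, 12):
    # Multiply by 3 to stay in integer arithmetic.
    assert 3 * pair_sum(n) == factorial(n)
    assert 3 * position_sum(n) == factorial(n + 1)
print("both identities hold for n = 3, ..., 11")
```

Both follow from the standard sum-of-squares formulas, since the inner sums equal n(n − 1)(n − 2)/3 and n(n − 1)(n + 1)/3 respectively.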
Proof of Proposition 5.8. Let us see what happens to the profile p which
has value p(v) = 1 for v(j) = X and zeros otherwise. In this case,
$$\sum_{k=1}^{n} \left( \sum_{v(k)=X} \left( \sum_{i=1}^{n-1} (n - i)\, t(v, r(i)) \right) \right) = \sum_{v(j)=X} \left( \sum_{i=1}^{n-1} (n - i)\, t(v, r(i)) \right)$$
Call the weighting vector w. Then note that in the second sum inside the
parentheses, for a given i, v(k) = r(i) the same number of times (to be precise, (n − 2)!
times), other than v(j), of course. So we can rewrite this
$$\left( \sum_{v(j)=X} (n - \ell)\, w(j) + \sum_{v(j)=X} \sum_{i=1,\, i \neq \ell}^{n-1} (n - i)\, t(v, r(i)) \right)$$
$$= (n - 1)!\,(n - \ell)\, w(j) + \sum_{i=1,\, i \neq \ell}^{n-1} (n - i)(n - 2)! \sum_{k=1,\, k \neq j}^{n} w(k)$$
$$= (n - 2)! \left[ (n - 1)(n - \ell)\, w(j) + \sum_{k=1,\, k \neq j}^{n} \sum_{i=1,\, i \neq \ell}^{n-1} w(k)(n - i) \right]$$
$$= (n - 1)!\,(n - 1) \sum_{k=1}^{n} w(k)$$
so that it goes to the Borda component alone if the weighting vector is sum-zero,
otherwise it is ‘shifted’ by a multiple of the sum of the weights.
References
[1] J. L. Alperin and Rowen B. Bell, Groups and representations, Graduate Texts in Mathemat-
ics, vol. 162, Springer-Verlag, New York, 1995. MR1369573 (96m:20001)
[2] Kenneth J. Arrow, A difficulty in the concept of social welfare, The Journal of Political
Economy 58 (1950), no. 4, 328–346.
[3] Anna E. Bargagliotti and Michael E. Orrison, Linear rank tests of uniformity: understanding
inconsistent outcomes and the construction of new tests, J. Nonparametr. Stat. 24 (2012),
no. 2, 481–495, DOI 10.1080/10485252.2011.649282. MR2921148
[4] Steven J. Brams and Peter C. Fishburn, Approval voting, The American Political Science
Review 72 (1978), no. 3, 831–847.
[5] Vincent Conitzer, Matthew Rognlie, and Lirong Xia, Preference functions that score rankings
and maximum likelihood estimation, IJCAI, 2009.
[6] Karl-Dieter Crisman, The symmetry group of the permutahedron, College Math. J. 42 (2011),
no. 2, 135–139, DOI 10.4169/college.math.j.42.2.135. MR2793146
[7] Zajj Daugherty, Alexander K. Eustis, Gregory Minton, and Michael E. Orrison, Voting, the
symmetric group, and representation theory, Amer. Math. Monthly 116 (2009), no. 8, 667–
687, DOI 10.4169/193009709X460796. MR2572103 (2011k:91058)
[8] Persi Diaconis, Group representations in probability and statistics, Institute of Mathemat-
ical Statistics Lecture Notes—Monograph Series, 11, Institute of Mathematical Statistics,
Hayward, CA, 1988. MR964069 (90a:60001)
[9] Yan-Quan Feng, Automorphism groups of Cayley graphs on symmetric groups with gen-
erating transposition sets, J. Combin. Theory Ser. B 96 (2006), no. 1, 67–72, DOI
10.1016/j.jctb.2005.06.010. MR2185979 (2006i:05082)
[10] Allan Gibbard, Manipulation of voting schemes: a general result, Econometrica 41 (1973),
587–601. MR0441407 (55 #14270)
[11] Thomas Heath, A history of Greek mathematics. Vol. I, Dover Publications Inc., New York,
1981. From Thales to Euclid; Corrected reprint of the 1921 original. MR654679 (84h:01002a)
[12] L. Hernández-Lamoneda, R. Juárez, and F. Sánchez-Sánchez, Dissection of solutions in co-
operative game theory using representation techniques, Internat. J. Game Theory 35 (2007),
no. 3, 395–426, DOI 10.1007/s00182-006-0036-3. MR2304546 (2008e:91008)
[13] Gordon James and Adalbert Kerber, The representation theory of the symmetric group,
Encyclopedia of Mathematics and its Applications, vol. 16, Addison-Wesley Publishing Co.,
Reading, Mass., 1981. With a foreword by P. M. Cohn; With an introduction by Gilbert de
B. Robinson. MR644144 (83k:20003)
[14] John G. Kemeny, Mathematics without numbers, Daedalus 88 (1959), 571–591.
[15] Roger B. Myerson, Axiomatic derivation of scoring rules without the ordering assump-
tion, Soc. Choice Welf. 12 (1995), no. 1, 59–74, DOI 10.1007/BF00182193. MR1321126
(96h:90040)
[16] Donald G. Saari, Geometry of voting, Studies in Economic Theory, vol. 3, Springer-Verlag,
Berlin, 1994. MR1297124 (96d:90022)
[17] Donald G. Saari, Basic geometry of voting, Springer-Verlag, Berlin, 1995. MR1410265
(98d:90040)
[18] Donald G. Saari, Mathematical structure of voting paradoxes. I. Pairwise votes, Econom.
Theory 15 (2000), no. 1, 1–53, DOI 10.1007/s001990050001. MR1731508 (2001e:91063)
[19] Donald G. Saari, Mathematical structure of voting paradoxes. II. Positional voting, Econom.
Theory 15 (2000), no. 1, 55–102, DOI 10.1007/s001990050001. MR1731509 (2001e:91064)
[20] Donald G. Saari, Disposing dictators, demystifying voting paradoxes, Cambridge University
Press, Cambridge, 2008. Social choice analysis. MR2449532 (2009h:91001)
[21] Donald G. Saari and Vincent R. Merlin, Changes that cause changes, Soc. Choice Welf. 17
(2000), no. 4, 691–705, DOI 10.1007/s003550000050. MR1778699 (2001g:91047)
[22] Donald G. Saari and Vincent R. Merlin, A geometric examination of Kemeny’s rule,
Soc. Choice Welf. 17 (2000), no. 3, 403–438, DOI 10.1007/s003550050171. MR1762588
(2001g:91046)
[23] Tuomas Sandholm and Vincent Conitzer, Common voting rules as maximum likelihood esti-
mators, UAI, 2005.
[24] Joe Santmyer, For all possible distances look to the permutohedron, Math. Mag. 80 (2007),
no. 2, 120–125. MR2301879
[25] Amartya Sen, The impossibility of a paretian liberal, The Journal of Political Economy 78
(1970), no. 1, 152–157.
[26] W. A. Stein, M. Hansen, et al., Sage Mathematics Software (Version 4.0), The Sage
Development Team, 2009, http://www.sagemath.org.
[27] Alan D. Taylor, Social choice and the mathematics of manipulation, Outlooks, Cambridge
University Press, Cambridge, 2005. MR2157533 (2006f:91004)
[28] D. R. Woodall, Cyclic-order graphs and Zarankiewicz’s crossing-number conjecture, J. Graph
Theory 17 (1993), no. 6, 657–671, DOI 10.1002/jgt.3190170602. MR1244681 (94j:05050)
[29] Lirong Xia and Vincent Conitzer, Finite local consistency characterizes generalized scoring
rules, IJCAI, 2009.
[30] H. P. Young and A. Levenglick, A consistent extension of Condorcet’s election principle,
SIAM J. Appl. Math. 35 (1978), no. 2, 285–300. MR0504073 (58 #20635)
[31] Zhao Zhang and Qiong-xiang Huang, Automorphism groups of bubble-sort graphs and modi-
fied bubble-sort graphs (English, with English and Chinese summaries), Adv. Math. (China)
34 (2005), no. 4, 441–447. MR2182393 (2006j:05094)
[32] Günter M. Ziegler, Lectures on polytopes, Graduate Texts in Mathematics, vol. 152, Springer-
Verlag, New York, 1995. MR1311028 (96a:52011)
[33] William S. Zwicker, The voters’ paradox, spin, and the Borda count, Math. Social Sci. 22
(1991), no. 3, 187–227, DOI 10.1016/0165-4896(91)90023-K. MR1149504 (93f:90017)
[34] William S. Zwicker, A characterization of the rational mean neat voting rules, Math. Comput.
Modelling 48 (2008), no. 9-10, 1374–1384, DOI 10.1016/j.mcm.2008.05.029. MR2459707
(2009m:91050)
[35] William S. Zwicker, Consistency without neutrality in voting rules: When is a
vote an average?, Math. Comput. Modelling 48 (2008), no. 9-10, 1357–1373, DOI
10.1016/j.mcm.2008.05.031. MR2459706 (2010d:91047)
Double-interval societies
1. Introduction
Consider the voting model of Berg et al. [1] in which a political spectrum X is
viewed as a continuum, with liberal positions on the left and conservative positions
on the right, and in which each voter v “approves” an interval of positions along
this line. For example, a tolerant moderate might approve a wide interval near the
middle of the line, while an intolerant partisan may approve a narrower interval
near one of the ends.
More formally, a society is a spectrum X together with a set of voters V and
a collection of approval sets {Av }, one for each voter. A point on the spectrum X
is called a platform. In our situation, we imagine X to be R, and each approval set
Av is a closed interval that represents the set of all platforms that v approves.
Now suppose that every pair of people can agree on some platform; that is, their
intervals overlap. In this situation, Helly’s Theorem [3] implies that there exists a
point on the line that lies in everyone’s approval set, i.e., there is a platform that
everyone approves. Thus a strong hypothesis (pairwise intersecting sets) produces
a strong conclusion (a point in all the sets). However, in voting theory, we are
usually not looking for unanimity, but may be satisfied with a platform that has
high approval ratio: the fraction of voters that approve this platform.
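In one dimension the conclusion of Helly's Theorem is easy to see computationally: if closed intervals intersect pairwise, the largest left endpoint is at most the smallest right endpoint, so it lies in every interval. The sketch below illustrates this; the function names and the sample society are hypothetical.

```python
def pairwise_intersecting(intervals):
    """True if every two closed intervals (a, b) share at least one point."""
    return all(max(a, c) <= min(b, d)
               for i, (a, b) in enumerate(intervals)
               for (c, d) in intervals[i + 1:])

def common_platform(intervals):
    """Return a point lying in every closed interval, or None if there is none."""
    left = max(a for a, b in intervals)
    right = min(b for a, b in intervals)
    return left if left <= right else None

# Four voters, each approving one closed interval of platforms.
society = [(0, 4), (2, 6), (3, 5), (1, 3.5)]
print(pairwise_intersecting(society), common_platform(society))
```

The same test fails once voters approve unions of two intervals, which is why the double-interval setting needs a different argument.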
© 2014 American Mathematical Society
136 M. M. KLAWE, K. L. NYMAN, J. N. SCOTT, AND F. E. SU
Various authors have relaxed the hypotheses. Berg et al. [1] define a linear
(k, m)-agreeable society in which voter preferences again are modeled by closed
intervals in $\mathbb{R}^1$. In this society, given any set of m voters, there exists a subset of
k voters whose approval intervals mutually intersect. They prove that there must
exist some platform with approval ratio $\frac{k-1}{m-1}$. Another generalization by Hardin
[2] looks at approval intervals on a circle rather than a line, and finds that with
(k, m)-agreeability, the approval ratio of the society is at least $\frac{k-1}{m}$.
Finally, define the approval ratio of a society to be the approval number of S divided
by the number of voters in S.
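To make the definition concrete, here is a small Python sketch that computes the approval ratio of a double-interval society. It assumes that the approval number is the largest number of voters approving a single platform, and it uses the fact that for closed intervals this maximum is attained at some left endpoint. The society shown and all function names are hypothetical.

```python
from fractions import Fraction

def approves(voter, x):
    """A voter is a list of closed intervals; x is approved if it lies in one of them."""
    return any(a <= x <= b for a, b in voter)

def approval_number(society):
    """Largest number of voters approving one platform (checked at left endpoints)."""
    candidates = [a for voter in society for a, _ in voter]
    return max(sum(approves(v, x) for v in society) for x in candidates)

def approval_ratio(society):
    return Fraction(approval_number(society), len(society))

# Three voters, each approving two disjoint closed intervals.
society = [
    [(0, 1), (5, 6)],        # voter A
    [(0.5, 2), (7, 8)],      # voter B
    [(1.5, 3), (5.5, 7.5)],  # voter C
]
print(approval_number(society), approval_ratio(society))
```

In this toy society every pair of voters shares a platform, yet no platform is approved by all three, so the pairwise-intersecting hypothesis no longer forces unanimity.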
The main question we address in this paper is: what is the minimal approval
ratio of a pairwise-intersecting, double-interval society with n voters?
Examples suggest that the minimal approval ratio of such societies is 1/3; that
is, there is always a platform that will get at least a third of the votes. Our results
in this paper attempt to clarify this intuition.
We will first examine a family of double-interval societies with low approval
ratios that have regular patterns of interval overlap. These arise from the construction
of what we call double-n strings, defined in the next section. The combinatorics
of such strings are quite nifty and provide both a lower bound (Theorem 3.6) and
an upper bound (Theorem 3.7) for the approval ratio of societies in this family.
Roughly speaking, the double-n strings produce societies
with asymptotic approval ratios between 0.348 and 0.385.
DOUBLE-INTERVAL SOCIETIES 137
We will also prove a general lower bound for the approval ratio of any pairwise-
intersecting, double-interval society in Theorem 4.1, which shows the approval ratio
is always greater than 0.268.
Then we ask if we can find specific societies with lower approval ratios than
the ones arising from double-n strings, and discover that there are such examples.
We find them by modifying the construction that comes from double-n strings. See
Table 1. However, all of these examples have approval ratio equal to 1/3.
[Figure: the symbols A, B, C, D, E, B, E, C, A, D in sequence.]
For simplicity, without loss of generality assume that if the first occurrence of
symbol m occurs at position i in a double-n string, then all symbols at positions
1 ≤ j < i are less than m (otherwise this condition can be satisfied by a permutation
of the symbols in the double-n string). From Lemma 3.1, it is sufficient to consider
double-n strings that have diameter less than n/2. It is also easy to obtain the
lower bound Δ ≥ 1/3 as shown in the following lemma.
Lemma 3.2. Let r be a positive integer. We have δ(3r + 1) ≥ r.
[Figure: three blocks B1, B2, B3, each of length d; axis labels 0, 15r+1, 15r+1.]
Let x be the minimal number such that the second occurrence of x is not in
B1 . Then x ≤ r since k1 ≤ r − 1. Without loss of generality suppose the second
occurrence of x is in B2 . Again, by Corollary 3.5 there are at most
3d + r − n ≤ 3(8r − 1) + r − 23r = 2r − 3
symbols i with 1 ≤ i ≤ 7r + 1 other than x in B2 , so k2 ≤ 2r − 2.
Similarly, let y be the smallest symbol (in value) whose second occurrence is
in B3 (i.e., is not in B1 or B2 ). There are at most k1 + k2 symbols in B1 ∪ B2 ,
so y ≤ 3r − 2. Using Corollary 3.5 one last time, we see that there are at most
3d + (3r − 2) − n ≤ 3(8r − 1) + (3r − 2) − 23r = 4r − 5 symbols i ≠ y with
1 ≤ i ≤ 7r + 1 in B3 , so k3 ≤ 4r − 4. However, this is a contradiction: we needed
k1 + k2 + k3 ≥ 7r + 1, but
k1 + k2 + k3 ≤ (r − 1) + (2r − 2) + (4r − 4) = 7r − 7.
Therefore we could not have d < 8r, proving the theorem.
A general argument showing δ(br) ≥ ar, for large r, leads to the inequalities
b < 3a and 23a ≤ 8b. Thus the lower bound of Theorem 3.6 is the best possible
asymptotic bound using this argument. Now we turn to the upper bound.
Theorem 3.7. For any n > 0, there exists a double-n string with diameter
$d \le 5\lceil n/13 \rceil - 1$. Hence the asymptotic approval ratio for double-n strings is bounded
above by 5/13.
Proof. Note that the double-13 string
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 11, 6, 12, 13, 5, 4, 7, 11, 10, 9, 2, 3, 13, 12, 8
has diameter 4, meaning that any two symbols in it appear somewhere in the
double-13 string separated by no more than three other elements. This yields a general
construction for double-n strings of any length. Let $k = \lceil n/13 \rceil$. Then replacing each
symbol i in the above string with the substring
k(i − 1) + 1, k(i − 1) + 2, . . . , ki,
and removing any symbols in the resulting string that are greater than n, yields
a double-n string. An example of this string for n = 34 (k = 3) is shown in Fig-
ure 4. Because the diameter of the above double-13 string is 4, any two symbols
1 ≤ i < j ≤ n are within substrings that are separated by at most three substrings
of length k. Also, i and j are at worst on the far ends of their substrings, giving a
maximum total distance between i and j in the new string of
3⌈n/13⌉ + 2⌈n/13⌉ − 1 = 5⌈n/13⌉ − 1.
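The construction in this proof can be checked mechanically. The following Python sketch is an illustration only, not part of the original argument. It assumes that the diameter of a string is the largest, over all pairs of distinct symbols, of the smallest positional gap between an occurrence of one and an occurrence of the other, which is how the proof describes it. The names BASE, diameter, and expand are hypothetical helpers introduced here.

```python
from itertools import combinations

# The double-13 string quoted in the proof of Theorem 3.7.
BASE = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 11, 6, 12, 13,
        5, 4, 7, 11, 10, 9, 2, 3, 13, 12, 8]

def diameter(string):
    """Largest, over pairs of symbols, of the smallest gap between occurrences.

    Assumption: this is the notion of diameter used in the text.
    """
    positions = {}
    for index, symbol in enumerate(string):
        positions.setdefault(symbol, []).append(index)
    return max(
        min(abs(p - q) for p in positions[a] for q in positions[b])
        for a, b in combinations(positions, 2)
    )

def expand(n):
    """Replace each symbol i of BASE by k(i-1)+1, ..., ki and drop symbols > n."""
    k = -(-n // 13)  # integer ceiling of n/13
    return [s for i in BASE
            for s in range(k * (i - 1) + 1, k * i + 1) if s <= n]

print(diameter(BASE))            # prints 4
string_34 = expand(34)           # the string shown for n = 34 (k = 3)
print(diameter(string_34), "<=", 5 * 3 - 1)
```

For n = 34 the list returned by expand matches the bracketed string displayed for that case, and its computed diameter can be compared directly with the bound 5⌈n/13⌉ − 1 = 14.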
DOUBLE-INTERVAL SOCIETIES 141
(1, 2, 3)(4, 5, 6)(7, 8, 9)(10, 11, 12)(13, 14, 15)(16, 17, 18)(19, 20, 21)(22, 23, 24)
(25, 26, 27)(28, 29, 30)(1, 2, 3)(31, 32, 33)(16, 17, 18)(34)(13, 14, 15)(10, 11, 12)
(19, 20, 21)(31, 32, 33)(28, 29, 30)(25, 26, 27)(4, 5, 6)(7, 8, 9)(34)(22, 23, 24)
Proof. Let $A_i$ and $A_i'$ represent the left and right intervals, respectively, of
voter i’s approval set in the n-voter society S. Without loss of generality we may
assume no two interval endpoints coincide. For any interval I, define numbers L(I),
R(I), B(I), and C(I) to keep track of the number of other intervals that intersect I
in various ways. Let L(I) count the number of other intervals that, of the two endpoints
of I, contain only the left endpoint. Let R(I) count the number of other intervals
that, of the two endpoints of I, contain only the right endpoint. Let B(I) count the
number of other intervals that contain both endpoints of I. Let C(I) count the
number of other intervals that intersect I but contain neither endpoint of I, and
are hence in the “center” of I.
For example, in Figure 1, we see that L(A ) = 2, R(A ) = 0, C(A ) = 3, and
B(A ) = 0. Also L(C ) = 0, R(C ) = 1, C(C ) = 0, and B(C ) = 1. Since each set
must intersect all n − 1 other sets,
\[ L(A_i) + L(A_i') + R(A_i) + R(A_i') + C(A_i) + C(A_i') + B(A_i) + B(A_i') \ge n - 1. \]
142 M. M. KLAWE, K. L. NYMAN, J. N. SCOTT, AND F. E. SU
Then clearly
\[ \sum_{i=1}^{n} \bigl[ L(A_i) + L(A_i') + R(A_i) + R(A_i') + C(A_i) + C(A_i') + B(A_i) + B(A_i') \bigr] \ge n(n-1). \tag{4.4} \]
Note that an interval J covers both endpoints of another interval I and
contributes 1 to the count B(I) exactly when I is in the center of J and contributes
1 to the count C(J). This implies:
\[ \sum_{i=1}^{n} \bigl[ B(A_i) + B(A_i') \bigr] = \sum_{i=1}^{n} \bigl[ C(A_i) + C(A_i') \bigr]. \tag{4.5} \]
Notice that given an approval number a(S), each interval may have at most
a(S) − 1 other sets intersecting its left endpoint. This gives an initial bound
\[ \sum_{i=1}^{n} \bigl[ L(A_i) + L(A_i') + B(A_i) + B(A_i') \bigr] \le 2n(a(S) - 1). \]
However, if the 2n intervals are ordered by the left endpoint, then the kth interval
under this ordering from left to right can have at most k − 1 intervals intersecting
its left endpoint, not a(S) − 1. Thus we need to adjust the formulas above, to
obtain:
\[ \sum_{i=1}^{n} \bigl[ L(A_i) + L(A_i') + B(A_i) + B(A_i') \bigr] \le 2n(a(S) - 1) - \frac{a(S)(a(S) - 1)}{2}, \]
\[ \sum_{i=1}^{n} \bigl[ R(A_i) + R(A_i') + B(A_i) + B(A_i') \bigr] \le 2n(a(S) - 1) - \frac{a(S)(a(S) - 1)}{2}. \]
Adding these inequalities and applying equation (4.5) yields
\[ \sum_{i=1}^{n} \bigl[ L(A_i) + L(A_i') + R(A_i) + R(A_i') + C(A_i) + C(A_i') + B(A_i) + B(A_i') \bigr] \le 4n(a(S) - 1) - a(S)(a(S) - 1). \]
So by equation (4.4), we see
Values of a(S) and the corresponding bounds on n and the approval ratio
derived from equation (4.3) are given in Table 1.
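As a rough numerical illustration, the lower bound (4.4) and the upper bound 4n(a(S) − 1) − a(S)(a(S) − 1) obtained just before it can be combined into the single condition n(n − 1) ≤ 4n(a − 1) − a(a − 1), where a stands for a(S). The Python sketch below searches for the largest n compatible with that condition. It is an assumption that this combined condition is what underlies the tabulated values; the function name largest_n is hypothetical, and the output is not claimed to reproduce Table 1.

```python
def largest_n(a):
    """Largest n with n(n-1) <= 4n(a-1) - a(a-1), found by direct search.

    The larger root of the corresponding quadratic is below 4a, so
    searching n = 1, ..., 4a is enough.
    """
    best = None
    for n in range(1, 4 * a + 1):
        if n * (n - 1) <= 4 * n * (a - 1) - a * (a - 1):
            best = n
    return best

for a in range(2, 7):
    n = largest_n(a)
    print(a, n, a / n)  # approval number, largest compatible n, ratio a/n
```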
Table 1. On the left, this table shows for a given approval number
the largest n that is given by inequality (4.3) as well as the resulting
bound on the approval ratio derived from inequality (4.1). On
the right, this table shows, for a given approval number, known
examples of the largest n that has this approval number and the
observed approval ratio in that case, obtained by a modification of
a double-n string construction.
+A + C + B − A + D − C + A − B − D + C + B − C + D − B − D − A.
The society shown in Figure 5 provides such a society. This example was
derived from the double-8 string
If each interval in the string overlaps two intervals on each side, this arrangement
is missing the adjacencies AG, BE, BF , CA, DG, DG and CH and has duplicate
adjacencies BC, CD, DC, DE, DF , EG, F G, and GH. By doing a series of moves
that interchanges endpoints in such a way as to introduce missing adjacencies (at
the expense of duplicate adjacencies) without increasing the approval number, we
[Figure: intervals labeled A through H; graphic not recoverable from the source.]
[4] J. N. Scott, Approval ratios of two-intersecting double-interval societies. Harvey Mudd Senior
Thesis. (2011).
1. Introduction
The goal of this article is to examine the following voting situation: from a pool
of n candidates, a committee of size k is to be formed. Every voter will submit an
unordered list of j candidates that they would prefer be on the committee. We will
assume throughout that j ≤ k < n.
Once this voter data is collected, a committee will be chosen via some proce-
dure. One way to select a winner of the election is to look for a most “popular”
committee: a committee with the highest proportion of voter “approval”. (In case
of ties, there may be more than one most popular committee.) We must make this
notion precise.
Let us say that a voter approves a committee if it contains the list of j candi-
dates that the voter submitted. Note that a voter may approve several committees,
since several different k-element subsets will contain a given j-element subset if
j < k.
We wish to answer questions regarding whether it is always possible, assuming
certain conditions on the voters’ preferences, to find a committee approved by a
certain proportion of voters. Phrased another way, we can ask how popular is a
most popular (most-approved) committee? Can we make any minimal guarantees
for the approval proportion of a most popular committee, given some geometric
condition on the voter preferences?
Such questions have been addressed in voting contexts where each person is
choosing just one candidate. Berg et al. [1] initiated the study of agreeability con-
ditions for voting preferences over a linear political spectrum. In this situation,
candidates are chosen from a line of possibilities, and each person is allowed to
148 MATT DAVIS, MICHAEL E. ORRISON, AND FRANCIS EDWARD SU
specify an interval of candidates that they find acceptable. In other words, their
approval set is an interval in the real line. They call a society super-agreeable if
every pair of voters has a mutually agreed candidate (i.e., their approval sets over-
lap) and they also consider other agreeability conditions that specify ways in which
voter preferences, when taken in small subsets, are locally similar. They show
how these produce global conclusions about how popular a most popular candidate
must be, and give minimal guarantees for how many approval sets a most-approved
candidate must lie in.
These ideas have been generalized to other kinds of geometric spaces that can be
viewed as political spectra—circles [10], trees [8], multi-dimensional spaces [5]—as
well as other kinds of approval sets [12] and agreeability conditions [4].
This paper considers similar questions in a new context: voting for committees.
(See [2, 6, 7, 9, 11, 13, 14] for several different approaches to the study of voting for
committees.) The set of n candidates is finite, and not viewed as a geometric space.
However, voters are now allowed to specify a list of size j, and there is a geometric
notion of how close two lists may be. The resulting geometric space of lists is called
a Johnson graph J(n, j) where each point is a j-element subset of an n-element set
(see [3] for more on Johnson graphs).
Unlike the context of Berg et al. [1], here the space in which preferences are
expressed is different from the space of potential outcomes of an election, which
are the k-element subsets of the candidates, and which can be thought of in terms
of the graph J(n, k). In particular, each voter’s list in J(n, j) produces an approval
set in J(n, k) consisting of all committees that contain the list. In this paper, we
ask: what is the minimal guarantee for the popularity of a most popular committee
in J(n, k), and if we place conditions on how similar, or “agreeable”, the submitted
lists are in J(n, j), how does that change the guarantee?
We first demonstrate, in Theorem 2.2, a sharp lower bound for any voter dis-
tribution. We then show, in Theorem 5.1, that if all the votes lie within a “ball”
in the space of lists, then the bound we can guarantee improves. We conclude with
some extensions and open questions.
2. Definitions
Let us identify the candidates with the elements of the set [n] = {1, 2, . . . , n}.
Any j-element subset of [n] will be called a list. (Note that a list here is just a set,
and is unordered.) Any k-element subset of [n] will be called a committee.
We choose to think of given voter data as a probability distribution P on the
set of lists, which we call the voter distribution. In particular, this distribution will
specify for each list ℓ the voting proportion P(ℓ), which is the fraction of voters who
submitted the list ℓ. We shall use the same notation to describe the probability of
a collection of lists, so that P(A) describes the likelihood that a voter chose one of
the lists in the collection A.
If C is a committee, let
\[ \pi_P(C) = \sum_{\ell \subseteq C,\ |\ell| = j} P(\ell) \]
denote the approval proportion of committee C with respect to voter data P : this
is the fraction of voters that approve a given committee if the voter distribution on
the set of all lists is P . We will write simply π(C) if P is understood.
VOTING FOR COMMITTEES IN AGREEABLE SOCIETIES 149
Example 2.1. Assume that candidates 1 through 7 are being considered for
a 4-person committee, and voters are asked to submit lists of three candidates.
Define P so that P({1, 2, 3}) = 7/15, while P(ℓ) = 2/15 for the lists ℓ = {4, 5, 6},
{4, 5, 7}, {4, 6, 7}, and {5, 6, 7}. (Then P(ℓ) = 0 for all other lists.) For this choice
of P , the list {1, 2, 3} has the highest voting proportion, but by inspection, we see
that πP ({4, 5, 6, 7}) = 8/15, which is a higher approval proportion than any other
committee.
This example shows that a most popular committee is not necessarily populated
by candidates who are most popular. Even though each of candidates 1 through 3
appears in 7/15 of the submitted lists, and candidates 4 through 7 each appear in
only 6/15 of the submitted lists, the committee {4, 5, 6, 7} is still the most popular.
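Example 2.1 is small enough to check by enumeration. The following Python sketch is offered only as an illustration: it encodes the voter distribution P of the example as a dictionary from lists to voting proportions and computes the approval proportion of every 4-person committee. The names P, approval, and committees are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def approval(committee, P):
    """Approval proportion: total weight of the lists contained in the committee."""
    return sum(weight for lst, weight in P.items() if lst <= committee)

# Voter distribution of Example 2.1 (n = 7 candidates, lists of size j = 3).
P = {frozenset({1, 2, 3}): Fraction(7, 15)}
for lst in combinations([4, 5, 6, 7], 3):
    P[frozenset(lst)] = Fraction(2, 15)

# All committees of size k = 4.
committees = [frozenset(c) for c in combinations(range(1, 8), 4)]
best = max(committees, key=lambda c: approval(c, P))
print(sorted(best), approval(best, P))   # prints [4, 5, 6, 7] 8/15
```

Any committee containing {1, 2, 3} has approval proportion 7/15 under this P, so the enumeration agrees with the inspection argument in the example.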
\[ \pi(\hat{C}) \ge \frac{\binom{n-j}{k-j}}{\binom{n}{k}} = \frac{(n-j)!\,k!}{(k-j)!\,n!} = \frac{\binom{k}{j}}{\binom{n}{j}}. \]
Notice that this bound is the best possible, since it is achieved by, for example,
the uniform distribution on the set of all lists.
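The two binomial forms of this bound, and the remark about the uniform distribution, can be confirmed numerically. The sketch below is illustrative only; the function names universal_bound and uniform_approval are hypothetical. Under the uniform distribution each of the C(n, j) lists has weight 1/C(n, j), and a committee of size k contains C(k, j) of them.

```python
from fractions import Fraction
from math import comb

def universal_bound(n, k, j):
    """The bound written as C(n-j, k-j) / C(n, k)."""
    return Fraction(comb(n - j, k - j), comb(n, k))

def uniform_approval(n, k, j):
    """Approval proportion of any committee under the uniform distribution on lists."""
    return Fraction(comb(k, j), comb(n, j))

# Parameters of Example 2.1: n = 7, k = 4, j = 3.
print(universal_bound(7, 4, 3), uniform_approval(7, 4, 3))   # prints 4/35 4/35
```

In Example 2.1 the most popular committee has approval proportion 8/15, comfortably above this guaranteed minimum of 4/35.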
[Figure 1. A small example of a Johnson graph; graphic not recoverable from the source.]
Then Br (v) is the entire Johnson graph if and only if r ≥ D. In what follows,
we assume that j > 1 so that the Johnson graph has diameter at least 2, and
neighborhoods of vertices can be proper subsets of the set of vertices.
Figure 1, though a small example, gives some helpful intuition concerning the
structure of Johnson graphs. We will examine rings of lists in J(n, j) that are
Hence, since one of the summands must be at least as large as the average,
\[ P(\{1,2,i\}) + P(\{1,3,i\}) + P(\{2,3,i\}) \ge \tfrac{1}{3}(1 - p) \]
for some i with 4 ≤ i ≤ 6. Then for C = {1, 2, 3, i}, we have
\[ \pi_P(C) = P(v) + P(\{1,2,i\}) + P(\{1,3,i\}) + P(\{2,3,i\}) \ge p + \tfrac{1}{3}(1 - p) \ge \tfrac{1}{3}. \tag{3.2} \]
In fact, if p = 0 and P is distributed uniformly on R1 (v), then the inequality in
(3.2) is equality, and 1/3 is the highest approval proportion for any committee. So in
this case, with the added assumption that P is supported on B1 (v), we can improve
the lower bound on πP (Ĉ) to 1/3. (This will also be a consequence of Theorem 5.1
below.)
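The extremal case just described can be verified by brute force. The sketch below is illustrative and rests on assumptions read off from the surrounding computation: n = 6 candidates, lists of size j = 3, committees of size k = 4, central list v = {1, 2, 3}, and R1(v) taken to be the lists that differ from v in exactly one element. The variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

v = frozenset({1, 2, 3})
lists = [frozenset(l) for l in combinations(range(1, 7), 3)]

# Ring of radius 1: lists missing exactly one element of v.
ring1 = [l for l in lists if len(v - l) == 1]

# Uniform distribution on the ring (so p = P(v) = 0).
P = {l: Fraction(1, len(ring1)) for l in ring1}

def approval(committee):
    """Total weight of the lists contained in the committee."""
    return sum(w for l, w in P.items() if l <= committee)

best = max(approval(frozenset(c)) for c in combinations(range(1, 7), 4))
print(len(ring1), best)   # prints 9 1/3
```

The maximum is attained by the committees {1, 2, 3, i}, each of which contains three of the nine lists in the ring.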
Rings about a vertex v always follow the same pattern: as r increases from 0 to
D, the rings increase in size monotonically for a time, then decrease monotonically
for a time. In particular, we can show:
Lemma 3.2. |Rr (v)| ≤ |Rr+1 (v)| holds if and only if
\[ r \le \frac{nj - j^2 - 1}{n + 2}. \]
Proof. A straightforward calculation using (3.1) shows that both conditions
are equivalent to the condition (j − r)(n − j − r) ≥ (r + 1)(r + 1).
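Lemma 3.2 can be spot-checked numerically. The sketch below assumes that the ring size is |R_r(v)| = C(j, r)·C(n − j, r), that is, choose r elements of v to drop and r elements outside v to add; this is consistent with the per-list proportion used later for concentric distributions, but formula (3.1) itself is not reproduced here. The function names are hypothetical.

```python
from math import comb

def ring_size(n, j, r):
    """Number of j-subsets of [n] that differ from a fixed list v in exactly r elements.

    Assumption: |R_r(v)| = C(j, r) * C(n - j, r).
    """
    return comb(j, r) * comb(n - j, r)

def lemma_3_2_agrees(n, j, r):
    """Compare ring growth with r <= (nj - j^2 - 1)/(n + 2), denominators cleared."""
    grows = ring_size(n, j, r) <= ring_size(n, j, r + 1)
    bound = r * (n + 2) <= n * j - j * j - 1
    return grows == bound

# Ring sizes for n = 10, j = 4: they rise and then fall as r runs from 0 to D = 4.
print([ring_size(10, 4, r) for r in range(5)])   # prints [1, 24, 90, 80, 15]
```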
It is helpful to visualize the Johnson graph as points on a sphere, with the list
v at the north pole, and the rings Rr (v) drawn as latitude lines. (This visualization
explains our terminology and captures the behavior of the relative sizes of the rings.)
Suppose the voter distribution P is supported on a neighborhood Bρ (v) of
radius ρ. To find a minimal guaranteed approval proportion, we are interested
in the worst-case scenario – the distribution that leads to a worst possible “best”
committee. Intuitively, we would expect this worst-case scenario to occur when the
voters’ lists are spread out as far away from v as possible, i.e., on the ring Rρ (v).
However, if ρ is too large, the outermost ring Rρ (v) will be too small, and votes on
that ring will be too close together. In that case, the worst-case scenario will result
from a more complicated distribution of voters.
Lemma 4.1. For any voting data P , let P ◦ represent the concentric distribution
centered at v that has the same weights as P . Then if C is a committee with the
highest approval proportion πP (C), and C ◦ is a committee with the highest approval
proportion πP ◦ (C ◦ ), then
πP (C) ≥ πP ◦ (C ◦ ).
In other words, uniformizing the voter distribution concentrically over the rings
centered at v can only decrease the popularity of a most popular committee.
Proof. We will treat the weights as fixed for now, and partition the set of
committees into classes by the number of elements a committee differs from the
central list v. We’ll show that the approval proportion πP dominates πP ◦ in each
class.
[Figure 2. Diagram relating the central list v, a list in the ring Rr (v), and a committee in Cm, with regions labeled r, j − r, m, r − m, and k − j + m − r; graphic not recoverable from the source.]
Let Cm denote the set of committees that differ from the list v in exactly m
candidates, i.e., m candidates are “missing”. Thus
Note that P(ℓ) appears multiple times in this sum, once for every instance where
a committee C in Cm contains ℓ. The number of times that happens depends on
what ring ℓ is in; for ℓ in Rr (v), the number of committees in Cm that contain ℓ is
$\binom{r}{r-m}\binom{n-j-r}{k-j+m-r}$, as calculated above. So our sum $\sum_{C \in C_m} \pi_P(C)$ now becomes:
\[
\sum_{C \in C_m} \pi_P(C)
  = \sum_{r=m}^{D} \Bigl[ \sum_{\ell \in R_r(v)} P(\ell) \Bigr] \binom{r}{r-m}\binom{n-j-r}{k+m-j-r}
  = \sum_{r=m}^{D} w_r \binom{r}{r-m}\binom{n-j-r}{k+m-j-r},
\]
where wr is the weight of the ring Rr (v). The sum index starts at r = m because
every list contained in a committee in Cm must, by construction, differ from v in at
least m elements, and the index ends at D, the diameter of the graph.
Then at least one of the committees Ĉ ∈ Cm must have approval proportion
equal to or better than average, or in other words,
\[
\pi_P(\hat{C}) \ge \frac{1}{\binom{j}{m}\binom{n-j}{k+m-j}} \sum_{r=m}^{D} w_r \binom{r}{r-m}\binom{n-j-r}{k+m-j-r}
  = \sum_{r=m}^{D} w_r \, \frac{r!\,(n-j-r)!\,(j-m)!\,(k+m-j)!}{(r-m)!\,(k+m-j-r)!\,j!\,(n-j)!}. \tag{4.1}
\]
Now, consider a similar analysis using concentrically distributed voter data P ◦
centered at v with the same ring weights wr as P . Specifically, we assume that each
list in Rr (v) receives voting proportion
\[ \frac{w_r}{\binom{j}{j-r}\binom{n-j}{r}}. \]
Then, we note that a committee C◦ in Cm will contain $\binom{j-m}{j-r}\binom{k+m-j}{r}$ of the lists
in the ring Rr (v). Again referring to Figure 2, we see that a list that is in Rr (v)
must contain j − r elements of v chosen from the j − m such elements in C◦, and
must contain r candidates not in v chosen from the k + m − j such candidates in
C◦.
Thus, if C◦ ∈ Cm, we have
\[
\pi_{P^\circ}(C^\circ) = \sum_{r=m}^{D} w_r \, \frac{\binom{j-m}{j-r}\binom{k+m-j}{r}}{\binom{j}{r}\binom{n-j}{r}}
  = \sum_{r=m}^{D} w_r \, \frac{(j-m)!\,(k+m-j)!\,r!\,(n-j-r)!}{(r-m)!\,(k+m-j-r)!\,j!\,(n-j)!}. \tag{4.2}
\]
So every committee in Cm has the same approval proportion under P ◦ . Comparing
the final expressions in (4.1) and (4.2), we see that both are equal to
\[ \sum_{r=m}^{D} w_r \, b_{r,m} \]
where
\[ b_{r,m} = \frac{r!\,(k+m-j)!\,(j-m)!\,(n-j-r)!}{(r-m)!\,(k+m-j-r)!\,j!\,(n-j)!}. \]
Thus
\[ \pi_P(\hat{C}) \ge \pi_{P^\circ}(C^\circ). \]
That is, the approval proportion of a most popular committee in Cm under P
will equal or exceed the approval proportion of any committee in Cm under P ◦ .
Then, taking the maximum over all m, we obtain the desired result.
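The agreement between the coefficients in (4.1) and (4.2) is a binomial identity that is easy to test on small cases. The sketch below is an illustration only, with hypothetical function names; it is meant to be called with indices for which all the binomial arguments are non-negative, namely m ≤ n − k and m ≤ r ≤ min(j, n − j, k + m − j).

```python
from fractions import Fraction
from math import comb

def b_average(n, k, j, r, m):
    """Coefficient of w_r in (4.1): committees in C_m containing a fixed
    ring-r list, divided by the number of committees in C_m."""
    return Fraction(comb(r, r - m) * comb(n - j - r, k + m - j - r),
                    comb(j, m) * comb(n - j, k + m - j))

def b_concentric(n, k, j, r, m):
    """Coefficient of w_r in (4.2): ring-r lists inside a fixed committee
    in C_m, divided by the number of lists in the ring."""
    return Fraction(comb(j - m, j - r) * comb(k + m - j, r),
                    comb(j, r) * comb(n - j, r))

# Small case n = 6, k = 4, j = 3 with r = 1, m = 1.
print(b_average(6, 4, 3, 1, 1), b_concentric(6, 4, 3, 1, 1))   # prints 2/9 2/9
```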
The numbers br,m are integral to comparing the number of lists contained in
various committees. Most important are their relative sizes for a fixed value of r.
Lemma 4.2. We have $r \le j\bigl(1 - \frac{j-m}{k+1}\bigr)$ (using the notation of Lemma 4.1) if
and only if br,m ≥ br,m+1 .
Proof. We start with the inequality br,m ≥ br,m+1 . After expanding the
binomial coefficients and canceling common terms, we see this is equivalent to
(j − m)(k + m + 1 − j − r) ≥ (r − m)(k + m + 1 − j)
which is equivalent to $r \le \frac{j(k+1+m-j)}{k+1}$, as desired.
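The equivalence in Lemma 4.2 can also be checked directly from the factorial formula for b_{r,m}. The sketch below is illustrative; the function names are hypothetical, and it is restricted to indices for which both b_{r,m} and b_{r,m+1} are given by the factorial formula with non-negative arguments, namely m + 1 ≤ r ≤ min(j, n − j, k + m − j).

```python
from fractions import Fraction
from math import factorial as f

def b(n, k, j, r, m):
    """b_{r,m} from the factorial formula in the proof of Lemma 4.1."""
    return Fraction(f(r) * f(k + m - j) * f(j - m) * f(n - j - r),
                    f(r - m) * f(k + m - j - r) * f(j) * f(n - j))

def lemma_4_2_agrees(n, k, j, r, m):
    """Compare r <= j(1 - (j-m)/(k+1)), with denominators cleared,
    against b_{r,m} >= b_{r,m+1}."""
    lhs = r * (k + 1) <= j * (k + 1 + m - j)
    rhs = b(n, k, j, r, m) >= b(n, k, j, r, m + 1)
    return lhs == rhs

print(b(6, 4, 3, 1, 0), b(6, 4, 3, 1, 1))   # prints 1/3 2/9
```

In this small case r = 1 satisfies r ≤ 3(1 − 3/5) = 1.2, and correspondingly b_{1,0} = 1/3 is at least b_{1,1} = 2/9.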
5. Main Theorem
We are now in a position to prove our main theorem. We now assume that
the voter distribution P is supported in some ball Bρ (v) centered at a list v, with
ρ < D so that Bρ (v) is not the entire Johnson graph.
since the weights wr are non-negative. Thus we see that Cm is more widely approved
of than Cm+1 , and C0 is always a most-approved committee.
Then we have to choose the wr to minimize the value of $\sum_{r=0}^{\rho} w_r b_{r,0}$. But
\[
b_{r,0} = \frac{(k-j)!\,(n-j-r)!}{(k-j-r)!\,(n-j)!}
        = \frac{(k-j)!}{(n-j)!} \cdot (n-j-r)(n-j-r-1) \cdots (k-j-r+1).
\]
The product (n − j − r)(n − j − r − 1) · · · (k − j − r + 1) always has n − k factors, beginning
at n − j − r, and each factor decreases as r increases. Thus b0,0 ≥ b1,0 ≥ · · ·, and the
minimum value of $\pi(C_0) = \sum_{r=0}^{\rho} w_r b_{r,0}$
is achieved by letting wρ = 1, and letting every other wr be 0. In this case, the
expression in (4.2) gives the desired lower bound.
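For concreteness, the resulting lower bound b_{ρ,0} = C(k − j, ρ)/C(n − j, ρ) can be evaluated as follows. This Python sketch is illustrative only. The hypothesis ρ ≤ j(1 − j/(k + 1)) and the optional factor α are taken from the corollaries in the next section; the function names b_r0 and guaranteed_proportion are hypothetical.

```python
from fractions import Fraction
from math import comb

def b_r0(n, k, j, r):
    """b_{r,0} = C(k-j, r) / C(n-j, r), valid for 0 <= r <= n - j."""
    return Fraction(comb(k - j, r), comb(n - j, r))

def guaranteed_proportion(n, k, j, rho, alpha=Fraction(1)):
    """Lower bound on the approval proportion of a most popular committee when a
    fraction alpha of the voters submit lists in a ball of radius rho."""
    if rho * (k + 1) > j * (k + 1 - j):   # rho <= j(1 - j/(k+1)), cleared of fractions
        raise ValueError("rho is too large for the bound to apply")
    return b_r0(n, k, j, rho) * alpha

# n = 6 candidates, committees of size k = 4, lists of size j = 3, radius rho = 1.
print(guaranteed_proportion(6, 4, 3, 1))   # prints 1/3
```

The value 1/3 agrees with the bound obtained earlier for distributions supported on B1(v) in the six-candidate example.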
6. Extensions
There are some natural extensions of Theorem 5.1 to other situations that may
be of interest. The first concerns what we can guarantee if only a fraction of the
voters submit votes in a neighborhood Bρ (v).
Corollary 6.1. Let $\rho \le j\bigl(1 - \frac{j}{k+1}\bigr)$. Suppose P is a voter distribution in
which a proportion α of the voters select lists in a ball Bρ (v), i.e., P (Bρ (v)) = α.
Then there exists a committee C with approval proportion
\[ \pi_P(C) \ge \frac{\binom{k-j}{\rho}}{\binom{n-j}{\rho}} \cdot \alpha. \]
Proof. This follows from applying Theorem 5.1 to just those voters whose
votes lie in Bρ (v). We see there exists a committee that satisfies $\binom{k-j}{\rho} / \binom{n-j}{\rho}$ of those
voters, which make up $\frac{\binom{k-j}{\rho}}{\binom{n-j}{\rho}} \cdot \alpha$ of the entire set of voters.
We can also consider slightly more general election procedures. Suppose in
choosing a k-member committee from a pool of n candidates, that we allow voters
to submit a smaller unranked list of between 1 and j candidates that they would prefer
to have on the committee. If these votes are sufficiently similar to v in a sense that
we make precise below, then we can obtain a result like (and because of) Theorem
5.1.
Corollary 6.2. Consider a distribution of votes over lists of size less than
or equal to j, in such a way that each submitted list is a subset of some list in a
ball Bρ (v) of radius $\rho \le j\bigl(1 - \frac{j}{k+1}\bigr)$ in the Johnson graph. Then there exists a
committee that satisfies at least $\binom{k-j}{\rho} / \binom{n-j}{\rho}$ of the voters.
Proof. We can convert such a distribution on subsets of size ≤ j into one on
j-sets satisfying the assumptions in Theorem 5.1, in a way that can only decrease
the number of satisfied voters. To see this, consider any voter who submits a list v′
containing fewer than j candidates. We replace that list with a list v′′ in Bρ (v) that
contains v′ as a subset. (Note that a voter who submitted v′′ would be satisfied by
strictly fewer committees than a voter who submitted the list v′.) Then Theorem 5.1
will apply to the resulting voting distribution, and because our alterations might
only have decreased the number of satisfied voters, the lower bound still holds.
One may also consider elections using thresholds as described in [7], and ask
whether our methods would extend in that context, in which a voter submitting
a list ℓ approves a committee C if C contains sufficiently many members of ℓ.
Our main result is the special case that each voter only approves committees that
contain all j of the candidates from their list. If voters approve committees that
contain at least s of the candidates from their list, for some s between 0 and j,
then the analogous result to Lemma 4.1 holds and can be proved in a similar way.
However, the analogues of the numbers br,m are significantly more complicated,
making the development of a result similar to Lemma 4.2 a barrier to proving a
generalization of the main theorem.
References
[1] Deborah E. Berg, Serguei Norine, Francis Edward Su, Robin Thomas, and Paul Wol-
lan, Voting in agreeable societies, Amer. Math. Monthly 117 (2010), no. 1, 27–39, DOI
10.4169/000298910X474961. MR2599465 (2011i:91065)
[2] Steven J. Brams, D. Marc Kilgour, and M. Remzi Sanver. A minimax procedure for electing
committees. Public Choice, 132:401–420, 2007.
[3] A. E. Brouwer, A. M. Cohen, and A. Neumaier, Distance-regular graphs, Ergebnisse der
Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)],
vol. 18, Springer-Verlag, Berlin, 1989. MR1002568 (90e:05001)
[4] Rosalie Carlson, Stephen Flood, Kevin O’Neill, and Francis Edward Su. A Turán-type prob-
lem for circular arc graphs. Preprint, 2010. Available at arXiv:1110.4205.
[5] Patrick Eschenfeldt. Approval voting in box societies. Harvey Mudd College Senior Thesis,
2012.
[6] Peter C. Fishburn, An analysis of simple voting systems for electing committees, SIAM J.
Appl. Math. 41 (1981), no. 3, 499–502, DOI 10.1137/0141041. MR639129 (82k:90023)
[7] Peter C. Fishburn and Aleksandar Pekeč. Approval voting for committees: Threshold
approaches. Preprint, 2004. Available at
http://dimacs.rutgers.edu/Workshops/DecisionTheory2/PekecFishburn04a.pdf. Cited with permission on May 31, 2013.
[8] Sarah Fletcher. Exploring agreeability in tree societies. Harvey Mudd College Senior Thesis,
2009.
[9] William V. Gehrlein, The Condorcet criterion and committee selection, Math. Social Sci. 10
(1985), no. 3, 199–209, DOI 10.1016/0165-4896(85)90043-5. MR813247
[10] Christopher S. Hardin, Agreement in circular societies, Amer. Math. Monthly 117 (2010),
no. 1, 40–49, DOI 10.4169/000298910X474970. MR2599466 (2011i:91067)
[11] D. Marc Kilgour and Erica Marshall, Approval balloting for fixed-size committees, Electoral
systems, Stud. Choice Welf., Springer, Heidelberg, 2012, pp. 305–326, DOI
10.1007/978-3-642-20441-8_12. MR3024666
[12] Maria Klawe, Kathryn Nyman, Jacob Scott, and Francis Edward Su. Double-interval societies.
To appear, this volume.
[13] Thomas C. Ratliff, Some startling inconsistencies when electing committees, Soc. Choice
Welf. 21 (2003), no. 3, 433–454, DOI 10.1007/s00355-003-0209-y. MR2023718 (2004i:91054)
[14] Thomas C. Ratliff. Selecting committees. Public Choice, 126:343–355, 2006.
Thomas C. Ratliff
Abstract. When selecting committees, the voters often place priority on the
overall composition of the committee in such a way that their preferences
cannot be decomposed into preferences on individual candidates. This is fre-
quently reflected in the voters’ desire that the elected committee respect some
diversity criterion, such as gender or academic rank. One aspect that makes
committee elections unwieldy is that it is usually impractical to ask voters for
a complete ranking of all possible committees. Our approach is to ask voters
for their top-ranked committee and devise weighted voting methods that re-
spect the diversity criterion. We expand previous results when the diversity
criterion consists of two categories to the analysis of three or more categories.
1. Introduction
The problem of designing elections to select a committee can raise different
issues than those of standard voting situations that produce a single winner or a
ranking of all candidates. One example of this difference is when the voters expect
the committee to contain members from several different categories. We can easily
imagine circumstances where a particular diversity criterion may be important:
The voters may want to ensure that a faculty committee contains members at the
ranks of assistant, associate, and full professor; an organization may require that
the speakers on a panel represent a variety of viewpoints; or a community may
want to select representatives from each of five precincts in the community for a
task force.
Rather than being of purely abstract interest, there are examples of such consid-
erations that have occurred in practice. Ratliff [12] describes an election at Wheaton
College where the desire was to have both women and men on a search commit-
tee, yet only men were elected, and Ratliff and Saari [13] explain how elections to
one class of the US National Academy of Sciences should contain members from
four different sections. Others have explored the barriers to diverse representation
in legislatures with single-member districts, including Fréchette, Maniquet, and
Morelli [8] who discuss the impact of a gender parity law in the French Assembly.
One view of committee elections is consistent with the idea of proportional
representation as expressed by Chamberlin and Courant [6] where each voter has
a representative on the deliberative body to reflect their views and interests. A
2010 Mathematics Subject Classification. Primary 91B12, 91B10; Secondary 91A20, 91B06 .
160 THOMAS C. RATLIFF
consequence of defining this strong connection between the voter and a single rep-
resentative is that it is reasonable to expect voters to express their preferences
by providing a ranking of the individual candidates. This approach is common
in the literature, including the paper by Chamberlin and Courant [6] and papers
by Gehrlein [9], Barberà, Sonnenschein, and Zhou [2] and Barberà, Massó, and
Neme [3].
There are other possible assumptions about the voters’ preferences in committee
elections. Bock, Day, and McMorris [4] propose a method that was motivated by a
consensus approach used in biological sequence analysis where each voter selects a
single candidate and the size of the committee is determined after the votes are cast.
Fishburn [7] assumes that each voter has an ordering of all possible committees
subject to certain conditions, one of which yields an asymmetric weak order on
the set of candidates for each voter. Brams, Kilgour, and Sanver [5] propose a
method where voters mark all candidates that are acceptable to them as members
of the committee, and the set of candidates that minimizes the maximum Hamming
distance to the individual voters is chosen.
Unlike these approaches, we are interested in committee elections where the
voters have a preference for the composition of the entire selected group and the
preferences may not be separable into preferences on individual candidates. A
second Wheaton election described in Ratliff [12] demonstrates that this situation
may occur in practice. The voters ranked the eight possible committees, and for
over half of the voters, their top-ranked and bottom-ranked committees had at least
one candidate in common. Thus, for at least half of the voters, their preferences are
not separable into preferences on individual candidates. Notice that non-separable
preferences can be a reflection of the voters’ desire that the committee achieve a
particular diversity criterion.
There are more details in Ratliff [12] and Ratliff and Saari [13], but we can see
how the problem of selecting a diverse committee can arise with a simple example.
Suppose an academic department is electing a committee of size k = 3 where
the candidates consist of three tenured faculty, Tanya, Terry, and Ted, and three
untenured faculty, Uri, Ursula, and Uma. We have 12 voters, each of whom selects
a diverse committee consisting of both tenured and untenured members.
Example 1.
# Voters Tanya Terry Ted Uri Ursula Uma
3 x x x
3 x x x
5 x x x
1 x x x
Totals 4 4 5 8 9 6
Despite every voter’s preference for a diverse committee, plurality selects the
homogeneous committee {Uri, Ursula, Uma}. One solution is to treat each pos-
sible committee as an indivisible unit and then apply one of the standard voting
approaches, such as the Borda count or an approval voting ballot. However, the
number of possible committees grows very large for even small committee sizes,
making it difficult for voters to provide meaningful preferences. In our simple ex-
ample, there are 18 possible diverse committees.
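The two numerical claims in this paragraph can be reproduced from the column totals of Example 1 alone. The Python sketch below is illustrative; it uses only the totals row of the table and the split into tenured and untenured candidates, and the variable names are hypothetical.

```python
from itertools import combinations

# Column totals from Example 1.
totals = {"Tanya": 4, "Terry": 4, "Ted": 5, "Uri": 8, "Ursula": 9, "Uma": 6}
tenured = {"Tanya", "Terry", "Ted"}

# Plurality: the k = 3 candidates with the largest tallies.
winners = set(sorted(totals, key=totals.get, reverse=True)[:3])

# Diverse committees: size 3 with at least one tenured and one untenured member.
diverse = [c for c in combinations(totals, 3)
           if 0 < len(tenured & set(c)) < 3]

print(sorted(winners), len(diverse))   # prints ['Uma', 'Uri', 'Ursula'] 18
```

Of the 20 three-person committees, only the all-tenured and the all-untenured ones fail the diversity criterion, and plurality elects the latter.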
SELECTING DIVERSE COMMITTEES FROM MULTIPLE CATEGORIES 161
The approach in this paper is to find a middle ground that does not require
the voters to rank all possible committees, but instead only asks for their top-
ranked committee. Continuing the approach in Ratliff and Saari [13], we will
consider methods that assign weights to the individual candidates and determine
when we are guaranteed to meet the universally desired diversity criterion. Since
we are not insisting upon a full ranking of all committees, there will necessarily be
some tradeoffs. The good news is that we obtain positive results, but the required
compromises may be so restrictive when there are three or more diversity categories
as to give pause before using this approach.
The paper is organized as follows. Section 2 gives the basic framework and
previous results from Ratliff and Saari [13] when there are two candidates for
each position on the committee, each coming from one of two diversity categories.
Section 3 describes the case of two categories without candidates slotted into specific
positions, and Section 4 builds on these techniques to describe the situation of three
or more diversity categories.
2. Basic framework
For the remainder of the paper, we make the following assumptions about the
voters’ preferences and the voting method used to select the committee of size k.
• There is a universally recognized diversity criterion that every voter de-
sires to see reflected in the composition of the committee. This criterion
may divide the candidates into two categories (e.g., tenured and untenured
faculty) or three or more categories.
• Each voter specifies their top-ranked committee of size k. Since the di-
versity criterion is assumed to be universal, every voter’s top-ranked com-
mittee includes at least one candidate from each diversity category.
• We will consider voting methods that assign weights to the candidates,
and the winning committee consists of the k candidates with the largest
tally.
• We assume that our voting method is neutral with respect to the categories
and candidates. We will explain this requirement in more detail below.
The simplest method that falls within our framework is straight plurality: As-
sign one point to each candidate in a voter’s preference, and the top k candidates
form the winning committee. As we have seen in Example 1, this can lead to a ho-
mogeneous committee. It will be convenient in the exposition to denote the tenured
faculty by {t1 , t2 , t3 } and the untenured faculty by {u1 , u2 , u3 }. The election now
becomes
Example 2.
# Voters t1 t2 t3 u1 u2 u3
3 x x x
3 x x x
5 x x x
1 x x x
Totals 4 4 5 8 9 6
The goal of this paper is to categorize when we can guarantee the outcome
reflects the preference for a diverse committee by assigning values other than the
uniform weights used by plurality.
162 THOMAS C. RATLIFF
Culture (IAC), where each profile is considered to be equally likely to appear, and
the Impartial Culture (IC), where each voter is equally likely to choose each committee.
By using geometric methods, Ratliff and Saari [13, Theorem 1] determine
the probability of obtaining a homogeneous outcome for k = 3. Assuming IAC, the
probability of obtaining a homogeneous outcome approaches 0.0625 as the number
of voters approaches infinity, and assuming IC, the probability of obtaining a ho-
mogeneous outcome approaches 0.0877 as the number of voters approaches infinity.
In other words, we should expect a homogeneous outcome in approximately one
out of every 16 elections assuming IAC and in approximately one out of every 12
elections assuming IC.
2.2. Ehrhart polynomials. We can repeat, and extend, the probability cal-
culations for IAC by the use of Ehrhart polynomials. See Lepelley [11] for a very
nice introduction to this theory applied to voting. We will briefly recap the major
points in the context of determining the probability of a homogeneous outcome for
k = 3 slotted elections using plurality. There are six possible diverse committees,
and we can denote the number of voters with each preference by n1 , n2 , . . ., n6 .
We can express the conditions that give the homogeneous outcome {u1 , u2 , u3 } as a
system of linear inequalities in n1 , . . . , n6 . Further, if we restrict to a fixed number
of voters, n1 + · · · + n6 = n, then the system defines a polytope in $\mathbb{R}^5$. If we can
determine the number of integer lattice points in the polytope, then we will know the
number of elections with n voters that will give a homogeneous outcome. We can
then use the well-known fact that the number of elections with six options is given
by $\binom{n+5}{5} = \frac{(n+5)(n+4)(n+3)(n+2)(n+1)}{5!}$ to calculate the probability
of a homogeneous outcome with n voters. We can then find the long range behavior
by taking the
limit as n approaches infinity.
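For small n, the lattice point count can be done by brute force. The following is a minimal sketch, assuming that a slot is won only with a strict majority (the treatment of ties is our assumption) and using function names of our own choosing. It enumerates every profile (n1, . . . , n6) and counts those in which u1, u2, and u3 all win their slots.

```python
from itertools import product
from math import comb

# The six diverse committees of a slotted k = 3 election: slot i is filled
# by t_i ("t") or u_i ("u"); the two homogeneous choices are excluded.
COMMITTEES = [c for c in product("tu", repeat=3) if len(set(c)) == 2]

def profiles(n, parts):
    """Yield every tuple of `parts` non-negative integers summing to n."""
    if parts == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in profiles(n - first, parts - 1):
            yield (first,) + rest

def all_u_wins(profile):
    """True when u_i has strictly more votes than t_i in every slot."""
    n = sum(profile)
    for slot in range(3):
        u_votes = sum(v for v, c in zip(profile, COMMITTEES) if c[slot] == "u")
        if 2 * u_votes <= n:   # t_i receives the remaining n - u_votes votes
            return False
    return True

def homogeneous_u_count(n):
    """Number of profiles with n voters that elect {u1, u2, u3}."""
    return sum(all_u_wins(p) for p in profiles(n, len(COMMITTEES)))

def probability_all_u(n):
    """Share of the C(n+5, 5) equally likely profiles that elect {u1, u2, u3}."""
    return homogeneous_u_count(n) / comb(n + 5, 5)
```

This enumeration grows quickly with n and covers only one of the two homogeneous outcomes, so it illustrates the counting problem rather than replacing the Ehrhart machinery; for small n the ratio need not be close to the limiting value.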
Ehrhart polynomials provide a tool to count the number of integer lattice points
in the dilation of a polytope with rational vertices, where the dilation of a polytope
P is the set nP = {nx | x ∈ P }. Thus, if we define our original polytope P by
n1 + · · · + n6 = 1, then the number of integer lattice points in nP will give the
number of elections with n voters that determine a homogeneous outcome. If the
vertices of the polytope P are integers, then there is a single Ehrhart polynomial
f associated with P . If the vertices are non-integer rationals, then we obtain a
list of polynomials g1 , . . . , gp where p is a divisor of the lcm of the denominators
of the vertices of P . In this case, the Ehrhart polynomial is a quasi-polynomial f
of degree d and period p where f (n) = gi (n) if n ≡ i mod p. In either case, f (n)
gives the number of integer lattice points in the dilation nP . An important feature
of the Ehrhart polynomial is that the leading coefficient is independent of n, and
thus is the same for all gi .
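A one-dimensional toy case, which is not taken from the paper, shows the quasi-polynomial behavior. Take P = [0, 1/2], whose vertex 1/2 has denominator 2. The sketch below counts the lattice points in nP directly and then recovers the two constituent polynomials by interpolation, in the same spirit as the workflow described next.

```python
from fractions import Fraction

def lattice_points(n):
    """Count integers x with 0 <= x <= n/2, the lattice points of n * [0, 1/2]."""
    return sum(1 for x in range(n + 1) if 2 * x <= n)

def fit_line(n_a, n_b):
    """Interpolate the degree-1 polynomial through two sample dilations.

    Returns (slope, intercept) as exact fractions.
    """
    slope = Fraction(lattice_points(n_b) - lattice_points(n_a), n_b - n_a)
    intercept = lattice_points(n_a) - slope * n_a
    return slope, intercept

# Two samples per residue class suffice because the degree is 1 and the
# period is 2.
g_even = fit_line(2, 4)   # constituent used when n is even
g_odd = fit_line(1, 3)    # constituent used when n is odd
```

Both constituents share the leading coefficient 1/2, the length of P, and differ only in the constant term, which is the feature used above to take limits without tracking the residue of n.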
Fortunately, there are tools available to aid with the computation. The workflow
for this paper was to use the program lrs from Avis [1] to calculate the
vertices of the polytope P . This gave an upper bound for the period of the quasi-
polynomial. The program count from Köppe [10] has many capabilities, including
the ability to calculate the value of the Ehrhart polynomial for a specific value of n.
Since we know the degree and the period, these programs allow us to find enough
specific values so that we can then determine the Ehrhart polynomial through in-
terpolation. Using these techniques, we found that the Ehrhart polynomial for the
We will show that plurality can give a homogeneous committee; we will show that
the ability to guarantee a diverse outcome is sensitive to the number of candidates
in each category; we will show when we can assign weights to guarantee a diverse
outcome; and we will calculate the probability of a homogeneous outcome using
plurality. The last goal is somewhat limited by computational restrictions, but we
will also develop a geometric framework to give some intuition for what we should
expect when the direct probability calculations are not feasible.
Assume that the diversity criterion divides the candidates into two categories,
{t1 , t2 , . . . , tr } and {u1 , u2 , . . . , us }. Notice that we allow r ≠ s so there may be
a different number of candidates in each group. As before, we are selecting a
committee of size k, and we assume that every voter specifies a diverse committee
on their ballot. Also note that r + s > k or else the election is impossible (r + s <
k) or trivial (r + s = k). We also assume that r > 1. Otherwise, any diverse
committee must contain t1 , and the election reduces to selecting k − 1 candidates
from {u1 , u2 , . . . , us } with no diversity criterion. Similarly, we assume s > 1.
Our analysis breaks down into several cases as determined by the regions shown
in Figure 1.
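[Figure 1 graphic: the (r, s)-plane with axes r and s, the boundary lines r = 1, s = 1, r = k, s = k, and r + s = k, and the regions labeled I, II, III, and IV.]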
Figure 1. The fundamental regions for the two category case with
candidates {t1 , t2 , . . . , tr } and {u1 , u2 , . . . , us }. The arrows indi-
cate to which region the boundaries belong.
3.2. Case II: r < k and s ≥ k (or s < k and r ≥ k). Using a similar
construction to Example 2, we can show that it is possible for plurality to give
a homogeneous committee. We will look at a specific case before formulating the
general construction.
Suppose we are selecting a committee of size k = 3 with candidates {t1 , t2 } and
{u1 , u2 , u3 }. Consider the following election:
Example 4.
# Voters t1 t2 u1 u2 u3
1 x x x
1 x x x
1 x x x
1 x x x
1 x x x
1 x x x
Totals 3 3 4 4 4
We are essentially equally distributing one point between t1 and t2 , and equally
distributing 2 points among u1 , u2 , and u3 . Therefore, with six voters we can
achieve a homogeneous outcome using plurality.
The same process can work for larger values of k: We equally distribute one
point among t1 , t2 , . . . , tr and k − 1 points among u1 , u2 , . . . , uk using n = lcm(r, k)
voters. Then each candidate ti will receive $n/r$ points and each ui will receive
$(k-1)n/k$ points. Since r ≥ 2 and k ≥ 3, we have $\frac{1}{r} < \frac{k-1}{k}$, and the
outcome will be the homogeneous committee {u1 , . . . , uk }.
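One concrete way to realize this distribution is sketched below. The cyclic assignment of candidates to voters is our illustrative choice (any assignment that spreads the points evenly would do), and the function names are hypothetical.

```python
from math import gcd

def case_two_profile(r, k):
    """Build n = lcm(r, k) ballots of size k.

    Voter j chooses t_(j mod r) together with every u_1..u_k except
    u_(j mod k), so each ballot has one t and k - 1 u's.
    """
    n = r * k // gcd(r, k)
    ballots = []
    for j in range(n):
        t_choice = f"t{j % r + 1}"
        u_choices = [f"u{i + 1}" for i in range(k) if i != j % k]
        ballots.append([t_choice] + u_choices)
    return ballots

def plurality_tally(ballots):
    """One point for each candidate appearing on a ballot."""
    tally = {}
    for ballot in ballots:
        for candidate in ballot:
            tally[candidate] = tally.get(candidate, 0) + 1
    return tally

example4_tally = plurality_tally(case_two_profile(2, 3))
```

For r = 2 and k = 3 this reproduces the totals of Example 4: each t receives n/r = 3 points and each u receives (k − 1)n/k = 4 points.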
Also note that the same construction will give a homogeneous outcome when
using weights other than plurality as long as the total weight W1 assigned to the
t’s is less than the total weight W2 assigned to the u’s. If W1 > W2 , then the
neutrality assumption allows us to switch the assigned weights so that the t’s have
total weight W2 and the u’s have total weight W1 , and we can then obtain a
homogeneous outcome. The one exception occurs if $r < \lceil k/2 \rceil$, in which case we
cannot apply the neutrality condition. For example, if k = 5, r = 2, s = 5 and the
method assigns weights (5, 5, 1, 1, 1) to (t1 , t2 , u1 , u2 , u3 ), then we cannot swap the
weights assigned to the t’s and u’s since there are only two t’s. In this exceptional
case, we will always obtain a diverse outcome although the weights assigned to the
two categories are not equal.
We now argue that assigning equal weights to each category will ensure a diverse
outcome. First notice that it is impossible to obtain a homogeneous committee
consisting of candidates from only {t1 , . . . , tr } since r < k. The argument that
we will obtain a diverse committee is deceptively simple, based only on averaging.
Suppose that the total weight assigned to each category is W and that we have
n voters. Then the average tally for the candidates {t1 , t2 , . . . , tr } is $Wn/r$, and
the maximum average for any set of k candidates from {u1 , u2 , . . . , us } is $Wn/k$.
Since r < k, the average tally for the t’s is greater than the average tally for any
subset of k candidates from the u’s. Therefore, the individual tallies of any subset
of k candidates from the u’s cannot be greater than the tally of every candidate
t1 , . . . , tr . Further, notice that at least one t must have a higher tally than at least
one element of any subset of k candidates from the u’s. Thus, the committee must
contain at least one candidate from {t1 , . . . , tr }.
Applying this method to Example 4 with a total weight of 2 for each category,
our tallies become:
Example 5.
# Voters t1 t2 u1 u2 u3
1 2 1 1
1 2 1 1
1 2 1 1
1 2 1 1
1 2 1 1
1 2 1 1
Totals 6 6 4 4 4
Thus, the winning committee will include t1 , t2 , and one of the candidates
u1 , u2 , or u3 depending upon the method chosen for breaking ties, guaranteeing
a diverse committee independent of the tiebreaker chosen. One may object to the
winning committee consisting of two candidates who appeared on three voters’
ballots while excluding two candidates who appear on four voters’ ballots, but the
outcome does meet the prime directive of producing a diverse committee.
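The equal-weight tally and the averaging bound can be checked numerically. This is a sketch under two assumptions of ours: the six ballots are the six committees with one t and two u's (a hypothetical profile consistent with the totals of Examples 4 and 5), and the total weight W of a category is split evenly among that category's candidates on a ballot.

```python
from fractions import Fraction

T = ["t1", "t2"]
U = ["u1", "u2", "u3"]

# Hypothetical ballots: every committee with one t and two u's, once each.
BALLOTS = [{t} | (set(U) - {u}) for t in T for u in U]

def equal_weight_tally(ballots, total_weight):
    """Give each category `total_weight` per ballot, split evenly among
    the candidates of that category appearing on the ballot."""
    tally = {c: Fraction(0) for c in T + U}
    for ballot in ballots:
        for group in (T, U):
            chosen = [c for c in group if c in ballot]
            for c in chosen:
                tally[c] += Fraction(total_weight, len(chosen))
    return tally

W, n, r, k = 2, len(BALLOTS), len(T), 3
tally = equal_weight_tally(BALLOTS, W)
average_t = sum(tally[c] for c in T) / r   # equals W * n / r
average_u = sum(tally[c] for c in U) / k   # equals W * n / k, since s = k here
```

Because r < k, the average over the t's (here 6) exceeds the average over any k of the u's (here 4), so the u's cannot fill the whole committee.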
3.3. Case III: r = k and s = k. Notice that all previous examples in Ratliff
and Saari [13] showing that it is possible to obtain a homogeneous committee apply.
That is, we have only removed the requirement that the candidates {t1 , . . . , tk }
and {u1 , . . . , uk } are slotted against each other in head-to-head elections. Thus,
the results of Theorem 1 showing that unequal weights can give a homogeneous
committee hold for this unslotted situation. However, we need a different argument
that assigning equal weights to both categories will guarantee a diverse outcome
since there are some preferences valid in our current scenario that are not valid in
slotted elections, such as {t1 , u1 , u2 , . . . , uk−1 }.
Fortunately, the same averaging argument used above immediately shows that
when assigning equal weights to both categories, it is impossible for all k candidates
in one category to have a larger tally than every candidate in the other category.
3.4. Case IV: r > k and s ≥ k (or s > k and r ≥ k). Every homogeneous
outcome that is possible in Case III is also possible in this situation since a ballot
from Case III can be padded with dummy candidates to make it an election from
this case. Unfortunately, we can get a homogeneous outcome even when using the
same weights for both categories.
We will consider a specific example to illustrate the argument that also holds
in the general case. Let k = 3, r = 4 and s = 3 so that we are electing a committee
of size 3 from candidates {t1 , t2 , t3 , t4 } and {u1 , u2 , u3 }. Consider the following
election where every voter selects a diverse committee, and we use the total weight
of 2 for each category.
Example 6.
# Voters t1 t2 t3 t4 u1 u2 u3
1 1 1 2
1 1 1 2
1 1 1 2
1 1 1 2
1 1 1 2
1 1 1 2
Totals 3 3 3 3 4 4 4
Notice this block assigns a total of W to every candidate except for u1 , which
receives 2W . We can mimic this construction for a total of k blocks where the ith
block assigns a weight of W to every candidate except for ui , which receives 2W .
The cumulative effect will be an election that gives a total of k · W to t1 , . . ., tk+1
and (k + 1) · W to u1 , . . ., uk , resulting in the selection of a homogeneous committee
{u1 , . . . , uk }.
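The totals in Example 6 can be reproduced with a small sketch. The six ballots below are a hypothetical choice that matches the printed totals (3 for each t, 4 for each u): for each ui, one voter picks {t1, t2, ui} and another picks {t3, t4, ui}, and each category's weight of 2 is split evenly among its candidates on the ballot.

```python
T = ["t1", "t2", "t3", "t4"]
U = ["u1", "u2", "u3"]
W = 2  # total weight per category, as in Example 6

# Hypothetical ballots consistent with the totals of Example 6.
BALLOTS = [pair + [u] for u in U for pair in (["t1", "t2"], ["t3", "t4"])]

tally = dict.fromkeys(T + U, 0)
for ballot in BALLOTS:
    ts = [c for c in ballot if c in T]
    us = [c for c in ballot if c in U]
    for c in ts:
        tally[c] += W / len(ts)   # two t's share the weight: 1 each
    for c in us:
        tally[c] += W / len(us)   # the single u receives the full weight

# The three largest tallies; no tie-breaking is needed in this profile.
winners = sorted(tally, key=tally.get, reverse=True)[:3]
```

Even though both categories carry the same total weight on every ballot, the winning committee here consists of u1, u2, and u3 only.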
Therefore, we have proved the following:
these two categories. To avoid trivial or empty scenarios, assume that r > 1, s > 1,
and r + s > k.
To reflect the diversity criterion, assume that an admissible ballot must have
at least one candidate from each category. Further, assume that the voting method
assigns weights to the candidates and is neutral with respect to the categories and
candidates.
(1) If r < k, s < k, and r + s > k, then any method will meet the diversity
criterion.
(2) If r < k and s ≥ k (or s < k and r ≥ k), then the diversity criterion
will always be met if the sums of the weights assigned to the candidates of each
category are equal. If $r \geq \lceil k/2 \rceil$, then the statement becomes “if and only
if ”.
(3) If r = k and s = k, then the diversity criterion will always be met if and
only if the sums of the weights assigned to the candidates of each category are
equal.
(4) If r > k and s ≥ k (or s > k and r ≥ k), then it is impossible to guarantee
that the diversity criterion will always be met.
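The four cases can be summarized as a small lookup. This is a sketch with names of our own choosing; the one-line summaries paraphrase the statements above and omit the refinement in (2) concerning $\lceil k/2 \rceil$.

```python
def classify(r, s, k):
    """Return the region of Figure 1 ("I" to "IV") for category sizes r and s
    and committee size k."""
    if r <= 1 or s <= 1 or r + s <= k:
        raise ValueError("need r > 1, s > 1 and r + s > k")
    if r < k and s < k:
        return "I"
    if r < k or s < k:      # exactly one category has fewer than k candidates
        return "II"
    if r == k and s == k:
        return "III"
    return "IV"             # both at least k, and at least one exceeds k

GUARANTEE = {
    "I": "any method meets the diversity criterion",
    "II": "equal total weight per category is sufficient",
    "III": "equal total weight per category is necessary and sufficient",
    "IV": "no weighting can guarantee diversity",
}
```

For instance, `GUARANTEE[classify(4, 3, 3)]` looks up the setting of Example 6, which falls in Case IV.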
\[ \frac{436480777}{9486781262199521280000000} \cdot 17! \cdot 2 \approx 0.03273 \]
Notice this is less than the probability of 0.0625 for the slotted case that we calcu-
lated in Section 2.2. We will give some intuition why the value is less in the next
section.
For k = r = s = 4, there are 68 diverse committees in the unslotted case,
rather than 14 in slotted elections. The polytope for the Ehrhart polynomial for
{u1 , u2 , u3 , u4 } will lie in $\mathbb{R}^{67}$, which creates a problem that is much too
computationally intensive to complete in a reasonable amount of time. However, we expect
the value to be less than the 0.05176 found for the slotted case in Section 2.2 for
reasons explained in the next section. Larger values of k, or examples from Case II
or IV, are also too computationally intensive to compute using the current tools.
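The committee counts quoted here, and the resulting polytope dimensions, follow from elementary counting. A minimal sketch, with function names of our own:

```python
from math import comb

def slotted_diverse(k):
    """Slotted case: each of the k slots is filled by its t or its u, and the
    two homogeneous choices are excluded."""
    return 2**k - 2

def unslotted_diverse(r, s, k):
    """Unslotted case: all size-k committees minus the homogeneous ones."""
    return comb(r + s, k) - comb(r, k) - comb(s, k)

def polytope_dimension(num_committees):
    """Fixing the number of voters removes one degree of freedom."""
    return num_committees - 1

counts = {k: (slotted_diverse(k), unslotted_diverse(k, k, k)) for k in (3, 4, 5)}
```

The dimension grows from 5 to 17 for k = 3 and from 13 to 67 for k = 4 when moving from slotted to unslotted elections, which is the source of the computational difficulty.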
Therefore, plurality can give a homogeneous outcome because the convex hull
determined by {p1 , p2 , . . . , p18 } intersects the regions corresponding to homogeneous
outcomes p19 and p20 . However, by using the same weights for each category, we
are changing the points that determine the convex hull. For example, rather than
(1, 1, 0, 0, 0, 1) we would use (1, 1, 0, 0, 0, 2). This deformation adjusts the convex
hull so that it no longer intersects the homogeneous regions corresponding to p19
and p20 , guaranteeing that the outcome is a diverse committee.
This interpretation can also give some insight into the decrease in the probability
of obtaining a homogeneous outcome when using plurality and moving from the
slotted to the unslotted elections for k = 3 and three candidates of each category.
With the slotted election, there are six diverse committees and two homogeneous
committees. With the unslotted elections, there are 18 diverse committees and still
only two homogeneous committees. This suggests that the 12 additional diverse
committees in the unslotted case expand the convex hull to have a larger portion
outside of the homogeneous regions.
This can give us some insight into what we should expect for larger values of k as
well. With k = 4 and four candidates in each category, the slotted case will have 14
diverse committees, two homogeneous committees, and the convex hull will lie in $\mathbb{R}^8$.
Compared to the k = 3 case, it is therefore not surprising that the probability of a
homogeneous outcome using plurality decreases from 0.0625 to 0.05176. Further, in
the unslotted case, there are $\binom{8}{4} = 70$ possible committees. By the same argument
as above, we should expect that the probability of a homogeneous outcome using
plurality is lower in the unslotted case than in the slotted case.
Although the direct calculations are too computationally intensive to complete
at this time, we should anticipate similar results to hold for larger values of k.
That is, the probability of obtaining a homogeneous outcome using plurality will
be greater for the slotted case than in the unslotted case for any value of k, and
the probabilities will decrease as k increases. Specifically, we should expect the
probability of a homogeneous outcome using plurality for k = 6 with six candidates
in each category to be 5% or less for either the slotted or unslotted case. This
diminishing probability could play a role in determining whether or not one chooses
to implement this approach.
This framework can also be applied if r ≠ s, but there will be more homogeneous
committees that will not be as symmetrically located in the positive orthant
as in the r = s case. This makes the comparison for different values of r, s and k
somewhat more subtle and less fruitful.
Example 8.
# Voters a1 a2 b1 b2 c1 c2
1 x x x x
1 x x x x
1 x x x x
1 x x x x
Totals 2 2 3 3 3 3
The problem is the same as the one we have seen before. By assigning unequal
weights to the categories, we are able to limit the weights in one category while
favoring the candidates in other categories in order to exclude all candidates from
the first category.
For N = 3 and k > 4, the trivial solution is to restrict the number of candidates
in each category to fewer than $\lceil k/2 \rceil$ candidates so that it is impossible to select k
candidates without including at least one candidate from each category. This is
analogous to the triangular region in Figure 1 below the line r + s = k. However,
this is a very unsatisfactory solution because this severely limits the total number
of candidates. For example, an election to select a committee of size k = 7 would
have at most $\lceil 7/2 \rceil - 1 = 3$ candidates per category, giving nine candidates in total.
This effectively changes the election into selecting a few candidates to exclude
rather than a choice of which to include. Notice we exclude k = 3 or k = 4 since
$\lceil k/2 \rceil - 1 = 1$ and each category would have only a single candidate, giving three
candidates total. To see the limitations of this restriction, consider the following
table.
Example 9.
k     $\lceil k/2 \rceil - 1$     Total # Candidates     Total # Excluded
5     2     6     1
6     2     6     0
7     3     9     2
8     3     9     1
10    4     12    2
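The rows of Example 9 follow directly from the bound $\lceil k/2 \rceil - 1$ on the number of candidates per category. A short sketch, with a function name of our own, regenerates them:

```python
def three_category_limit(k):
    """For N = 3 categories, return (candidates per category, total candidates,
    candidates excluded from a size-k committee) at the bound ceil(k/2) - 1."""
    per_category = -(-k // 2) - 1      # integer form of ceil(k/2) - 1
    total = 3 * per_category
    return per_category, total, total - k

rows = {k: three_category_limit(k) for k in (5, 6, 7, 8, 10)}
```

In every row the number of excluded candidates is at most two, which makes concrete the objection that the election reduces to choosing whom to leave out.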
If each category has exactly $\lceil k/2 \rceil$ candidates, then a construction similar to
Example 8 for k = 4 shows that plurality can exclude the first category from
the outcome. We can easily extend this to any method that does not assign equal
weights to all three categories. Suppose we have a ballot that assigns total weights
of W1 ≤ W2 ≤ W3 to the three categories, where at least one is strict inequality.
We then create an election similar to our example in Case IV in Section 3 where
we use $\lceil k/2 \rceil$ voters to evenly distribute the total weights of W1 , W2 , and W3 among
the candidates in Category 1, Category 2, and Category 3, respectively, and then
use $\lceil k/2 \rceil$ voters to evenly distribute the total weights of W1 , W3 , and W2 among
the candidates in Category 1, Category 2, and Category 3, respectively. This will
give us an election of $2\lceil k/2 \rceil$ voters where each candidate in Category 1 receives
2W1 points, and each candidate in Categories 2 and 3 receives W2 + W3 points. Since
W1 ≤ W2 and W1 < W3 , this ensures that 2W1 < W2 + W3 and no candidate from
Category 1 will be included in the committee.
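The final inequality, and the reason Categories 2 and 3 can fill the committee on their own, can be written out in one line. This is a restatement of the argument above, not an additional result.

```latex
\[
  2W_1 \;=\; W_1 + W_1 \;\le\; W_2 + W_1 \;<\; W_2 + W_3,
  \qquad\text{and}\qquad
  2\left\lceil \tfrac{k}{2} \right\rceil \;\ge\; k .
\]
```

The first chain uses W1 ≤ W2 and then W1 < W3; the second inequality says that the candidates of Categories 2 and 3, all of whom outscore every candidate of Category 1, are numerous enough to occupy all k seats.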
A nearly identical averaging argument as before also shows that if we use the
same total weight for each category and each category has $\lceil k/2 \rceil$ candidates, then
These same arguments generalize to N > 3 categories with the dividing line
being $\lceil k/(N-1) \rceil$. If there are fewer than $\lceil k/(N-1) \rceil$ candidates per category, then it is
impossible to select k candidates without including at least one candidate from each
category. If there are exactly $\lceil k/(N-1) \rceil$ candidates per category, then any method that
does not assign the same total weight to every category can give an outcome that
excludes one category using a similar construction as above. Using the same total
weights per category in this case must include a candidate from every category by
following the same averaging argument: To exclude Category 1 from the output, all
$\lceil k/(N-1) \rceil$ candidates in another category must have a larger tally than every candidate
in Category 1, but this is impossible since both categories have the same number
of candidates and the same average tally. If there are more than $\lceil k/(N-1) \rceil$ candidates
per category, then we can mimic the constructions used before to show that any
method can give a homogeneous outcome.
While the second part of the theorem does provide methods for guaranteeing
that the selected committee will include candidates from every category, in prac-
tice this is unlikely to be implemented. The criticism from Section 4.1, while less
pronounced than in the N = 3 category case, is still evident with N > 3 categories.
Specifically, more candidates on the ballot are included in the winning committee
than are excluded. For example, with N = 4 categories, selecting a committee of
size k = 9 would have $\lceil 9/3 \rceil = 3$ candidates per category, or 12 candidates total, and
selecting a committee of size k = 10 would have $\lceil 10/3 \rceil = 4$ candidates per category,
or 16 candidates total.
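The same arithmetic can be run for any number of categories. The sketch below, with function names of our own, uses the threshold $\lceil k/(N-1) \rceil$ candidates per category and reports how many candidates are excluded from the committee.

```python
def candidates_per_category(k, n_categories):
    """Threshold ceil(k / (N - 1)) for a committee of size k and N categories."""
    return -(-k // (n_categories - 1))

def ballot_summary(k, n_categories):
    """Return (candidates per category, total candidates, candidates excluded)."""
    per_category = candidates_per_category(k, n_categories)
    total = n_categories * per_category
    return per_category, total, total - k

summaries = [ballot_summary(9, 4), ballot_summary(10, 4)]
```

In both examples from the text, the candidates included in the committee (9 and 10) outnumber those excluded (3 and 6).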
Also note that while it is theoretically feasible to use Ehrhart polynomials
to calculate the probability of a homogeneous outcome using plurality, the large
number of possible committees places the corresponding polytope in a very high
dimensional space, making the required calculations infeasible with current tools.
5. Conclusion
Our primary goal for this paper was to develop methods for electing a commit-
tee of size k that respect a universal diversity criterion shared by the electorate. In
building off previous work in Ratliff and Saari [13], we restricted to methods that
assign points to the individual candidates and only require the voters to indicate
their top-ranked committee. When there are two categories determined by the di-
versity criterion, Theorem 2 provides us with some good news: We create a ballot
with k candidates from each category, and we are guaranteed a diverse outcome
by assigning the same total weight to the candidates from each category on each
voter’s ballot. Furthermore, this can be applied even if the diversity criterion is
not determined before the election: Create a ballot with 2k candidates, and ask
each voter to divide their top-ranked committee into two categories representing
their personal diversity criterion and assign an equal total weight to each set. If the
preferences of all voters divide the candidates into two disjoint sets, then Theorem 2
guarantees that the outcome will respect this diversity criterion.
However, our results in Theorems 3 and 4 are not nearly so satisfying for
three or more categories. The only way to guarantee a diverse outcome within our
framework is to limit the number of candidates so severely that it would almost
certainly be unacceptable in practice. This suggests that any acceptable method
that respects a diversity criterion with three or more categories would require more
extensive preferences from the voters and a more elaborate voting method.
References
[1] D. Avis, lrslib Ver 4.2, lrs home page, <http://cgm.cs.mcgill.ca/~avis/C/lrs.html>, accessed
Nov 19, 2012.
[2] Salvador Barberà, Hugo Sonnenschein, and Lin Zhou, Voting by committees, Econometrica
59 (1991), no. 3, 595–609, DOI 10.2307/2938220. MR1106507 (92b:90060)
[3] Salvador Barberà, Jordi Massó, and Alejandro Neme, Voting by committees under constraints,
J. Econom. Theory 122 (2005), no. 2, 185–205, DOI 10.1016/j.jet.2004.05.006. MR2141988
(2006b:91050)
[4] Hans-Hermann Bock, William H. E. Day, and F. R. McMorris, Consensus rules for committee
elections, Math. Social Sci. 35 (1998), no. 3, 219–232, DOI 10.1016/S0165-4896(97)00033-4.
MR1618447 (99c:90033)
[5] S.J. Brams, D.M. Kilgour, M.R. Sanver, A minimax procedure for electing committees, Public
Choice 132 (2007) 401–420.
[6] J.R. Chamberlin, P.N. Courant, Representative deliberations and representative decisions:
Proportional representation and the Borda rule, American Political Science Review 77 (1983)
718–733.
[7] Peter C. Fishburn, An analysis of simple voting systems for electing committees, SIAM J.
Appl. Math. 41 (1981), no. 3, 499–502, DOI 10.1137/0141041. MR639129 (82k:90023)
[8] G. Fréchette, F. Maniquet, M. Morelli, Incumbents’ interests and gender quotas, American
Journal of Political Science 52 (2008) 891–909.
[9] William V. Gehrlein, The Condorcet criterion and committee selection, Math. Social Sci. 10
(1985), no. 3, 199–209, DOI 10.1016/0165-4896(85)90043-5. MR813247
[10] M. Köppe, LattE macchiato Ver 1.2, LattE macchiato,
<https://www.math.ucdavis.edu/~mkoeppe/latte/>, accessed Nov 19, 2012.
[11] Dominique Lepelley, Ahmed Louichi, and Hatem Smaoui, On Ehrhart polynomials and prob-
ability calculations in voting theory, Soc. Choice Welf. 30 (2008), no. 3, 363–383, DOI
10.1007/s00355-007-0236-1. MR2377347 (2008m:91065)
[12] T.C. Ratliff, Selecting committees, Public Choice 126 (2006) 343–355.
[13] T. C. Ratliff, D. G. Saari, Complexities of electing diverse committees, Social Choice and
Welfare (2013) doi:10.1007/s00355-013-0773-8.
Brian Hopkins
1. Introduction
The most fundamental concept in noncooperative game theory is the 2×2 game,
where two decision makers each have two choices and the relative payoffs for each
player over the four possible outcomes are given in a bimatrix. Prisoners’ Dilemma
and Chicken are well-known examples, but there are many others, including games
that are asymmetric and have ties among some of the players’ preferences.
We restrict our attention to 2 × 2 games with ordinal ranks (which does not
change analysis of dominant strategies or Nash equilibria, for instance). In §2, we
further require that each player have distinct preferences over the four possible
outcomes. Robinson and Goforth counted 144 such games and formalized a system
of connecting them [7] which can be described as using the “neighborly transpo-
sitions” (12), (23), (34) from the symmetric group S4 . This connecting system is
simpler and more uniform than related earlier attempts, which are more properly
taxonomies than networks. We summarize the Robinson-Goforth system, considering
it as a graph with edges determined by the exchanges, revisit enumeration
results, and offer a new graph theoretic analysis.
In §3, we allow players to have ties in their ranking of possible outcomes.
There are many more such games, and it would be helpful to integrate them into
the existing system of games with distinct ranks. Games with a single tie for
one player can be identified with edges in the Robinson-Goforth graph, but attempts
to incorporate other games into the graph have fallen short [4, 8]. We sketch a
collection of simplices to hold all 1,413 ordinal rank 2 × 2 games and illustrate
finding the games adjacent or incident to an example.
Although this article is primarily motivated by mathematical structures, we do
mention game theoretic motivations throughout.
© 2014 American Mathematical Society
178 BRIAN HOPKINS
(1, 4) (3, 3)
(2, 1) (4, 2)
2.2. Connections. One of the challenges in using game theory to model the
interaction of decision makers is determining the ranks of the various possible out-
comes. The best and worst possibilities may be clear enough, but it may be unclear
how to rank the middle two. With this in mind, Robinson and Goforth connect
games by neighborly transpositions that swap adjacent rankings. In particular,
there are six exchanges, three per player: R12 swaps the 1 and 2 in the row player’s
ranking, similarly R23 , R34 , C12 , C23 , and C34 . For instance, the game from Figure
1 and its image under the C12 exchange are shown in Figure 2.
With these exchanges, each game is connected to six others. In other language,
let the Robinson-Goforth graph have the 144 games as vertices connected by edges
determined by the six transpositions. This gives a 6-regular graph; the neighbors
of Prisoners’ Dilemma are shown in Figure 3. (Note that regularity would be lost if
asymmetric pairs of games were not considered separately; symmetric games would
only have three neighbors.)
The graph has (144 · 6)/2 = 432 edges. Analysis of the adjacency matrix shows
that the graph has diameter 6.
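These counts can be reproduced by building the graph directly. The sketch below is our own encoding, not the authors' code: a game lists the (row, column) ranks of the four cells, two bimatrices are identified when they differ only by swapping the two rows and/or the two columns, and distances are found by breadth-first search rather than from the adjacency matrix.

```python
from collections import deque
from itertools import permutations

# Cell order: (top-left, top-right, bottom-left, bottom-right).
# The four relabelings: identity, swap rows, swap columns, swap both.
RELABELINGS = [(0, 1, 2, 3), (2, 3, 0, 1), (1, 0, 3, 2), (3, 2, 1, 0)]

def canonical(game):
    """Pick one representative among the four row/column relabelings."""
    return min(tuple(game[i] for i in order) for order in RELABELINGS)

def exchange(game, player, low):
    """Swap ranks low and low + 1 for one player (0 = row, 1 = column)."""
    swap = {low: low + 1, low + 1: low}
    cells = []
    for cell in game:
        cell = list(cell)
        cell[player] = swap.get(cell[player], cell[player])
        cells.append(tuple(cell))
    return canonical(tuple(cells))

GAMES = {
    canonical(tuple(zip(r, c)))
    for r in permutations(range(1, 5))
    for c in permutations(range(1, 5))
}

def neighbors(game):
    """Images under R12, R23, R34, C12, C23, and C34."""
    return {exchange(game, p, low) for p in (0, 1) for low in (1, 2, 3)}

def distances_from(start):
    """Breadth-first search distances from one game to all others."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        g = queue.popleft()
        for h in neighbors(g):
            if h not in dist:
                dist[h] = dist[g] + 1
                queue.append(h)
    return dist

edge_count = sum(len(neighbors(g)) for g in GAMES) // 2
diameter = max(max(distances_from(g).values()) for g in GAMES)
```

Under this encoding the quotient has 144 vertices, each with six distinct neighbors, and the search confirms the edge count and diameter stated above.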
As suggested by the game numbering in Figure 3, the four layers of 36 games are
generated by application of the exchanges R12 , R23 , C12 , C23 . Although presented
as 6 by 6 tables, the operations “wrap around the edges” such that each layer is a
torus. The inclusion of the R34 and C34 exchanges greatly complicates the picture;
Robinson and Goforth determine that the graph has genus 37 [7, §§6.3, 7.2]. Other
structures of interest arise from the application of the exchanges R12 , R34 , C12 , C34
which include 8 or 16 games and are called, respectively, hotspots and pipes. A
hotspot connects sets of four games from two layers, while a pipe connects sets of
four games from all four layers.
Earlier structures on 2 × 2 distinct ordinal rank games were based on game
theoretic characteristics. Rapoport, Guyer, and Gordon developed a system of
phylum, class, order and genus to organize their 78 games [6]. The genera vary in
size from 1 to 13 games, and the system is reminiscent of the phenotypic taxonomy
in biology predating the discovery of DNA.
For the most part, practitioners have proceeded as if “each game is an island,
studied independently.” Robinson and Goforth claim instead, “There is no island
of California and there is no island of Prisoner’s Dilemma.” [2, p.603]. They spend
considerable time on the consistency of various game theoretic characteristics with
their more uniform system [7]. The notion of distance between games is helpful in
practice since many times close games share certain characteristics.
For instance, each of the 24 type A games has 16 games distance 2 from it,
32 games distance 3 away, 43 games 4 away, 34 games 5 away, and 12 games 6
away. Weighting the number of games by their distance gives an average distance
of 548/144 to other games. Each 7-tuple of neighborhood sizes sums to 144. We
can elaborate on the graph diameter statistic to note that each game is distance 6
from 6, 8, 9, or 12 other games.
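The arithmetic for the type A profile is easy to verify. The sketch below takes the neighborhood sizes from the text, adding the game itself at distance 0 and its six neighbors at distance 1 (from 6-regularity); the variable names are ours.

```python
# Number of games at each distance from a type A game.
TYPE_A_PROFILE = {0: 1, 1: 6, 2: 16, 3: 32, 4: 43, 5: 34, 6: 12}

total_games = sum(TYPE_A_PROFILE.values())
weighted_distance = sum(d * count for d, count in TYPE_A_PROFILE.items())
average_distance = weighted_distance / total_games
```

The sizes sum to 144 and the weighted total is 548, so the average quoted above is 548/144, a little over 3.8.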
There is additional structure within these five types. For instance, each type
A game has four type B and 2 type C games in its first neighborhood, two type A
games and 12 type D games in its second neighborhood, etc. This sort of consistency
appears for all types at every distance.
These five types of games fit well with the Robinson-Goforth structures hotspots
and pipes. In particular, each hotspot consists of four type A and four type B games.
The higher average distance for these games is consistent with the fact that each
hotspot connects just two of the four layers. Each pipe consists of 4 type C, 8 type
D, and 4 type E games. The lower average distance is consistent with the fact that each pipe
connects games in all four layers. The consistency of these graph theoretic game
types with hotspots and pipes suggests that these two structures (as opposed to
layers) merit additional attention.
EXPANDING THE ROBINSON-GOFORTH SYSTEM 181
where ∼ denotes ordinal equivalence. However, it is helpful to use (1.5, 1.5) for
the case α = 1/2, the tie halfway along the exchange. The discrete alternative is to
consider just these half-exchanges, restricting to α = 1/2, which we will denote R12 ,
etc.
y-coordinate 4 does not change, as it is not affected by S12 . Similarly, the point
(4, 1) transitions through (4, 1.5) before reaching (4, 2). The point (2, 2) transitions
through (1.5, 1.5) before reaching (1, 1), while the point (3, 3) is unchanged by S12 .
This S12 game helps one understand the change between the very different Nash
equilibria characteristics of the two well-known games. The point (2 − α, 2 − α)
is a Nash equilibrium for α ≤ 1/2, a range that includes Prisoners’ Dilemma and
the borderline game. The points (1 + α, 4) and (4, 1 + α) are Nash equilibria for
α ≥ 1/2, a range that includes Chicken and the inbetween game. In this S12 game,
there are three Nash equilibria. At this value α = 1/2, both players are indifferent
between certain choices, corresponding to horizontal and vertical line segments in
the diagram. Thinking of α moving from 0 to 1, the inbetween game includes the
Pareto non-optimal Nash equilibrium “on its way out” and the two Nash equilibria
from Chicken “just starting.”
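The transition between the two games can be traced numerically. The following is a minimal sketch, assuming the bimatrix layout with (3, 3) in the upper left, (1, 4) in the upper right, (4, 1) in the lower left, and (2, 2) in the lower right, and moving the three affected points linearly in α as described above; the function names are hypothetical.

```python
from fractions import Fraction

def s12_game(alpha):
    """Payoffs along the continuous exchange from Prisoners' Dilemma
    (alpha = 0) to Chicken (alpha = 1); the cell layout is an assumption."""
    return {("T", "L"): (3, 3), ("T", "R"): (1 + alpha, 4),
            ("B", "L"): (4, 1 + alpha), ("B", "R"): (2 - alpha, 2 - alpha)}

def pure_nash(game):
    """Cells where neither player gains by a unilateral switch."""
    flip = {"T": "B", "B": "T", "L": "R", "R": "L"}
    return [(r, c) for (r, c) in game
            if game[(r, c)][0] >= game[(flip[r], c)][0]
            and game[(r, c)][1] >= game[(r, flip[c])][1]]

for alpha in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)):
    print(alpha, pure_nash(s12_game(alpha)))
```

Under these assumptions the lower-right cell is the only pure equilibrium for α < 1/2, the two off-diagonal cells are the equilibria for α > 1/2, and all three appear at α = 1/2.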
To include all symmetric games in a single structure, Robinson and Goforth
place these games on a polyhedron, a “winged octahedron” with two flaps attached to
antipodal edges; see Figure 5. Each of 12 triangular faces corresponds to a game
with distinct rankings, and adjacent games share an edge. Each of the (12·3)/2 = 18
edges corresponds to a game where each player has one tie. Thus each distinct
rank game is adjacent to three games with a single tie for each player. The games
with two ties for each player correspond to the vertices; for instance, S12 ◦ S23
leads to rankings (1, 1, 1, 2) for both players (which might be considered (2, 2, 2, 4)
instead to preserve the rank sum) and this corresponds to the vertex where the
lines corresponding to S12 and S23 intersect. Each distinct rank game is adjacent
to three games with two ties for each player, matching the number of ways to choose
two out of three half-exchanges. Each vertex is incident to four or six faces (the
six face cases occur where the flaps attach to the octahedron). While the geometric
figure has eight vertices, the two wingtips are identified, making that single vertex
adjacent to both pairs of games on the flaps. See [2] for more details.
It is reasonable to have a three dimensional model for symmetric games. To
build a symmetric game bimatrix, there are three degrees of freedom: choose one
[Figure: diagram with labels Z, Y, Z1, Ch, Y1, Y2, Z2, PD]
rank for one matching possibility (say the 2’s in the normal form of Prisoners’
Dilemma, shown on the upper right of Figure 2), choose one remaining rank for the
other matching possibility (the 3’s in PD), and then the order of the two other ranks
in a remaining possibility (the (1, 4) in the upper right) — the game is completely
determined at this point since the remaining possibility is forced.
3.2. All games in five dimensions. To consider asymmetric games with
ties, we return to the six operations which exchange two subsequent ranks for a
single player. Figure 5 shows the nine games arising from R12 (applied across rows,
where x-values change) and C12 (columns, y-values) on Prisoners’ Dilemma and
Chicken. To describe Figure 5 more succinctly, we abbreviate Prisoners’ Dilemma
by PD and Chicken by Ch. The four corner games have distinct rankings for
both players and their Robinson-Goforth numbering is given. The horizontal bar
includes the C12 games where the column player is indifferent between the two
lowest options. The vertical bar includes the analogous games for R12 . The game
in the middle has two ties, one per player, and can be described as (R12 ◦ C12 )(PD)
or (R12 ◦ C12 )(Ch). Figure 4 consists of the lower left, middle, and upper right
games of Figure 5.
The two operations used in Figure 5 commute, i.e., (R12 ◦ C12 )(PD) = Ch =
(C12 ◦ R12 )(PD). Of the $\binom{6}{2} = 15$ possible pairs of operators, eleven commute: nine
of the form Rwx , Cyz , and also R12 , R34 and C12 , C34 . Each such pair will generate
nine games as in Figure 5; compare this to the degree 4 vertices in the “winged
octahedron” described in §3.1. In the remaining four pairs, the two operations
satisfy a braid relation, e.g., R12 ◦ R23 ◦ R12 = R23 ◦ R12 ◦ R23 . Compare the degree
6 vertices in §3.1.
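The count of commuting and braided pairs can be confirmed by brute force. This is a sketch under stated assumptions: each full exchange is modeled as a swap of two rank values for one player, a game is a pair of rank vectors over the four outcomes, and all 24 · 24 labeled strict games are used without any symmetry reduction. The helper names are hypothetical.

```python
from itertools import combinations, permutations

def exchange(player, a, b):
    """Map that swaps ranks a and b for one player (0 = row, 1 = column)."""
    def apply(game):
        ranks = list(game)
        ranks[player] = tuple(b if r == a else a if r == b else r
                              for r in game[player])
        return tuple(ranks)
    return apply

# The six exchanges R12, R23, R34, C12, C23, C34.
ops = {f"{'RC'[p]}{a}{a + 1}": exchange(p, a, a + 1)
       for p in (0, 1) for a in (1, 2, 3)}

# All labeled strict games: one rank vector per player over four outcomes.
games = [(r, c) for r in permutations((1, 2, 3, 4))
         for c in permutations((1, 2, 3, 4))]

def commute(f, g):
    return all(f(g(x)) == g(f(x)) for x in games)

def braid(f, g):
    return all(f(g(f(x))) == g(f(g(x))) for x in games)

pairs = list(combinations(ops, 2))
commuting = [(s, t) for s, t in pairs if commute(ops[s], ops[t])]
braided = [(s, t) for s, t in pairs
           if not commute(ops[s], ops[t]) and braid(ops[s], ops[t])]

print(len(pairs), len(commuting), len(braided))   # 15 11 4
print(braided)
```

The four braided pairs are exactly the pairs of exchanges for the same player that share a rank.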
To include all games in a single structure, one might begin by embedding the
Robinson-Goforth graph on a surface and creating a map by expanding the vertices to
regions; see [4, 8] for work in this direction. Edges between vertices become border
edges between regions, and there are now vertices available to extend the model.
This is suggested by Figure 5, where the gray edges are associated with games
where one player has one tie in ranks, and the intersection of the gray edges is
associated to a game with two ties. Since each distinct rank game has six neighbors,
the regions are hexagons. The graph has 432 edges, which exactly accommodates
the games with a single tie (all enumeration results for games with ties match [1]
when adjusted to not allow the player reversal symmetry). However, there are not
enough vertices for games with two ties: A distinct rank game should be adjacent
to 15 games with two ties corresponding to possible pairs of the six exchanges, but
a hexagon is bounded by six vertices. Moreover, there are no additional structures
in the map to handle games with more than two ties.
Upon reflection, a three dimensional model does not have enough degrees of
freedom for this goal. To build an asymmetric game bimatrix, place the row player’s
lowest rank possibility in the upper left, then there are two choices for where to
place other ranks before the last is forced. There are three choices to be made for
placing the column player’s ranks before the last is forced. This suggests a total of
five degrees of freedom.
A mathematical candidate that accommodates all ordinal rank 2 × 2 games
with all appropriate adjacencies is a collection of 5-dimensional simplices. Each
This game G with three ties corresponds to a triangle in the geometric model; to
which 5-simplices is it incident? Consider starting from a distinct rank game to
which ties are made to arrive at G. Since the row player ranks in G are (1, 1, 1, 2)
(or (2, 2, 2, 4)), the operations R12 and R23 must be applied in some order to the
distinct rank game, leaving a single higher rank for the row player in G. Similarly,
since the column player ranks are (1, 1, 2, 3), the operation C12 must be applied.
To find a distinct rank game that reduces to G, we “untie” the ties in any order,
such as given in Table 3. (Notice that both ordinal sum representations, on the
top row, and constant rank sum representations, on the bottom row, have a similar
disadvantage: The change in values from breaking a tie can change the values of
other options. E.g., in the top row, the reverse image of R23 takes the row player
values from (1, 1, 1, 2) to (1, 1, 2, 3), changing the value of the most preferred object
from 2 to 3, even though it was not part of the tie. Similarly, in the bottom row,
the values for the row player change from (2, 2, 2, 4) to (1.5, 1.5, 3, 4).)
Thus the game at the right hand side of Table 3, game 121 in the Robinson-
Goforth numbering and the lower right game of Figure 5, reduces to G under
R12 , R23 , and C12 . To determine the other distinct rank games that reduce to G
under these three half-exchanges, apply the corresponding maps R12 , R23 , C12 in all
possible combinations to game 121. This orbit gives the twelve games 1b1 and 1b2
for 1 ≤ b ≤ 6, two columns of Robinson and Goforth’s layer 1. One can interpret
this as follows: Exactly these twelve games among the distinct rank games share
certain characteristics with the game G.
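The size of such an orbit can be reproduced with a short search. The sketch below assumes, as an illustration only, that exchanges act by swapping rank values for one player; the starting game is a hypothetical strict game rather than game 121 itself, since the orbit size does not depend on the starting point under this model.

```python
def exchange(player, a, b):
    """Map that swaps ranks a and b for one player (0 = row, 1 = column)."""
    def apply(game):
        ranks = list(game)
        ranks[player] = tuple(b if r == a else a if r == b else r
                              for r in game[player])
        return tuple(ranks)
    return apply

def orbit(start, maps):
    """All games reachable from start by applying the maps in any order."""
    seen = {start}
    frontier = [start]
    while frontier:
        game = frontier.pop()
        for f in maps:
            image = f(game)
            if image not in seen:
                seen.add(image)
                frontier.append(image)
    return seen

# Hypothetical strict game: (row ranks, column ranks) over the four outcomes.
start = ((2, 4, 1, 3), (2, 1, 4, 3))
maps = [exchange(0, 1, 2), exchange(0, 2, 3), exchange(1, 1, 2)]  # R12, R23, C12

reachable = orbit(start, maps)
print(len(reachable))   # 12
```

The count 12 arises as 6 arrangements of the row player's three lowest ranks times 2 arrangements of the column player's two lowest ranks.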
Similar reasoning determines the number of games, partitioned by ties, that
either reduce to or reduce from G, shown in Table 4. Geometrically, this means
the triangle corresponding to G is incident to 12 5-simplices, 18 4-simplices, 8
tetrahedra, and, as one would hope, contains 3 edges and 3 vertices.
An initial step to providing game theoretic structure for these gradual transfor-
mations is the refinement of exchanges to half-exchanges or continuous exchanges.
The full array of 2 × 2 ordinal rank games, symmetric and asymmetric, with and
without ties, is then connected. The simplicial model proposed here could provide
a structure for better understanding these connections.
Acknowledgments
David Goforth and David Robinson organized a 2 × 2 games working group at
the Canadian Economics Association 2011 meeting, where an early version of this
work was presented [4]. Saint Peter’s University and its Honors Program supported
the undergraduate thesis work of Sarah Heilig [3]. Tomoko Matsumoto of the New
York University Masters of Politics program suggested [9]. Michael Jones and
several others have coordinated special sessions at the Joint Mathematics
Meetings, including in 2012, where a more recent version of this work was presented.
References
[1] Niall M. Fraser and D. Marc Kilgour. Non-strict ordinal 2 × 2 games: A comprehensive
computer-assisted analysis of the 726 possibilities, Theory and Decision 20 (1986), no. 2,
99–121, DOI 10.1007/BF00135087. MR838066 (87e:90142)
[2] David Goforth and David Robinson, Effective choice in all the symmetric 2×2 games, Synthese
187 (2012), no. 2, 579–605, DOI 10.1007/s11229-010-9862-8. MR2959827
[3] Sarah Heilig. When prisoners enter battle: Natural connections in 2 × 2 symmetric
games, Saint Peter’s College Honors Thesis, advisor Brian Hopkins, 2011, available at
http://librarydb.saintpeters.edu:8080/jspui/handle/123456789/14.
[4] Brian Hopkins. Between neighboring strict ordinal games, white paper from the 2011 Canadian
Economics Association meeting, available at http://economics.ca/2011/papers/HB0020-1.pdf.
[5] D. Marc Kilgour and Niall M. Fraser. A taxonomy of all ordinal 2 × 2 games, Theory and
Decision 24 (1988), no. 2, 99–117, DOI 10.1007/BF00132457. MR931042 (89f:90199)
[6] Anatol Rapoport, Melvin Guyer, and David Gordon. The 2 × 2 Game, University of Michigan,
Ann Arbor MI, 1976.
[7] David Robinson and David Goforth, The Topology of the 2 × 2 Games: A New Periodic Table,
Routledge Advances in Game Theory, Routledge/Taylor & Francis Group, New York, 2005.
MR2547198 (2011c:91003)
[8] David Robinson, David Goforth, and Matt Cargill. Toward a topological treatment of the
non-strictly ordered 2 × 2 games, 2007 working paper, available at
http://142.51.79.168/NR/rdonlyres/71D1CD44-AADC-4A21-94A8-E8DC543AC7B1/0/Nonstrict.pdf.
[9] Wolfgang Streek and Kathleen Thelen. Beyond Continuity: Institutional Change in Advanced
Political Economies, Oxford University, 2003.
1. Introduction
When modeling events from the social sciences, it could be that no player wants
to suffer the consequences of a specified Nash equilibrium. An illustration would be
competing countries where the strategic structure leads to the dominant strategy
outcome of a war. Fortunately, as is well known, there are equilibria in a
repeated setting that differ from the single-shot Nash equilibria. This fact
underscores an advantage of repeated games (i.e., games involving repeated interactions)
should settings exist where an appropriate strategy can lead to cooperation.
A natural illustration is the following Prisoner’s Dilemma game G1 ,
\[
\mathcal{G}_1 = \begin{array}{|c|c|} \hline 4,\ 4 & -2,\ 6 \\ \hline 6,\ -2 & 0,\ 0 \\ \hline \end{array} \tag{1}
\]
Although the single-shot Nash strategy is the limited Bottom-Right (BR), it is
possible to enjoy the Pareto superior outcome in the Top-Left (TL) corner if TL
becomes an equilibrium when the game is repeated infinitely often. But TL is
not a Nash equilibrium of a single-shot play, so a theoretical concern is to identify
what aspects of the game permit cooperation. A theme initiated here is to use a
recently developed decomposition of games (Jessie and Saari [3]) to identify the
precise source of a game’s cooperative structure, which arises only with repeated
©2014 American Mathematical Society
190 DANIEL T. JESSIE AND DONALD G. SAARI
There are, of course, many ways to decompose games. (For a description along
with references, see Jones [4].) While discussing these interesting approaches would
detract from the purpose of our paper, some comments are appropriate. For in-
stance, a commonly used choice is to decompose a game into its cooperative and
zero-sum components such as
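\[
\begin{array}{|c|c|} \hline 6,\ 6 & 0,\ 4 \\ \hline 4,\ -4 & 2,\ 0 \\ \hline \end{array}
=
\begin{array}{|c|c|} \hline 6,\ 6 & 2,\ 2 \\ \hline 0,\ 0 & 1,\ 1 \\ \hline \end{array}
+
\begin{array}{|c|c|} \hline 0,\ 0 & -2,\ 2 \\ \hline 4,\ -4 & 1,\ -1 \\ \hline \end{array} \tag{2}
\]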
This zero-sum decomposition is valuable for several purposes (e.g., bargaining solu-
tions), but it (as with other decompositions) can radically alter the game’s strategic
structure. This reality is illustrated with Eq. 2, where the game on the left has a
mixed and two pure (TL, BR) Nash strategies, but the zero sum component (second
bimatrix after the equal sign) has the dominant BR strategy. An important feature
of our decomposition is that it retains the original game’s strategic structure.
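The change in strategic structure under the Eq. 2 split can be checked directly. The sketch below assumes the usual construction in which the common-interest part gives both players the average of their two payoffs and the zero-sum part holds the remainder; this reproduces the entries of Eq. 2 but is stated here as an assumption, and the helper names are hypothetical.

```python
# Bimatrix on the left of Eq. 2; each cell is (row payoff, column payoff).
game = {("T", "L"): (6, 6), ("T", "R"): (0, 4),
        ("B", "L"): (4, -4), ("B", "R"): (2, 0)}

# Assumed construction: shared average plus a zero-sum remainder.
coop = {cell: ((a + b) / 2, (a + b) / 2) for cell, (a, b) in game.items()}
zero_sum = {cell: ((a - b) / 2, (b - a) / 2) for cell, (a, b) in game.items()}

def pure_nash(g):
    """Cells where neither player gains by a unilateral switch."""
    flip = {"T": "B", "B": "T", "L": "R", "R": "L"}
    return [(r, c) for (r, c) in g
            if g[(r, c)][0] >= g[(flip[r], c)][0]
            and g[(r, c)][1] >= g[(r, flip[c])][1]]

print(pure_nash(game))       # [('T', 'L'), ('B', 'R')]
print(pure_nash(zero_sum))   # [('B', 'R')]
```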
An advantage our decomposition has over a point-wise analysis is that it em-
phasizes the structure of the space of games (denoted by G). In this way, it becomes
possible to identify classes of games with desired properties rather than relying on
specific choices (a common approach) to carry out an analysis. In this way we
can determine whether obtained conclusions reflect properties of only specifically
selected games, or whether they identify a more general behavior. What under-
scores the importance of this concern is that this global approach also highlights
the potential sensitivity of results to small perturbations of a game. For instance,
we show below with an example that conclusions derived from a particular choice
of payoffs need not hold for other games even if their payoff entries are “close”.
Other advantages of the decomposition include demonstrating how non-
strategic information can affect conclusions in a repeated setting and examining
differences between the repeated and single-shot equilibria. This relationship be-
tween strategic and non-strategic information underscores the role of the discount
factor δ; as demonstrated, the δ discount factor interacts with changes in what
we call the game’s behavioral component. Finally, as this decomposition permits
comparing strategies, it can be used to identify all games for which a particular
strategy will “out-perform” another strategy.
1.1. Decomposition. While details, proofs, and the motivation for our de-
composition are in [3], a brief introduction is given here with the Eq. 1 game G1 .
The goal is to divide the game into the component G1N that contains all informa-
tion needed to determine all possible Nash strategic behavior, component G1B that,
from a behavioral perspective, contains no Nash information but can change the
dynamics of the game in other ways, and a kernel term G1K that merely modifies
entries. The important fact is that this decomposition is unique and can be used
with all games.
To find G N from a given game G, consider the two matrix entries left to a player
after specifying a pure strategy for each of the other players. With G1 , for example,
if L is specified for Player 2, then the two entries for Player 1 are “4” by playing
T and “6” by playing B. Replace each entry by how it differs from their average of
(4 + 6)/2 = 5. That is, replace the 4 with 4 − 5 = −1 and 6 with 6 − 5 = 1. Doing
this for all players and options leads to
\[
\mathcal{G}_1^N = \begin{array}{|c|c|} \hline -1,\ -1 & -1,\ 1 \\ \hline 1,\ -1 & 1,\ 1 \\ \hline \end{array}
\]
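The kernel component G1K replaces each G1 entry for a player by the average of
all of the player’s G1 entries. For G1 , each player’s average is (4 − 2 + 0 + 6)/4 = 2.
The behavioral component is G1B = G1A − G1K , where G1A = G1 − G1N holds the
averages that were subtracted to form G1N . Thus, G1 = G1N + G1B + G1K where
\[
\mathcal{G}_1^B = \begin{array}{|c|c|} \hline 3,\ 3 & -3,\ 3 \\ \hline 3,\ -3 & -3,\ -3 \\ \hline \end{array},
\qquad
\mathcal{G}_1^K = \begin{array}{|c|c|} \hline 2,\ 2 & 2,\ 2 \\ \hline 2,\ 2 & 2,\ 2 \\ \hline \end{array} \tag{3}
\]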
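The three components of G1 can be recomputed mechanically from the description above. The following is a minimal sketch, not the authors' implementation: each player's payoff table is handled separately, and the function name and argument names are hypothetical.

```python
from fractions import Fraction

def decompose_player(table, controls_rows):
    """Split one player's 2x2 payoff table into (N, B, K).

    table[r][c] is the player's payoff in row r, column c; controls_rows
    says whether this player chooses the row (Player 1) or the column
    (Player 2).
    """
    kappa = Fraction(sum(map(sum, table)), 4)
    N = [[None, None], [None, None]]
    B = [[None, None], [None, None]]
    for r in range(2):
        for c in range(2):
            # The player's other entry when the opponent's choice is fixed.
            alt = table[1 - r][c] if controls_rows else table[r][1 - c]
            avg = Fraction(table[r][c] + alt, 2)
            N[r][c] = table[r][c] - avg   # deviation from the pairwise average
            B[r][c] = avg - kappa         # pairwise average minus overall average
    K = [[kappa, kappa], [kappa, kappa]]
    return N, B, K

# Eq. 1 game split by player; rows are T, B and columns are L, R.
row_player = [[4, -2], [6, 0]]
col_player = [[4, 6], [-2, 0]]

N1, B1, K1 = decompose_player(row_player, controls_rows=True)
N2, B2, K2 = decompose_player(col_player, controls_rows=False)
print(N1, B1)
print(N2, B2)
```

Reading the two players' tables cell by cell recovers the bimatrices for the Nash component and for Eq. 3.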
Notice that G1B and G1K contain no Nash information of any kind. This is because
the rows in each matrix are the same for the row player, the columns are identical
for the column player, so individual actions taken by a player cannot affect the
player’s G B payoff; selection of G B payoffs requires cooperative behavior. In turn,
this G A property means that G1N contains all of G1 ’s Nash information and G1B
contains all other aspects about G1 that affect behavior.
A theme developed below reflects the need to analyze disagreements between
the strategic component G N (where the Nash outcomes are based on individual
actions) and the behavioral G B component (where outcomes require cooperative
behavior). The G1 Prisoner’s Dilemma captures this conflict; the Pareto superior
entry of the behavioral G1B , which is TL, coincides with the location of the Pareto
inferior term for the strategic G1N . This disagreement imposes a stress between the
game’s cooperative and strategic interests.
To appreciate the effect of the mismatch, recall that individual strategic actions
affect only G1N outcomes; they cannot influence or attain the G1B desired outcome.
Thus a cooperative action (i.e., to ensure TL) must counter individual strategic
actions to extract the larger G1B outcome (which generates the larger G1 outcome).
More generally and for any game G, all collective and cooperative actions admitted
by G are strictly based on the G B structure and how it compares with G N . As such,
it is the G B component that plays a central role in addressing our described themes
of cooperation beyond strategic behavior.
The decomposition for a three player game, where each has two strategies, is
the same. To illustrate, consider
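\[
\mathcal{G}:\quad
\text{Front} = \begin{array}{|c|c|} \hline 0,\ 3,\ 5 & 4,\ 1,\ 2 \\ \hline 2,\ 2,\ 1 & 0,\ 4,\ 3 \\ \hline \end{array},
\qquad
\text{Back} = \begin{array}{|c|c|} \hline 3,\ 2,\ 1 & 4,\ -2,\ 2 \\ \hline 1,\ 4,\ 5 & 8,\ 6,\ 7 \\ \hline \end{array} \tag{4}
\]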
The “Front” and “Back” labels correspond to Player 3’s strategy choices. Al-
though not labeled, “Top/Bottom” and “Left/Right” refer to Player 1’s and Player
2’s strategies, respectively. So the payoffs for the three players should they play
Bottom-Left-Back, or BLBa, are 1, 4, 5, respectively.
To find G N , select a strategy for two of the players, and then replace the third
player’s identified entries by how they differ from their average. If T for Player 1
and L for Player 2 are specified, for instance, then Player 3’s two entries are 5 from
“Front” and 1 from “Back.” Replace each term with how it differs from the average
of 3. Doing so for all strategies and players, it follows that G N is
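\[
\text{Front} = \begin{array}{|c|c|} \hline -1,\ 1,\ 2 & 2,\ -1,\ 0 \\ \hline 1,\ -1,\ -2 & -2,\ 1,\ -2 \\ \hline \end{array},
\qquad
\text{Back} = \begin{array}{|c|c|} \hline 1,\ 2,\ -2 & -2,\ -2,\ 0 \\ \hline -1,\ -1,\ 2 & 2,\ 1,\ 2 \\ \hline \end{array}
\]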
COOPERATION IN n-PLAYER REPEATED GAMES 193
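replacing the j-th player’s G^A entry by how it differs from κ_j. So the Front matrix
for G^B is
\[
\text{Front} = \begin{array}{|c|c|} \hline
1-\tfrac{11}{4},\ 2-\tfrac{5}{2},\ 3-\tfrac{13}{4} & 2-\tfrac{11}{4},\ 2-\tfrac{5}{2},\ 2-\tfrac{13}{4} \\ \hline
1-\tfrac{11}{4},\ 3-\tfrac{5}{2},\ 3-\tfrac{13}{4} & 2-\tfrac{11}{4},\ 3-\tfrac{5}{2},\ 5-\tfrac{13}{4} \\ \hline
\end{array}
\]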
In general and by construction,
\[
\mathcal{G} = \mathcal{G}^N + \mathcal{G}^B + \mathcal{G}^K. \tag{5}
\]
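For the three-player game of Eq. 4 the same bookkeeping can be automated. The sketch below is an illustration under an assumed encoding: a strategy profile is a tuple of 0/1 choices (0 for Top, Left, Front and 1 for Bottom, Right, Back), and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def decompose(game, n):
    """game maps each profile (tuple of 0/1 choices) to a tuple of n payoffs."""
    profiles = list(product((0, 1), repeat=n))
    kappa = [Fraction(sum(game[s][j] for s in profiles), len(profiles))
             for j in range(n)]
    N, B = {}, {}
    for s in profiles:
        n_cell, b_cell = [], []
        for j in range(n):
            other = s[:j] + (1 - s[j],) + s[j + 1:]   # player j switches alone
            avg = Fraction(game[s][j] + game[other][j], 2)
            n_cell.append(game[s][j] - avg)
            b_cell.append(avg - kappa[j])
        N[s], B[s] = tuple(n_cell), tuple(b_cell)
    return N, B, kappa

# Eq. 4 game; a profile is (Player 1, Player 2, Player 3).
G = {(0, 0, 0): (0, 3, 5), (0, 1, 0): (4, 1, 2),
     (1, 0, 0): (2, 2, 1), (1, 1, 0): (0, 4, 3),
     (0, 0, 1): (3, 2, 1), (0, 1, 1): (4, -2, 2),
     (1, 0, 1): (1, 4, 5), (1, 1, 1): (8, 6, 7)}

N, B, kappa = decompose(G, 3)
print([str(k) for k in kappa])          # ['11/4', '5/2', '13/4']
print([str(x) for x in N[(0, 0, 0)]])   # ['-1', '1', '2']
```

Adding the three parts cell by cell returns the original payoffs, which is the content of Eq. 5.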
The general form of each of these matrices is given next where the second
subscript refers to the player. Also, Σ_i βi,j = 0 for j = 1, 2, 3.
After all other players select their strategies, the remaining player, say the jth, has
two options with payoffs −|ηi,j | and |ηi,j |. To be a Nash equilibrium, this player
must select the larger |ηi,j | choice. This leads to the first Thm. 1.1 statement, where
the “strict” modifier just eliminates ηi,j = 0 terms.
Theorem 1.1. For games G involving n ≥ 2 players where each has two strategies,
the following are true:
(1) For game G, a strict pure Nash equilibrium occurs if and only if all of the
entries in the identified G N cell are positive.
(2) A game G can be constructed to have k pure Nash equilibria where k is
any integer satisfying 0 ≤ k ≤ 2^{n−1}.
(3) Mixed strategies are completely determined by the n·2^{n−1} variables {ηi,j }
that define G N .
The third statement follows immediately from the structure of the G B and G K
terms; neither provide any strategic opportunity for any player. To illustrate with
a 2 × 2 game
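\[
\begin{array}{|c|c|} \hline 5,\ 5 & 1,\ 1 \\ \hline 3,\ 0 & 3,\ 2 \\ \hline \end{array}
=
\begin{array}{|c|c|} \hline 1,\ 2 & -1,\ -2 \\ \hline -1,\ -1 & 1,\ 1 \\ \hline \end{array}
+
\begin{array}{|c|c|} \hline 1,\ 1 & -1,\ 1 \\ \hline 1,\ -1 & -1,\ -1 \\ \hline \end{array}
+
\begin{array}{|c|c|} \hline 3,\ 2 & 3,\ 2 \\ \hline 3,\ 2 & 3,\ 2 \\ \hline \end{array}, \tag{7}
\]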
if the column player plays L-R, respectively, with probability q and (1 − q), then
the difference between the expected values for the row player of playing T and B
for the G N part of the game (the first bimatrix to the right of the equal sign) is
(8) E(T ) − E(B) = [q − (1 − q)] − [−q + (1 − q)] = 2q − 2(1 − q).
If the E(T ) − E(B) value is positive, the player should play T , if it is negative, the
player should play B. In particular, this expression leads to the column player’s
mixed strategy of q = 1/2.
Now consider what happens with the original Eq. 7 game (the game on the
left); this difference in E(T ) − E(B) expected values is
\[
\begin{aligned}
E(T) - E(B) &= [5q + (1 - q)] - [3q + 3(1 - q)] \\
&= 2q - 2(1 - q) + [(4 - 4)q + (2 - 2)(1 - q)] = 2q - 2(1 - q).
\end{aligned} \tag{9}
\]
The fact that Eqs. 8 and 9 result in the same E(T ) − E(B) = 2q − 2(1 − q) equation
is not a coincidence. The reason is demonstrated with the bracketed term in the
penultimate equality of Eq. 9; it consists of terms from the G B and G K components.
By construction, these components always have the same value (for the row player)
in both rows of the column played with probability q, and likewise in both rows of
the column played with probability 1 − q (see the above definitions for G B and
G K ), which means that these terms in the bracket must always cancel. What
remains in Eq. 9 are the ηi,j values that define Eq. 8.
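The agreement between Eqs. 8 and 9 can be confirmed for several values of q. This is a small sketch using only the row player's entries from Eq. 7; the function name is hypothetical.

```python
from fractions import Fraction

def payoff_gap(row_payoffs, q):
    """E(T) - E(B) for the row player when the column player plays L
    with probability q and R with probability 1 - q."""
    (t_l, t_r), (b_l, b_r) = row_payoffs
    return (t_l * q + t_r * (1 - q)) - (b_l * q + b_r * (1 - q))

full_game = [[5, 1], [3, 3]]      # row player's entries in the Eq. 7 game
nash_part = [[1, -1], [-1, 1]]    # row player's entries in its Nash component

for q in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    # Both gaps agree with 2q - 2(1 - q) at every q tested.
    assert payoff_gap(full_game, q) == payoff_gap(nash_part, q) == 2 * q - 2 * (1 - q)

print(payoff_gap(full_game, Fraction(1, 2)))   # 0
```

The gap vanishes at q = 1/2, which is the indifference point for the row player.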
The same structure holds for any number of players. With three players and
Eq. 6 where the column player plays L with probability q and the third player plays
F with probability r, the qr coefficient in the row player’s difference between the
expected values of playing T and B is
[η1,1 + β1,1 ] − [−η1,1 + β1,1 ] = 2η1,1
showing again (as asserted after Eq. 3) that the G B and G K components play no role
in determining Nash strategies. This leads to the third assertion of the theorem.
Before outlining a proof for the second statement, it is worth indicating how we
will use this construction. The outlined proof for the second assertion makes it easy
to create a G N with any desired Nash structure. This permits creating a wide class
of games where each game has the same Nash structure, but different games exhibit
different cooperative behavior in repeated settings. To do so, design a desired G N
component that will be common for all of the games. Next, select appropriate G B
components, and add them (by using Eq. 5) to obtain G = G N + G B . (This is how
we created the Eq. 10 games in the next section.)
To illustrate, it can be difficult to create a class of four-player games where
each game has the same 2^{4−1} = 8 pure Nash equilibria, but different games have
different cooperative behavior in a repeated setting. With the description given
next, the G N component is easily created; what remains is to find appropriate G B
components.
The second statement can be proved with an obvious induction argument. To
illustrate, we will construct a four-player G N that has 2^{4−1} = 8 pure Nash equilibria;
as will become clear, these equilibria must have a diametric positioning in G N .
To start, TL is a Nash equilibrium for a two player G N game only if its two
entries are positive. According to the decomposition, this choice requires a negative
entry in the row player’s BL cell and the column player’s TR cell. With these
negative values, BR is the only possible cell that also could be a Nash equilibrium.
To make BR an equilibrium, select its two entries to be positive. In turn, this choice
determines all remaining entries of this G N .
To build a three player G N game, start with the above TL and BR structure
of the two person game in the Front set of matrices. For TLF and BRF to be
strict Nash equilibria, it is necessary to assign a positive entry for player 3 in
these G N cells. This choice and the structure of ηi,3 properties require Player 3 to
have a negative entry in TLBa and BRBa; the entries for Players 1 and 2 for this
Back choice are free to be selected. The only way to obtain two more equilibria
(which must be in the Back matrix) is to place positive entries for all players in
the remaining two G N Back cells (that is, TRBa and BLBa). Once these choices
are made, the entries for all cells are determined. Only the selected ones have all
positive entries, so the four equilibria are located at TLF, BRF, BLBa, and TRBa.
A four player game is given by two sets, I and II, of “Front-Back” matrices.
Namely, it consists of two sets of matrices with the Eq. 4 form except that each
cell lists the payoffs for each of the four players. To have four equilibria in set
I, put a positive value for Player 4 in each of TLFI, BRFI, BLBaI, and TRBaI.
This forces a negative value for Player 4 in each of the TLFII, BRFII, BLBaII,
and TRBaII cells. Select positive ηi,j values for the remaining four set II cells; this
determines the remaining entries and creates a G N with eight pure Nash equilibria.
More generally, moving from (n − 1) to n players doubles the maximum number of
pure Nash equilibria.
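The inductive construction can be mimicked for any n. The sketch below uses one convenient assumption that matches the cells named above: every player receives +1 in cells where an even number of players use their second strategy and −1 in the remaining cells, so each player's paired entries have opposite signs. Profiles are tuples of 0/1 choices and the function names are hypothetical.

```python
from itertools import product

def diametric_nash_component(n):
    """Assumed construction: +1 for every player in even-parity cells,
    -1 for every player in odd-parity cells."""
    return {s: (1 if sum(s) % 2 == 0 else -1,) * n
            for s in product((0, 1), repeat=n)}

def switch(s, j):
    """Profile obtained when player j alone changes strategy."""
    return s[:j] + (1 - s[j],) + s[j + 1:]

def strict_pure_nash(game, n):
    """Profiles where each player strictly loses by switching alone."""
    return [s for s in game
            if all(game[s][j] > game[switch(s, j)][j] for j in range(n))]

for n in (2, 3, 4):
    equilibria = strict_pure_nash(diametric_nash_component(n), n)
    print(n, len(equilibria))   # prints 2 2, then 3 4, then 4 8
```

For n = 3, with profiles ordered as (Player 1, Player 2, Player 3), the four equilibria found are the cells TLF, TRBa, BLBa, and BRF.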
The mixed Nash equilibria for the above four-person game are found in the
standard way with these 32 ηi,j variables. To construct a wide class of games with
this same Nash structure but potentially different equilibria for repeated games,
add appropriately selected G B terms to the constructed G N .
1.3. Three examples. To introduce the types of games, strategies, and un-
usual conclusions that are analyzed in this paper, consider the three Eq. 10 games.
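\[
\mathcal{G}_2:\quad
\text{Front} = \begin{array}{|c|c|} \hline 0,\ 0,\ 0 & 4,\ 4,\ 4 \\ \hline 4,\ 4,\ 5 & 8,\ 8,\ 7 \\ \hline \end{array},
\qquad
\text{Back} = \begin{array}{|c|c|} \hline 5,\ 5,\ 4 & 7,\ 9,\ 8 \\ \hline 9,\ 7,\ 9 & 11,\ 11,\ 11 \\ \hline \end{array}
\]
\[
\mathcal{G}_3:\quad
\text{Front} = \begin{array}{|c|c|} \hline 7,\ 7,\ 7 & 5,\ 11,\ 5 \\ \hline 11,\ 5,\ 4 & 9,\ 9,\ 0 \\ \hline \end{array},
\qquad
\text{Back} = \begin{array}{|c|c|} \hline 4,\ 4,\ 11 & 0,\ 8,\ 9 \\ \hline 8,\ 0,\ 8 & 4,\ 4,\ 4 \\ \hline \end{array} \tag{10}
\]
\[
\mathcal{G}_4:\quad
\text{Front} = \begin{array}{|c|c|} \hline 7,\ 4,\ 5 & 5,\ 8,\ 7 \\ \hline 11,\ 7,\ 4 & 9,\ 11,\ 0 \\ \hline \end{array},
\qquad
\text{Back} = \begin{array}{|c|c|} \hline 4,\ 7,\ 9 & 0,\ 11,\ 11 \\ \hline 8,\ 0,\ 8 & 4,\ 4,\ 4 \\ \hline \end{array}
\]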
To connect these three games with the discussion of the previous section, they were
designed by first selecting the common G N given in Eq. 11, which has the dominant
BRBa strategy. All differences among these three Eq. 10 games, then, completely
reflect the choices for the G B component as described in Eq. 12. Whatever choice is
made for G B , the dominant Nash strategy for each of these games remains BRBa.
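\[
\mathcal{G}_i^N:\quad
\text{Front} = \begin{array}{|c|c|} \hline -2,\ -2,\ -2 & -2,\ 2,\ -2 \\ \hline 2,\ -2,\ -2 & 2,\ 2,\ -2 \\ \hline \end{array},
\qquad
\text{Back} = \begin{array}{|c|c|} \hline -2,\ -2,\ 2 & -2,\ 2,\ 2 \\ \hline 2,\ -2,\ 2 & 2,\ 2,\ 2 \\ \hline \end{array} \tag{11}
\]
\[
\mathcal{G}_2^B:\quad
\text{Front} = \begin{array}{|c|c|} \hline -4,\ -4,\ -4 & 0,\ -4,\ 0 \\ \hline -4,\ 0,\ 1 & 0,\ 0,\ 3 \\ \hline \end{array},
\qquad
\text{Back} = \begin{array}{|c|c|} \hline 1,\ 1,\ -4 & 3,\ 1,\ 0 \\ \hline 1,\ 3,\ 1 & 3,\ 3,\ 3 \\ \hline \end{array} \tag{12}
\]
\[
\mathcal{G}_3^B:\quad
\text{Front} = \begin{array}{|c|c|} \hline 3,\ 3,\ 3 & 1,\ 3,\ 1 \\ \hline 3,\ 1,\ 0 & 1,\ 1,\ -4 \\ \hline \end{array},
\qquad
\text{Back} = \begin{array}{|c|c|} \hline 0,\ 0,\ 3 & -4,\ 0,\ 1 \\ \hline 0,\ -4,\ 0 & -4,\ -4,\ -4 \\ \hline \end{array}
\]
\[
\mathcal{G}_4^B:\quad
\text{Front} = \begin{array}{|c|c|} \hline 3,\ 0,\ 1 & 1,\ 0,\ 3 \\ \hline 3,\ 2,\ 0 & 1,\ 2,\ -4 \\ \hline \end{array},
\qquad
\text{Back} = \begin{array}{|c|c|} \hline 0,\ 2,\ 1 & -4,\ 2,\ 3 \\ \hline 0,\ -4,\ 0 & -4,\ -4,\ -4 \\ \hline \end{array}
\]
\[
\mathcal{G}_i^K:\quad
\text{Front} = \begin{array}{|c|c|} \hline 6,\ 6,\ 6 & 6,\ 6,\ 6 \\ \hline 6,\ 6,\ 6 & 6,\ 6,\ 6 \\ \hline \end{array},
\qquad
\text{Back} = \begin{array}{|c|c|} \hline 6,\ 6,\ 6 & 6,\ 6,\ 6 \\ \hline 6,\ 6,\ 6 & 6,\ 6,\ 6 \\ \hline \end{array} \tag{13}
\]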
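As a consistency check, the games G2 and G3 can be rebuilt by adding the common Nash component of Eq. 11 to the behavioral components of Eq. 12 and the kernel of Eq. 13. The sketch below lists cells in the assumed order TLF, TRF, BLF, BRF, TLBa, TRBa, BLBa, BRBa; the variable and function names are hypothetical.

```python
# Cell order: TLF, TRF, BLF, BRF, TLBa, TRBa, BLBa, BRBa.
N = [(-2, -2, -2), (-2, 2, -2), (2, -2, -2), (2, 2, -2),
     (-2, -2, 2), (-2, 2, 2), (2, -2, 2), (2, 2, 2)]
B2 = [(-4, -4, -4), (0, -4, 0), (-4, 0, 1), (0, 0, 3),
      (1, 1, -4), (3, 1, 0), (1, 3, 1), (3, 3, 3)]
B3 = [(3, 3, 3), (1, 3, 1), (3, 1, 0), (1, 1, -4),
      (0, 0, 3), (-4, 0, 1), (0, -4, 0), (-4, -4, -4)]
KAPPA = 6   # every kernel entry in Eq. 13

def recompose(nash, behavioral, kappa):
    """Add the three components cell by cell and player by player."""
    return [tuple(n + b + kappa for n, b in zip(n_cell, b_cell))
            for n_cell, b_cell in zip(nash, behavioral)]

G2 = recompose(N, B2, KAPPA)
G3 = recompose(N, B3, KAPPA)
print(G2)
print(G3)
```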
A first objective is to determine which states for each game can be sustained
in a repeated setting. For G2 , the answer is immediate: Each player has a domi-
nant strategy as given by BRBa. As BRBa also yields the game’s Pareto superior
outcome, by playing the dominant strategy, each player receives the game’s highest
possible payoff.
Results for G3 are not as obvious. Each player has a dominant BRBa strategy,
but the game’s payoff of “4” for each player is Pareto inferior to the TLF, TRF,
TLBa, and BLF outcomes. It is reasonable to wonder whether any of these four
choices can be sustained should each player adopt, say, a grim-trigger strategy.
What makes this question of particular interest is that the TLF entry, which is
strictly preferred to BRBa, creates a three-person Prisoners’ Dilemma. A natural
question is to determine whether cooperative behavior of playing T, L, and F can be
sustained with grim trigger or tit-for-tat. As we show, if the discount factor satisfies
δ > 4/7, then grim trigger does support TLF. Even stronger, with 4/7 < δ ≤ 4/5, TLF
is the only outcome that grim trigger can sustain. But if δ > 4/5, then TRF can be
sustained as well. The next question is whether larger δ values could support BLF
or TLBa: They cannot; no other state is inducible under grim trigger.
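The two thresholds for G3 can be reproduced with the standard one-shot deviation test, which is used here as an assumption in place of the decomposition-based analysis: a player who receives c under cooperation, d from a single deviation, and p in every period of the BRBa punishment has no incentive to deviate when δ ≥ (d − c)/(d − p). The sketch covers only the cases that arise in G3, and its names are hypothetical.

```python
from fractions import Fraction

# G3 payoffs from Eq. 10; a profile is (Player 1, Player 2, Player 3)
# with 0 = T / L / F and 1 = B / R / Ba.
G3 = {(0, 0, 0): (7, 7, 7), (0, 1, 0): (5, 11, 5),
      (1, 0, 0): (11, 5, 4), (1, 1, 0): (9, 9, 0),
      (0, 0, 1): (4, 4, 11), (0, 1, 1): (0, 8, 9),
      (1, 0, 1): (8, 0, 8), (1, 1, 1): (4, 4, 4)}
PUNISH = (1, 1, 1)   # BRBa, the dominant-strategy outcome

def grim_threshold(game, target):
    """Smallest discount factor at which no player gains from a one-shot
    deviation followed by permanent punishment; None if no delta < 1 works."""
    worst = Fraction(0)
    for j, coop in enumerate(game[target]):
        deviate = game[target[:j] + (1 - target[j],) + target[j + 1:]][j]
        punish = game[PUNISH][j]
        if deviate <= coop:
            continue          # this player is not tempted to deviate
        if coop <= punish:
            return None       # the punishment is no threat to this player
        worst = max(worst, Fraction(deviate - coop, deviate - punish))
    return worst

for name, profile in [("TLF", (0, 0, 0)), ("TRF", (0, 1, 0)),
                      ("BLF", (1, 0, 0)), ("TLBa", (0, 0, 1))]:
    print(name, grim_threshold(G3, profile))
```

Under this test TLF has threshold 4/7, TRF has threshold 4/5, and BLF and TLBa have none, in line with the values stated above.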
Notice how this conclusion identifies an advantage for Player 2 over the other
players. This is because the two sustainable grim trigger options for δ > 4/5 are
TLF and TRF: Only Player 2 can select between them. Consequently, Player 2 can
receive the personally higher TRF payoff at the expense of the other two players.
This result supports one of our themes: Features, properties, and conclu-
sions that are derived from two-person games need not transfer to three-person
games. The particular feature illustrated here is that, with a two-person Prison-
ers’ Dilemma game, if cooperation is ensured for a δ1 value, then cooperation is
ensured for δ ≥ δ1 . But, this property need not transfer to three-person Prisoners’
Dilemma games. Instead, and as demonstrated with G3 , even if a particular δ value
(here, 4/7 < δ ≤ 4/5) can sustain cooperation among all three players (to attain TLF),
this cooperation can disappear with a larger δ value (here δ > 4/5)! This larger δ
value makes it to the second player’s advantage to defect from the cooperative TLF
strategy to obtain the personally preferred outcome from TRF; this defection is at
the expense of the other two players. Of interest is how classes of games with this
feature can be identified by using the decomposition and analysis developed below.
Turning to the last game, while the dominant strategy outcome for G4 is BRBa,
the TLF, BLF, TRF, and TLBa choices have Pareto preferred outcomes (where
TLF again generates a three-person Prisoners’ Dilemma). A reason G4 appears to
be the most complicated game is that it does not have a clear optimal state. Indeed,
this complexity is reflected by the number of calculations that typically need to be
performed in a standard analysis. However, by using the method outlined below, it
can be quickly shown that, for any δ ∈ (0, 1), it is impossible to induce a cooperative
state with a grim-trigger strategy. (“Grim trigger” is emphasized only to simplify
computations.)
These three games illustrate different interesting features that are strictly due
to behavioral G^B terms in Eq. 12. This must be the case because games G2, G3
and G4 are strategically equivalent in the sense that G2^N = G3^N = G4^N. This equality
means that all possible differences among these games in achieving cooperation are
strictly caused by differences in the games' behavioral G^B components.
To describe the G^B differences, the G2^B choice has its Pareto superior entry
agreeing with the location of the G2^N Pareto superior entry; this agreement ensures
that G2 has a dominant strategy that also is the Pareto superior outcome. In
contrast, the G3^B choice has its Pareto superior entry located at the position of the
G^N Pareto inferior entry, and diametrically opposite the G^N Pareto superior term;
the features of this G3^B choice are what allow only two sustainable grim trigger
cooperative outcomes. Finally, G4^B was selected so that its features
prohibit any grim trigger sustainable cooperative outcome.
Another of our themes is captured by the similarity of games G3
and G4 : Player 1’s entries in both games are identical, and there are only minor
differences in the entries for the other two players. The point to be made is that
while the game matrices are similar, the cooperative structures with repeated play
differ significantly in that G3 can ensure cooperation, but G4 cannot.
As this example makes clear, changes in G^B, even small changes, can alter
what kinds of cooperation are possible. Indeed, the two games could have been
made to be much closer in appearance, but we opted to use choices where all
terms are integers. Stated in mathematical terms, by identifying the boundary of
G^B choices that separates different repeated game behaviors, the entries for the
two games can be selected to be arbitrarily close. As such and as asserted in our
198 DANIEL T. JESSIE AND DONALD G. SAARI
2. Sustaining Cooperation
Our next goal is to show how a game's G^B behavioral structure can change the
game’s cooperative properties. To prevent this discussion from becoming obscured
with technical computations, we consider only the commonly used grim-trigger and
tit-for-tat strategies. (Interested readers may wish to develop the properties of other
strategies.) Also, our intent differs from the usual objective; it is to determine what
happens in a repeated setting with all possible non-Nash outcomes. In this more
general way, our analysis includes as a special case the typical choice of exploring
whether a Pareto optimal outcome can be sustained.
The Eq. 10 games illustrate a feature of our more general perspective: While
G2 and G3 have clear candidates for the cooperation state, BRBa for G2 and TLF
for G3 , it is not clear what should be selected for G4 . With G4 , for instance, the
three states of TRBa, TRF, and BLF are Pareto equivalent. But by characterizing
all sustainable outcomes in any game (independent of their Pareto properties), this
selection problem can be ignored; the choice becomes a special case of checking what
positive results emerge from the analysis. Namely, our more general approach of
examining all non-Nash outcomes sheds light on differing ways to define what is an
optimal outcome.
By characterizing which states are inducible in a repeated setting, we also
identify the role played by G^B in seeking cooperative outcomes and how the choice
of G^B can change the role of G^N in the analysis. To simplify the calculations,
the punishment state is given by a pure-strategy outcome. An interesting feature
is that the punishment state need not be an equilibrium of the single-shot game.
This flexibility allows for a minmax strategy against Player i to be i’s punishment.
2.1. The effect of the G^B component. It is instructive to observe how the
G^B structure changes the role played by G^N for a given game G. Should G^B have
a minimal impact on G^N (e.g., the magnitudes of the G^B entries are small
relative to G^N, or G^B supports the G^N structure by enhancing a Nash equilibrium,
as is true with G2^B), then G^B plays a minimal role in the analysis of G. Here,
most of what is interesting in the game can be attributed to the G^N component;
this is where what individuals obtain (the Nash equilibria) is determined by their
own strategic actions. But a stronger G^B term, such as with G3^B, provides an
inducement for the players to try to obtain a Pareto preferred outcome. Namely,
ways must be found to extract the benefits that are provided by the G^B component:
Doing so requires a level of cooperation among the players.
Our goal is to determine what it takes to ensure cooperation that will, for any
reason, support any specified non-Nash outcome for any game. For the desired
cooperation to be feasible, each player must receive a better outcome with the
targeted outcome than what the player could get from at least one other setting.
No assumption is made about the structure of this comparison, so this analysis
includes, as a special case, the Prisoners’ Dilemma (where each player’s comparison
entry is determined by the dominant strategy). In particular, the targeted outcome
need not be Pareto superior to the various Nash equilibria (Thm. 1.1); all that is
required is that the targeted outcome is better than some other outcome. Thus
this discussion applies to a wide selection of games.
A feature described below (Sect. 2.2) is that the effort needed to support a
non-Nash outcome always is accompanied with a temptation for certain players to
renege from cooperating. (This feature holds for the Prisoners' Dilemma, but it is not
obvious that it is true in general.) A reneging player could destroy the designated
cooperative outcome, so, if the targeted outcome is desired by the other players,
they must adopt some form of collective action to enforce cooperation. A natural
approach is to make it expensive to renege. Doing so may not be possible in a
one-shot game, but the added opportunities offered by a repeated game may make
it an option. Strategies such as the grim trigger and tit-for-tat share this objective;
they differ in their efficiency and effectiveness.
To make “reneging” costly, a game must have a sufficiently distasteful aspect
that can be converted into an enforcement tool; the goal is to have the non-
cooperative player suffer this distasteful punishment. The question is whether a
game always provides such enforcement tools. Surprisingly, it always does; this
enforcing tool comes from the structure of G^N!
In summary, for any given game G, the role played by G^N is influenced by the
G^B structure. When G^N is the dominating component of the game, it enjoys a
positive, rewarding image of determining what should happen. But with changes
in a game due to G^B inducements, the image of G^N now changes from describing
rewards to that of providing a means for punishment for the lack of cooperation.
The answer depends on the effect of the future penalties on Player 1 as determined
by the player’s discount factor δ ∈ (0, 1).
Theorem 2.1. In an n ≥ 2 player game, suppose with a targeted, non-Nash
cooperative outcome, Player 1, with discount rate δ ∈ (0, 1), must select a strategy
yielding the smaller of {−|ηi,1| + βi,1, |ηi,1| + βi,1} as identified in Eq. 14. To enforce
this cooperative action, the other (n − 1) players use the grim trigger option that
would force Player 1 to select between |ηj,1| + βj,1 and −|ηj,1| + βj,1. It is in Player
1's interest to cooperate if
\[ \left(1 - \frac{2}{\delta}\right)|\eta_{i,1}| + \beta_{i,1} > |\eta_{j,1}| + \beta_{j,1}. \tag{15} \]
If the other players adopt a tit-for-tat strategy, then it is in Player 1's interest
to cooperate if
\[ \left(-1 - \frac{2}{\delta}\right)|\eta_{i,1}| + \beta_{i,1} > -|\eta_{j,1}| + \beta_{j,1}. \tag{16} \]
Proof. Suppose Player 1 is facing opponents who have implemented a grim
trigger in which he can play Top (Cooperate) for a payoff of −|ηi,1| + βi,1 ad
infinitum, or play Bottom (Defect) for a one-time payoff of |ηi,1| + βi,1. But if our
player did not cooperate, the grim trigger ensures that the subsequent payoffs are
|ηj,1| + βj,1 ad infinitum. This means that Player 1 will cooperate if
\[
\begin{aligned}
\sum_{t=1}^{\infty} \delta^{t-1}\left(-|\eta_{i,1}| + \beta_{i,1}\right) &> \left(|\eta_{i,1}| + \beta_{i,1}\right) + \sum_{t=2}^{\infty} \delta^{t-1}\left(|\eta_{j,1}| + \beta_{j,1}\right), \\
\frac{-|\eta_{i,1}| + \beta_{i,1}}{1-\delta} &> |\eta_{i,1}| + \beta_{i,1} + \delta\,\frac{|\eta_{j,1}| + \beta_{j,1}}{1-\delta}, \\
-|\eta_{i,1}| + \beta_{i,1} - (1-\delta)\left(|\eta_{i,1}| + \beta_{i,1}\right) &> \delta\left(|\eta_{j,1}| + \beta_{j,1}\right), \\
(\delta - 2)|\eta_{i,1}| + \delta\beta_{i,1} &> \delta\left(|\eta_{j,1}| + \beta_{j,1}\right).
\end{aligned}
\tag{17}
\]
Dividing the last inequality by δ yields Eq. 15.
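If Player 1 is faced with opponents who are playing tit-for-tat, then the grim-trigger
inequality in Eq. 15 must hold or Player 1 will never cooperate, as he retains the
ability to defect ad infinitum. However, there is also the possibility of first
cooperation, then defection, then cooperation, etc. In order for cooperation to hold
in this case, it is also needed that
\[
\begin{aligned}
\sum_{t=1}^{\infty} \delta^{t-1}\left(-|\eta_{i,1}| + \beta_{i,1}\right) &> |\eta_{i,1}| + \beta_{i,1} + \delta\left(-|\eta_{j,1}| + \beta_{j,1}\right) + \delta^{2}\left(|\eta_{i,1}| + \beta_{i,1}\right) + \delta^{3}\left(-|\eta_{j,1}| + \beta_{j,1}\right) + \cdots, \\
\sum_{t=1}^{\infty} \delta^{t-1}\left(-|\eta_{i,1}| + \beta_{i,1}\right) &> \sum_{t=1}^{\infty} \delta^{2(t-1)}\left(|\eta_{i,1}| + \beta_{i,1}\right) + \delta \sum_{t=1}^{\infty} \delta^{2(t-1)}\left(-|\eta_{j,1}| + \beta_{j,1}\right), \\
\frac{-|\eta_{i,1}| + \beta_{i,1}}{1-\delta} &> \frac{|\eta_{i,1}| + \beta_{i,1}}{1-\delta^{2}} + \delta\,\frac{-|\eta_{j,1}| + \beta_{j,1}}{1-\delta^{2}}, \\
\left(\frac{1+\delta}{1+\delta}\right)\frac{-|\eta_{i,1}| + \beta_{i,1}}{1-\delta} &> \frac{|\eta_{i,1}| + \beta_{i,1}}{1-\delta^{2}} + \delta\,\frac{-|\eta_{j,1}| + \beta_{j,1}}{1-\delta^{2}}, \\
(1+\delta)\left(-|\eta_{i,1}| + \beta_{i,1}\right) &> |\eta_{i,1}| + \beta_{i,1} + \delta\left(-|\eta_{j,1}| + \beta_{j,1}\right), \\
(-\delta - 2)|\eta_{i,1}| + \delta\beta_{i,1} &> \delta\left(-|\eta_{j,1}| + \beta_{j,1}\right).
\end{aligned}
\]
This inequality is equivalent to Eq. 16.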
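
As a numerical sanity check on Thm. 2.1, the following Python sketch evaluates Eqs. 15 and 16 directly and compares Eq. 15 against truncated versions of the discounted sums in the first line of Eq. 17. The function names, the truncation horizon, and the sample values (|η| = 2, a cooperative β of 3, and a punishment β of −4) are illustrative assumptions and not part of the original analysis.

```python
def grim_trigger_ok(eta_i, beta_i, eta_j, beta_j, delta):
    """Eq. 15: (1 - 2/delta)|eta_i| + beta_i > |eta_j| + beta_j."""
    return (1 - 2 / delta) * abs(eta_i) + beta_i > abs(eta_j) + beta_j


def tit_for_tat_ok(eta_i, beta_i, eta_j, beta_j, delta):
    """Eq. 16: (-1 - 2/delta)|eta_i| + beta_i > -|eta_j| + beta_j."""
    return (-1 - 2 / delta) * abs(eta_i) + beta_i > -abs(eta_j) + beta_j


def grim_trigger_by_sums(eta_i, beta_i, eta_j, beta_j, delta, horizon=2000):
    """First line of Eq. 17, with both infinite sums truncated at `horizon`."""
    # Cooperate forever: payoff -|eta_i| + beta_i in every round.
    cooperate = sum(delta ** t * (-abs(eta_i) + beta_i) for t in range(horizon))
    # Defect once for |eta_i| + beta_i, then receive |eta_j| + beta_j forever.
    defect = (abs(eta_i) + beta_i) + sum(
        delta ** t * (abs(eta_j) + beta_j) for t in range(1, horizon)
    )
    return cooperate > defect


if __name__ == "__main__":
    # Hypothetical values: |eta| = 2, cooperative beta = 3, punishment beta = -4.
    for delta in (0.3, 0.5, 0.6, 0.9):
        print(
            delta,
            grim_trigger_ok(-2, 3, -2, -4, delta),
            grim_trigger_by_sums(-2, 3, -2, -4, delta),
            tit_for_tat_ok(-2, 3, -2, -4, delta),
        )
```

For these sample values the closed-form test and the truncated sums agree, and cooperation becomes sustainable only once δ exceeds 4/7.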
Equations 15 and 16 completely characterize all pairs of states (the targeted and
the penalty states) in which cooperation is inducible against strategic interests with
either a grim trigger or a tit-for-tat strategy, respectively. Also, for δ ∈ (0, 1), the
left-hand side of these inequalities is unbounded below whenever ηi,1 ≠ 0, so all
statements must be qualified with suitable conditions on the discount factor.
As Eqs. 15 and 16 show, equilibrium outcomes in repeated games are affected by
βi,1 , βj,1 values; this information is totally ignored in the single-shot case. This is
because βi,j values are irrelevant when computing a single-shot Nash equilibrium.
(See the discussion following Thm. 1.1.) On the other hand and as demonstrated
in these two equations, βi,j values have a large impact on whether or not cooper-
ation can be induced in repeated games. Also note that the phrase “for δ large
enough” could be substituted with “for non-strategic interests large enough” (i.e.,
“for sufficiently large βi,1 values”), or “for strategic factors small enough” (i.e., “for
sufficiently small ηi,1 values”), when describing results on cooperative outcomes.
This analysis captures one of our goals, which is to indicate how the game
structure (the level of inducement given by G^B) can be modified to attain and
sustain cooperation. Furthermore, these equations also identify the type of
information that parameter δ measures; as δ is a coefficient only for the ηi,j terms,
non-strategic interests (i.e., G^B terms) can be altered without affecting the standard
requirements of a “suitable δ.”
2.3. An application. To support our assertions about the three Eq. 10 games
by using Thm. 2.1, notice that when |ηi,1 | = |ηj,1 |, the inequalities in Eqs. 15
and 16 are identical. It is only when |ηi,1 | > |ηj,1 | that the tit-for-tat inequality
imposes a greater restriction. If |ηi,1 | < |ηj,1 |, then the grim trigger inequality is
stricter; it must also hold for the tit-for-tat strategy to sustain cooperation. This
demonstrates that the strength of the inducement strategy is a function of the G^N
strategic structure of the game; in some games, a tit-for-tat strategy is not stronger
than a grim-trigger strategy. For the games in Eq. 10, the two are equivalent, which
is clear with the Eqs. 11, 12 decompositions.
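
The comparison between the two inequalities can be made concrete by rewriting each as a lower bound on βi,1 − βj,1. The sketch below does this; the function names and the sample η values are hypothetical, and the rearrangement is simply Eqs. 15 and 16 with the β terms moved to one side.

```python
def required_gaps(eta_i, eta_j, delta):
    """Lower bounds on beta_i - beta_j implied by Eq. 15 and Eq. 16.

    Eq. 15 rearranged: beta_i - beta_j > |eta_j| - (1 - 2/delta)|eta_i|
    Eq. 16 rearranged: beta_i - beta_j > -|eta_j| + (1 + 2/delta)|eta_i|
    """
    a, b = abs(eta_i), abs(eta_j)
    grim = b - (1 - 2 / delta) * a
    tit_for_tat = -b + (1 + 2 / delta) * a
    return grim, tit_for_tat


def stricter(eta_i, eta_j):
    """The two bounds differ by 2(|eta_i| - |eta_j|), independent of delta."""
    a, b = abs(eta_i), abs(eta_j)
    if a > b:
        return "tit-for-tat"
    if a < b:
        return "grim trigger"
    return "equal"


def tit_for_tat_requirement(eta_i, eta_j, delta):
    """Tit-for-tat needs both inequalities, so the larger bound applies."""
    return max(required_gaps(eta_i, eta_j, delta))


if __name__ == "__main__":
    # Hypothetical eta values chosen only to show the three cases.
    for eta_i, eta_j in ((2, 2), (3, 1), (1, 3)):
        print(eta_i, eta_j, required_gaps(eta_i, eta_j, 0.5), stricter(eta_i, eta_j))
```

Because the difference between the two bounds does not involve δ, which strategy is more demanding depends only on the strategic terms, in line with the observation above.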
To determine which states are equilibria in a repeated setting for games G3 and
G4, first note that the punishment for defection can be assumed to be BRBa as this
is both the minmax against each player and the Nash equilibrium with the lowest
payoff. Because ηi,j = −2 for all i, j, Eq. 15 gives
\[ 2 - \frac{4}{\delta} + \beta_{i,j} > 2 - 4 \;\Rightarrow\; \beta_{i,j} > -4 + \frac{4}{\delta}. \tag{18} \]
This inequality makes it easy to determine the equilibrium states; just check
whether the βi,j values satisfy this inequality. In particular, if δ > 4/7, then a state
is sustainable if βi,j ≥ 3 for every Player j. This value holds only for the TLF
outcome in G3. Increasing δ values make it easier to sustain cooperation; indeed,
for δ > 4/5, a state is sustainable if βi,j ≥ 1. This means that TRF in G3 also is an
equilibrium. The limiting case as δ → 1 is βi,j > 0. No other outcome in either G3
or G4 gives each player βi,j > 0, so there are no more equilibria. So, searching
for all equilibria in a repeated setting can simplify the computations. Namely, by
developing a characterization of all states that are sustainable in a repeated setting,
and doing this in terms of the structure of the game, the amount of calculating can
be substantially reduced.
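
The thresholds 4/7 and 4/5 can be recovered by solving Eq. 18 for δ. The sketch below assumes, as read off from Eq. 18, that |η| = 2 for every player and that the punishment state contributes β = −4; the function names and the β profiles passed to them are hypothetical and are not the actual entries of G3 or G4.

```python
from fractions import Fraction


def delta_threshold(beta, eta_abs=2, punish_beta=-4):
    """Smallest delta bound from Eq. 15 when |eta_i| = |eta_j| = eta_abs.

    With equal |eta|, Eq. 15 reduces to beta - punish_beta > 2*eta_abs/delta,
    i.e. delta > 2*eta_abs / (beta - punish_beta). Integer inputs are assumed.
    Returns None when no delta can satisfy the inequality.
    """
    gap = beta - punish_beta
    if gap <= 0:
        return None
    return Fraction(2 * eta_abs, gap)


def sustainable(betas, delta, eta_abs=2, punish_beta=-4):
    """A state is sustainable if every player's beta clears the bound."""
    for beta in betas:
        bound = delta_threshold(beta, eta_abs, punish_beta)
        if bound is None or not delta > bound:
            return False
    return True


if __name__ == "__main__":
    print(delta_threshold(3))  # bound on delta when beta = 3
    print(delta_threshold(1))  # bound on delta when beta = 1
    print(delta_threshold(0))  # a bound of 1 cannot be met by delta < 1
    # Hypothetical beta profiles for three players.
    print(sustainable([3, 3, 3], 0.6), sustainable([3, 1, 3], 0.6))
```

A β of 0 yields the bound δ > 1, which no discount factor in (0, 1) satisfies; this matches the limiting case βi,j > 0 noted above.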
These examples highlight the importance of the βi,j values in determining the
repeated game equilibria. This relationship is captured by the following theorem
where Gi ∼N Gj means that Gi^N = Gj^N. Also recall that 𝒢 is the space of all games;
in Thm. 2.2, 𝒢 can be the set of all 2 × 2 games or the set of all 2 × 2 × 2 games,
but the conclusion holds even if 𝒢 is the set of all 2 × · · · × 2 games with n agents.
Theorem 2.2. For any G ∈ 𝒢, any δ ∈ (0, 1), any choice of cooperation
outcome, and any different choice of punishment outcome, there exists a game G′ ∈
𝒢 such that G′ ∼N G and the cooperation outcome can be sustained in G′ with
either a tit-for-tat or grim-trigger strategy. Furthermore, the payoffs in G′ can be
restricted to any interval in R.
Proof. For any given G, the βi,j values can be adjusted freely. Eqs. 15 and
16, as well as the Nash component G^N, are invariant under both positive scalar
multiplication and addition by a constant to all of Player i's payoffs.
Thm. 2.2 demonstrates the importance of the actual magnitudes of the payoff
values in repeated games. Often, inspiration for a repeated
game comes from a story about the Nash equilibrium structure of the single-shot
game. But as this theorem shows, information from a single shot analysis is insuf-
ficient to determine what might happen in a repeated game analysis. Furthermore,
Thm. 2.2 shows that there can be hidden difficulties in picking a “representative
game” from an equivalence class of games, such as a reduction of a game to its
ordinal information. An alternative approach is to use the Eq. 15, 16 inequalities
to identify classes of relevant games in which a desired outcome holds.
2.4. Encouraging cooperation. In the introductory comments it was mentioned
that the behavioral structure of a game G^B can be altered to encourage
cooperation in the repeated setting without affecting the strategic structure G^N.
Information about how to do so follows directly from Eqs. 15 and 16.
The approach is immediate; increasing the βi,1 − βj,1 value improves the
incentives for Player 1 to cooperate. For instance, as G4 does not have a sustainable
equilibrium in a repeated game, can G4 be altered without affecting its strategic
structure so that the resulting repeated game now has such an equilibrium?
To illustrate how to do so, select a non-Nash entry from G4 ; any one will do.
Choose, for instance, BLF where β2,2 = 2 and β3,3 = 0. (For a reminder of the βi,j
notation, see Eq. 6.) With the given Nash structure, a way to make this equilibrium
sustainable in a repeated setting is to increase both β values to at least 3 (Eq. 18).
But changing β2,2 from 2 to 3 changes the second player's G^B entries in BLF and
in BRF, and changing β3,3 from 0 to 3 changes the third player's G^B entries in
BLF and in BLBa. In other words, these inducements must be of the kind that are
available whether the player cooperates, or caves to temptation.
The nature of these β terms depends upon what is being modeled. If, for
instance, the payoffs related to the G4^N component represent what can happen with
military action, the βi,j entries may represent added benefits for the player under
peace. In fact, in practice, while the strategic G^N aspects of a game may remain
fixed, negotiations typically involve changing a game (changing the payoffs) to
induce players to cooperate. These changes, then, are reflected by what it takes to
alter the G^B portion of a game to attain cooperation.
A second issue is to determine what is needed to encourage cooperation with
players who have less regard for the future as captured by a smaller δ value. With
3. Cooperation Inducement
Aside from payoff specification, a difficulty in analyzing repeated games comes
from having additional players. As it was pointed out in Sect. 1.3 when discussing
properties of G3 , results obtained from simplifying a situation to two players do not
necessarily reflect what occurs in a more realistic setting with added players.
In [1], Brams and Kilgour analyze the cooperation inducements in a 2-player
repeated setting. They show that by giving one player the common knowledge
ability to receive a noisy signal of the opponent’s strategy before play, a probabilistic
tit-for-tat strategy can induce a cooperative outcome in a significant proportion of
games. A natural question is whether this kind of conclusion extends to 3 players.
In a 2-player setting, giving Player 1 the common knowledge ability to observe
the opponent’s strategy decreases, but does not eliminate, the incentive for defec-
tion: if Player 2 defects, the likelihood of receiving a higher one-time payoff is no
longer certain, but the punishment in the following round remains. To see this,
consider the following simplified 2-player game consisting only of Player 2’s payoffs
With this structure, Player 2 has the dominant strategy of R. So, assume that the
cooperative strategy for the game is TL. Without a detection mechanism, Player 2
will cooperate against a tit-for-tat strategy whenever Eq. 16 holds.
Now suppose there is a detection mechanism that reduces the probability of
reaching the state TR where Player 1 cooperates and Player 2 defects. Brams and
Kilgour provide a detailed discussion about the nature of detection mechanisms
and the different types of errors, but here it is assumed that the effectiveness of the
mechanism is captured by the p value in
This simple best-case¹ detector shows how the mechanism can change the
incentives for a player. Here, cooperation is preferred to defection ad infinitum if
\[ \sum_{t=1}^{\infty} \delta^{t-1}\left(-|\eta_{1,2}| + \beta_{1,2}\right) > (1-p)\left(|\eta_{1,2}| + \beta_{1,2}\right) + p\left(|\eta_{2,2}| + \beta_{2,2}\right) + \sum_{t=2}^{\infty} \delta^{t-1}\left(|\eta_{2,2}| + \beta_{2,2}\right). \tag{20} \]
The difference between Eqs. 17 and 20 is the first term on the right-hand side: In
Eq. 17, the defector receives the full benefit of |η1,2| + β1,2, but in Eq. 20, this higher
benefit is secured only with probability (1 − p), so the defector faces
the lower expected value of (1 − p)(|η1,2| + β1,2) + p(|η2,2| + β2,2). Indeed, with
no detector (so p = 0), Eqs. 17 and 20 are equivalent. As Eq. 20 also shows, as
the detector increases in accuracy (so p → 1), the expected benefit from defection,
(1 − p)(|η1,2| + β1,2) + p(|η2,2| + β2,2), decreases to |η2,2| + β2,2.
The detection mechanism, then, makes cooperation more likely by lowering the
benefit of defection. Because there are no qualitative differences comparing this
analysis to the non-detection mechanism case, calculations using the decomposi-
tion provide a simpler way to analyze extensions to the standard setting, and to
characterize differences between mechanisms. For instance, the detection mecha-
nism works by the single substitution in Eq. 20, which means that other mechanisms
can be contrasted by checking how they differ in terms of this modification.
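
To see the effect of the detection probability p numerically, Eq. 20 can be evaluated in closed form by summing the two geometric series. The sketch below is one way to do so; the function names and the sample payoffs (|η| = 2 in both states, β = 3 at the cooperative state, β = −4 at the punishment state) are assumptions made only for illustration.

```python
def expected_defection_payoff(eta_c, beta_c, eta_p, beta_p, p):
    """First-round payoff from defecting when detection succeeds with probability p."""
    return (1 - p) * (abs(eta_c) + beta_c) + p * (abs(eta_p) + beta_p)


def cooperates_with_detector(eta_c, beta_c, eta_p, beta_p, delta, p):
    """Closed form of Eq. 20.

    (eta_c, beta_c) play the role of (eta_{1,2}, beta_{1,2}) and
    (eta_p, beta_p) the role of (eta_{2,2}, beta_{2,2}).
    """
    # Cooperate forever: geometric series with ratio delta.
    cooperate = (-abs(eta_c) + beta_c) / (1 - delta)
    # Defect: expected one-time payoff, then the punishment payoff forever.
    defect_now = expected_defection_payoff(eta_c, beta_c, eta_p, beta_p, p)
    punished = delta * (abs(eta_p) + beta_p) / (1 - delta)
    return cooperate > defect_now + punished


if __name__ == "__main__":
    # Hypothetical payoffs; delta = 0.5 is below the no-detector threshold 4/7.
    for p in (0.0, 0.25, 0.5, 1.0):
        print(p, cooperates_with_detector(2, 3, 2, -4, 0.5, p))
```

With these sample values, a discount factor that is too small to sustain cooperation without a detector becomes sufficient once p is large enough, which is the qualitative point made above.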
3.1. Iran, Israel, and the United States. A feature developed above is how
adding a third player expands the possibilities of what can happen with repeated
interactions. This attribute makes it worth considering the interaction between
Iran and Israel discussed in [1], but now including the United States.
In this game, Iran has a dominant strategy to develop nuclear capacity (defect),
whereas Israel and the U.S. are seeking to sustain the cooperative outcome in which
Iran does not advance its nuclear ability. As Brams and Kilgour demonstrate,
a strategy detection mechanism for Israel in the two-player game induces Iran's
cooperation against Israel's credible tit-for-tat strategy.
The need is to understand whether Iran might gain new options with the in-
clusion of the United States, given the detection mechanism possessed by Iran’s
opponents. What situation actually obtains depends upon the strategic structure,
where Iran has a dominant strategy, but also upon the particular payoffs chosen to
represent the game. Because of this, only qualitative differences are discussed to
highlight problems in generalizing results that are based on simplified scenarios.
As we have shown, a characteristic of dominant strategy games with more than
two players is that there can be more than a single “cooperative” outcome. (In
what follows, for a two-player game, the ordering of players is (Israel, Iran); for three
players it is (Israel, Iran, US).) That is, even if (C, C, C) is possible, it might also be
possible to sustain (C, D, C) (where Iran defects) even if the analogous two-player
situation (C, D) is unrealizable. In terms of the case study, this possibility depends
on the relationship of payoffs between Israel and the U.S., which is ignored in the
two-player case. As illustrated with G3, the (C, D, C) outcome can be sustainable
even if both Israel and the U.S. have strategic incentives to play D. With G3, each
player has a dominant strategy, but there are multiple sustainable states. (If Iran's
strategy could, with common knowledge, be observed before the U.S. and Israel act,
perhaps the likelihood of (C, D, C) could be reduced, if not eliminated altogether.)

¹ This is a best-case detector because having Pr(signal Left | played Left) < 1 would decrease
the incentive for cooperation by including the possibility of Player 1 defecting against Player 2's
cooperation.

The advantage of the detection mechanism in the two player case is that
the dominant strategy of defection becomes less profitable, so the advantage lies
with Israel. With multiple sustainable states, however, the detection mechanism
may assume another use that could benefit Iran: It could become a signaling de-
vice. Namely, Iran could signal in advance which equilibrium it prefers, perhaps
(C, D, C). If we take the commitment to develop nuclear capacity even in the face
of international sanctions as a pre-commitment to defection ad infinitum, which is
not an unreasonable interpretation, Iran is playing the game of equilibrium selec-
tion, acting on the belief that (C, D, C) is sustainable. An alternative possibility is
where the US payoff structure tolerates a higher level of Iranian nuclear capability
or places a different premium on cooperative benefits (βi,j terms) than accepted by
Israel to make (C, D, C) a sustainable equilibrium in a repeated setting. Such an
equilibrium would be manifested by continued Iranian nuclear activity (the D strat-
egy), relaxed sanctions (tit-for-tat punishment is not needed for an equilibrium)
even with, perhaps, Israeli objections (reflecting that (C, D) is not a two-player
equilibrium). The Sect. 2.4 description suggests how to use G^B to further modify
the game (e.g., through negotiations) to attain other equilibria.
That this signaling feature can arise in games with multiple cooperation states
was recognized and side-stepped in [1] by assuming one player had a dominant
strategy. With more players, however, this strategic assumption no longer remains
a valid way to avoid the problem, which means that these 2 × 2 results cannot be
generalized beyond the two-player case. The reason for limiting the analysis to two
players was to avoid the increasing number of possible games in the combinatorial
approach of Robinson & Goforth [5]. Instead of this method, the foundation for
analysis can be simplified to Eq. 16 or 20 by using the strategic decomposition.
4. Conclusion
Although the concerns raised here about the typical approach to analyzing
repeated interactions are known, there was no efficient method of addressing them.
By applying the decomposition to repeated games, analysis can be extended from
isolated cases to classes of games sharing a property of interest. Furthermore,
the decomposition can be used not only to analyze payoff structures, but also the
different strategies and inducement mechanisms used in a repeated setting.
References
[1] Brams, S., and Kilgour, D., Inducible Games: Using Tit-for-Tat to Stabilize Outcomes, Work-
ing Paper
[2] Hopkins, B., “Exploring and expanding the Robinson Goforth system for 2×2 games,” Presen-
tation, Special Session on Mathematics of Decisions, Elections, and Games, annual conference,
American Mathematical Society, Jan. 11, 2013, San Diego, CA.
[3] Jessie, D., and Saari, D., Strategic and behavioral decomposition of 2 × 2 × · · · × 2 Games,
Technical Report, April 2013, Institute for Mathematical Behavioral Sciences, University of
California, Irvine.
[4] Jones, M. A., Nash Equilibrium (pure and mixed), Sect. 3.3.3.2 in Encyclopedia of Operations
Research and Management Science (Ed. by T.A. Cox, Jr.), Wiley Science, January 2011.
[5] Robinson, D., and Goforth, D., The topology of the 2 × 2 games: A new periodic table,
Routledge Advances in Game Theory, Routledge/Taylor & Francis Group, New York, 2005.
MR2547198 (2011c:91003)
1. Introduction
The Talmud is a collection of ancient texts that document and interpret Jewish
criminal, civil, and religious law. The 2000-year-old Babylonian Talmud gave three
examples of instances in which an estate is divided among creditors whose claims
sum to more than the estate; these examples are commonly referred to as bank-
ruptcy problems. A more general way of allocating an estate among three or more
creditors was not described in the Talmud, leaving Rabbinic scholars (and eventually
game theorists) to wonder how the allocations in the examples were determined.
Of the three instances, the first case (first column in Table 1) in which an estate
of 100 (of your favorite monetary unit) was divided among three claimants with
claims of 100, 200, and 300 made sense because it was evenly divided among the
three creditors, despite their different claim sizes. The third case (third column in
Table 1) in which 300 was to be divided among the creditors was appealing because
it coincided with allocating the estate proportionally based on the amount owed
each creditor. The real riddle was whether a single rule was used to generate the
data (as opposed to separate rules being used for different sizes of the estates) and
how a single rule could describe the extremes of the first and third cases, as well as
the less intuitive middle case.
The Babylonian Talmud was clearer on how to solve similar bankruptcy-type
problems when there are only two claimants, an approach now known as the Con-
tested Garment rule. The data in Table 1 has an interesting property as it relates
to the Contested Garment rule. If the amount awarded to any two of the three
2014
c American Mathematical Society
207
208 MICHAEL A. JONES AND JENNIFER M. WILSON
Table 1
                 Estate
Claims     100       200     300
100        100/3     50      50
200        100/3     75      100
300        100/3     75      150
claimants is summed and then re-allocated to the two claimants using the Con-
tested Garment rule, then the two claimants would receive their original amounts.
For example, in the second column of Table 1, claimants 1 and 2 (with claims of
100 and 200, respectively) would receive 50 and 75, respectively, if they were to use
the Contested Garment rule to divide 125 (= 50 + 75) based on their respective
claims, matching the data from the table.
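
The pairwise property can be checked mechanically for every column of Table 1. The sketch below uses the standard two-claimant formulation of the Contested Garment rule (each claimant concedes to the other whatever part of the estate exceeds his or her own claim, and the remainder is split equally), which is stated here as an assumption because the rule has not yet been defined formally at this point in the text. The function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def contested_garment(estate, c1, c2):
    """Two-claimant Contested Garment division (assumes estate <= c1 + c2)."""
    estate = Fraction(estate)
    conceded_to_1 = max(estate - c2, Fraction(0))  # part claimant 2 does not claim
    conceded_to_2 = max(estate - c1, Fraction(0))  # part claimant 1 does not claim
    contested = estate - conceded_to_1 - conceded_to_2
    return conceded_to_1 + contested / 2, conceded_to_2 + contested / 2


def pairwise_consistent(claims, awards):
    """True if every pair's awards match a Contested Garment split of their sum."""
    for i, j in combinations(range(len(claims)), 2):
        pot = awards[i] + awards[j]
        if contested_garment(pot, claims[i], claims[j]) != (awards[i], awards[j]):
            return False
    return True


CLAIMS = [100, 200, 300]
# Awards from Table 1, keyed by the size of the estate.
TABLE_1 = {
    100: [Fraction(100, 3)] * 3,
    200: [50, 75, 75],
    300: [50, 100, 150],
}

if __name__ == "__main__":
    print(contested_garment(125, 100, 200))
    for estate, awards in TABLE_1.items():
        print(estate, pairwise_consistent(CLAIMS, awards))
```

Exact rational arithmetic is used so that the 100/3 entries compare without rounding error. A proportional split of an estate of 200, by contrast, fails the same pairwise test.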
One can ask: What would the claimants do if the amounts allocated to one or
more pairs of claimants did not match what they would receive under the Contested
Garment rule? How could the claimants adjust the claims to arrive at an n-claimant
solution? In response to these motivating questions, we introduce a dynamic process
based on averaging the k-claimant solution over all sets of k players and show that
for certain bankruptcy rules the process converges to the n-claimant solution. Our
process generalizes a dynamic approach introduced by Dagan and Volij [5].
Others have been concerned both with how knowledge of a 2-claimant solution
can be used to determine the n-claimant solution (described as a notion of consis-
tency) and how dynamic processes can be used to converge to n-claimant solutions
for bankruptcy problems and, more broadly, for cooperative games. We provide
an extensive literature review in Section 2. In Section 3, we define the bankruptcy
problem and review a number of well-known bankruptcy rules, including the TAL
rules and the Minimal Overlap rule. We also discuss what it means for a rule
to be consistent. In Section 4, we introduce the k-averaging dynamic process for
bankruptcy rules and show that if the rule is consistent, then the n-claimant solu-
tion is the unique attractive fixed point of the dynamic process. We also provide
an alternate proof of this result for the Proportional rule and consider an exam-
ple that shows that an n-claimant solution may not be a fixed point if the rule is
not consistent. In Section 5, we provide an alternate proof of the convergence of
the 2-averaging dynamic process for TAL rules. At the beginning of this section,
we also discuss when this convergence occurs in a finite number of iterations, and
when the convergence is asymptotic; this is formalized in Proposition 5.10. Section
6 provides concluding remarks about how to extend the dynamic process to other
network topologies.
2. Literature Review
The formal study of bankruptcy rules originated in the foundational papers
by O’Neill [16] and Aumann and Maschler [2]. O’Neill [16] introduced the bank-
ruptcy problem through two passages from the Babylonian Talmud. He examined
the assumptions behind a solution proposed by Rabbi Ibn Ezra, and proposed an
amendment in the case when no player claims the entire estate. He compared this
THE DYNAMICS OF CONSISTENT BANKRUPTCY RULES 209
amended solution, now called the Minimal Overlap rule, to the results obtained
using other solution methods including the method of Random Claims, the Pro-
portional rule, and allocations based on non-cooperative game models (the Nash
equilibrium) and cooperative game models (the Shapley value). He also introduced
a notion of consistency which is slightly different from the current use of the term.
Under this notion, for a fixed solution method, each player in turn is awarded their
full claim, and the remaining estate is allocated to the remaining players using the
fixed method. The results are averaged for each player, and the method is called
consistent if each player’s average is equal to the amount they would have received
under that method if applied to the original problem.
Aumann and Maschler [2] formally defined the bankruptcy problem, and il-
lustrated a couple of proposed solutions to bankruptcy problems in the Talmud,
including the Contested Garment rule, as well as the data in Table 1. They defined
a consistent solution to a bankruptcy problem as one that for each pair of players
reduces to the Contested Garment rule, and showed that each bankruptcy problem
has a unique consistent solution: what is now called the Talmud Rule. They showed
how the Talmud rule can be viewed as a combination of two other bankruptcy rules:
the Constrained Equal Awards and Constrained Equal Losses rules. For a rule to
satisfy their definition of self-consistency (now called consistency), its restriction to
any subset of players yields the same solution as the bankruptcy problem restricted
to that subset. Finally, they modeled a bankruptcy problem as a cooperative game,
and showed that the Talmud rule is equal to the nucleolus of that game. This was
somewhat surprising, because the nucleolus was only defined by Schmeidler [17] in
1969, yet the Talmud rule dates back thousands of years.
Others have proposed and analyzed other solutions for bankruptcy problems.
Hokari and Thomson [7] proposed a class of consistent rules that associate a weight
with each player to model situations in which some bias or priority in favor of some
players is desired. Moreno-Ternero and Villar [15] proposed a one-parameter fam-
ily of rules, called the TAL rules, that generalizes the Talmud, Constrained Equal
Awards and Constrained Equal Losses rules. They also verified that the TAL rules
satisfy a number of properties, including consistency, continuity and anonymity,
estate and claims monotonicity, and that these rules are order-preserving and ho-
mogeneous. Moreno-Ternero [14] defined a generalization of the TAL family in the
context of taxation problems (where the goal is to distribute a tax burden, and in-
dividuals’ taxable incomes serve as upper bounds for their allocations), which fixes
minimum and maximum tax rates. Thomson [20] defined two classes of bankruptcy
rules, the ICI and CIC rules. The ICI class contains the TAL family and the Min-
imal Overlap rule; the CIC class contains Constrained Equal Awards, Constrained
Equal Losses, as well as the Reverse Talmud rule. Thomson identified properties
of each family and their consistent subclasses. Other solutions can be obtained
by modeling the bankruptcy problem as a bargaining problem, a non-cooperative
game or a cooperative game, and applying known solution methods. See Thomson
[19] for a survey that focuses on axiomatic treatments.
The concept of consistency, which requires that a bankruptcy rule coincide
with its restriction to arbitrary subsets of players, has played a fundamental role in
axiomatizations and interpretations of bankruptcy problems. Although consistency
has been used as a mechanism for applying the bankruptcy rule, Thomson [22]
argued that consistency can best be understood as an expression of fairness by
210 MICHAEL A. JONES AND JENNIFER M. WILSON
to the lexicographic kernel. These results were all based on discrete dynamical
approaches; other research has focused on differential methods (see, for example,
Kalai, Maschler and Owen [11]). More relevant to this article is the dynamic pro-
cess outlined in Maschler and Owen [12], in which the authors define a generalized
Shapley value on the set of hypergames, and then show that the solution can be
obtained as the limit of a sequence of allocations. Given a Pareto optimal initial
allocation, each subsequent allocation is inductively defined based on the Shapley
value of a set of reduced hypergames, whose values are dependent on the previous
allocation. They proved that for a wide range of initial allocations, the dynamic
process converges to the correct value.
Dynamic frameworks have also been explored in other fair division problems.
Moreno-Ternero [14] looked at the dynamics of voting among taxation rules, and
showed that within the generalized TAL family of piecewise linear taxation rules,
there is a unique taxation rule that is approved by a majority of voters. Fleurbaey
and Roemer [6] defined a dynamic approach to bargaining, in which a sequence of
bargaining solutions is chosen to minimize penalties derived from violating certain
desired axioms, and proved that the sequence converges almost surely to the Nash
bargaining solution under a wide range of penalties. Finally, Hougaard, Moreno-
Ternero, and Østerdal [9] defined a mechanism to generalize bankruptcy rules to
bankruptcy problems with baselines, and proposed an interpretation in which a
sequence of baselines is chosen in a dynamic process to capture a series of allocations
over a discrete number of time periods.
Aumann and Maschler [2] introduced the Talmud rule, which extends the so-
lutions given in Table 1 to any n-player bankruptcy problem. When the estate is
small relative to the sum of the claims (i.e., E ≤ D/2), then the Talmud rule applies
the Constrained Equal Awards rule to the claims vector d/2. When the estate is
large enough (i.e., E > D/2), then the Talmud rule applies the Constrained Equal
Losses rule on E − D/2 for the claims vector d/2 after allocating each player half of
its claim. Introduced by Chun et al. [3], the Reverse Talmud rule does the reverse,
for E ≤ D/2: using the Constrained Equal Losses rule, and for E > D/2: using
the Constrained Equal Awards rule on E − D/2 after allocating each player half of
its claim.
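The case split described above can be sketched in a few lines of Python. This is
a minimal illustration under stated assumptions, not the authors' implementation:
the function names cea, cel, talmud and reverse_talmud are introduced here for
illustration, and exact rational arithmetic is used so that thirds are not rounded.

```python
from fractions import Fraction


def cea(claims, estate):
    """Constrained Equal Awards: equal awards, capped at each claim."""
    claims = [Fraction(c) for c in claims]
    remaining = Fraction(estate)
    awards = [Fraction(0)] * len(claims)
    # Visit players from smallest to largest claim; each takes an equal
    # share of what is left, but never more than the claim.
    order = sorted(range(len(claims)), key=lambda i: claims[i])
    for pos, i in enumerate(order):
        share = remaining / (len(claims) - pos)
        awards[i] = min(claims[i], share)
        remaining -= awards[i]
    return awards


def cel(claims, estate):
    """Constrained Equal Losses: the losses d_i - x_i are equal awards."""
    claims = [Fraction(c) for c in claims]
    losses = cea(claims, sum(claims) - Fraction(estate))
    return [c - l for c, l in zip(claims, losses)]


def talmud(claims, estate):
    """CEA on d/2 if E <= D/2; otherwise d/2 plus CEL on d/2 with E - D/2."""
    half = [Fraction(c) / 2 for c in claims]
    estate = Fraction(estate)
    if estate <= sum(half):
        return cea(half, estate)
    extra = cel(half, estate - sum(half))
    return [h + e for h, e in zip(half, extra)]


def reverse_talmud(claims, estate):
    """Same split at D/2, with the roles of CEA and CEL exchanged."""
    half = [Fraction(c) / 2 for c in claims]
    estate = Fraction(estate)
    if estate <= sum(half):
        return cel(half, estate)
    extra = cea(half, estate - sum(half))
    return [h + e for h, e in zip(half, extra)]
```

For the claims d = (100, 200, 300), calling talmud([100, 200, 300], E) for
E = 100, 200 and 300 gives an equal split, then (50, 75, 75), then (50, 100, 150).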
There are a number of generalizations of the Talmud rule. One of these is the
set of TAL rules introduced by Moreno-Ternero and Villar [15]. The set of TAL
rules is a parametrized class of bankruptcy rules for which the Talmud rule, the
Constrained Equal Awards rule and the Constrained Equal Losses rule are special
cases. For a fixed θ ∈ [0, 1], the TAL rule with parameter θ uses the Constrained
Equal Awards rule for E ≤ θD and the claims vector θd and the Constrained Equal
Losses rule on E − θD for the claims vector (1 − θ)d, after giving each player θ of
its claim, for E > θD. (TAL derives its acronym from T for Talmud, A for awards,
and L for losses.) The TAL rule with parameter θ can be written as
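\[
T_i^{\theta}(d, E) =
\begin{cases}
EA_i(\theta d, E) & \text{if } E < \theta D, \\
\theta d_i + EL_i\bigl((1-\theta)d,\; E - \theta D\bigr) & \text{if } \theta D \le E \le D.
\end{cases}
\]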
When θ = 1 (resp. θ = 0), the TAL rule corresponds to the Constrained Equal
Awards rule (resp. the Constrained Equal Losses rule). The Talmud rule is the
TAL rule for θ = 1/2. As discussed previously, the 2-player Talmud rule appeared
in the Talmud and is referred to as the Contested Garment rule. The following
defines the 2-player TAL rule for any θ (and d1 ≤ d2 ) by
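\[
T^{\theta}((d_1, d_2), E) =
\begin{cases}
(E/2,\; E/2) & \text{if } E < 2\theta d_1, \\
(\theta d_1,\; E - \theta d_1) & \text{if } 2\theta d_1 \le E < (2\theta - 1)d_1 + d_2, \\
\bigl(\tfrac{E + d_1 - d_2}{2},\; \tfrac{E + d_2 - d_1}{2}\bigr) & \text{if } (2\theta - 1)d_1 + d_2 \le E \le d_1 + d_2.
\end{cases}
\]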
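As a sanity check on the two formulas, the sketch below implements the general
TAL rule and the 2-player closed form side by side. It is an illustration only:
the names cea, tal and tal_two_player are assumptions introduced here, and the
equal-losses step is computed by treating the losses as equal awards.

```python
from fractions import Fraction


def cea(claims, estate):
    """Constrained Equal Awards: equal awards, capped at each claim."""
    remaining = Fraction(estate)
    awards = [Fraction(0)] * len(claims)
    order = sorted(range(len(claims)), key=lambda i: claims[i])
    for pos, i in enumerate(order):
        share = remaining / (len(claims) - pos)
        awards[i] = min(Fraction(claims[i]), share)
        remaining -= awards[i]
    return awards


def tal(claims, estate, theta):
    """TAL rule with parameter theta in [0, 1] for any number of players."""
    theta, estate = Fraction(theta), Fraction(estate)
    low = [theta * Fraction(c) for c in claims]          # theta * d
    high = [(1 - theta) * Fraction(c) for c in claims]   # (1 - theta) * d
    if estate <= sum(low):
        return cea(low, estate)
    # Equal losses on E - theta*D for the claims (1 - theta) d.
    losses = cea(high, sum(high) - (estate - sum(low)))
    return [l + h - s for l, h, s in zip(low, high, losses)]


def tal_two_player(d1, d2, estate, theta):
    """Closed form for two players; assumes d1 <= d2."""
    d1, d2, estate, theta = (Fraction(v) for v in (d1, d2, estate, theta))
    if estate < 2 * theta * d1:
        return [estate / 2, estate / 2]
    if estate < (2 * theta - 1) * d1 + d2:
        return [theta * d1, estate - theta * d1]
    return [(estate + d1 - d2) / 2, (estate + d2 - d1) / 2]
```

With theta = 1/2 both functions reduce to the Contested Garment rule; for
example, tal_two_player(100, 300, 150, Fraction(1, 2)) returns (50, 100).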
Aumann and Maschler [2] used induction to show how the Talmud rule can be
calculated through a series of steps in which player 1 (owed the least amount) is
awarded an amount based on the 2-player rule between player 1 and the coalition
of players {2, . . . , n}, and then the player with the next least claim is awarded an
amount based on the 2-player rule between player 2 and the coalition of players
{3, . . . , n}, and so on. As described below, Moreno-Ternero [13] generalized this
induction process for the TAL rule T θ , calling it the θ-coalitional procedure. As-
suming that the (n − 1)-player T θ is known, then the n-player solution can be
divided into one of three cases.
(1) If E ≤ nθd1 , then assign equal awards to all players.
(2) If nθd1 < E < D − n(1 − θ)d1 , then divide E between player 1 and
C = {2, . . . , n} using the TAL rule T θ to solve the 2-player problem
((d1 , d2 + · · · + dn ), E), using the (n − 1)-player rule (which is assumed to
be known by the induction hypothesis) to divide the amount allocated to
coalition C between its members.
(3) If E ≥ D − n(1 − θ)d1 , then assign equal losses to all players.
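The three cases translate directly into a recursion. The following Python sketch
is illustrative rather than definitive: the function names are assumptions made
here, claims are assumed to be sorted in nondecreasing order, and the direct
formula tal is included only so the two computations can be compared.

```python
from fractions import Fraction


def cea(claims, estate):
    """Constrained Equal Awards: equal awards, capped at each claim."""
    remaining = Fraction(estate)
    awards = [Fraction(0)] * len(claims)
    order = sorted(range(len(claims)), key=lambda i: claims[i])
    for pos, i in enumerate(order):
        share = remaining / (len(claims) - pos)
        awards[i] = min(Fraction(claims[i]), share)
        remaining -= awards[i]
    return awards


def tal(claims, estate, theta):
    """Direct n-player TAL rule, used here only as a reference value."""
    theta, estate = Fraction(theta), Fraction(estate)
    low = [theta * Fraction(c) for c in claims]
    high = [(1 - theta) * Fraction(c) for c in claims]
    if estate <= sum(low):
        return cea(low, estate)
    losses = cea(high, sum(high) - (estate - sum(low)))
    return [l + h - s for l, h, s in zip(low, high, losses)]


def tal_two_player(d1, d2, estate, theta):
    """2-player TAL rule in closed form; assumes d1 <= d2."""
    if estate < 2 * theta * d1:
        return [estate / 2, estate / 2]
    if estate < (2 * theta - 1) * d1 + d2:
        return [theta * d1, estate - theta * d1]
    return [(estate + d1 - d2) / 2, (estate + d2 - d1) / 2]


def tal_coalitional(claims, estate, theta):
    """theta-coalitional procedure; claims must be sorted ascending."""
    claims = [Fraction(c) for c in claims]
    estate, theta = Fraction(estate), Fraction(theta)
    n = len(claims)
    if n == 1:
        return [estate]
    d1, total = claims[0], sum(claims)
    if estate <= n * theta * d1:                 # case (1): equal awards
        return [estate / n] * n
    if estate >= total - n * (1 - theta) * d1:   # case (3): equal losses
        loss = (total - estate) / n
        return [c - loss for c in claims]
    # case (2): player 1 against the coalition {2, ..., n}, then recurse
    first = tal_two_player(d1, total - d1, estate, theta)[0]
    return [first] + tal_coalitional(claims[1:], estate - first, theta)
```

For d = (100, 200, 300), E = 200 and theta = 1/2 the recursion gives player 1
the amount 50 and then splits the remaining 150 equally between players 2 and 3.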
The three cases can be visualized on a number line representing different estate
sizes 0 ≤ E ≤ D (see Figure 1). In Section 5, this coalitional approach to TAL
[Figure 1. Number line of estate sizes from 0 to D, with breakpoints at nθd1 and
D − n(1 − θ)d1 separating Cases (1), (2) and (3).]
Suppose that some subset of players take their allotted amounts and the remaining
players (which form a subset S ⊂ N) pool their allotments, which sum to
$\sum_{i \in S} C_i(d, E)$. If these players re-allocate the pooled sum among the
players in S using the same consistent rule C, then each player i would receive
the same amount $C_i(d, E)$. This holds for every bankruptcy problem and for all
subsets S.
The following notation is needed to provide a more formal definition of
consistency. For a subset S ⊆ N, d restricted to S is $d|_S$, the |S|-dimensional
vector $(d_{i_1}, d_{i_2}, \ldots, d_{i_{|S|}})$ where $i_k \in S$ and
$i_1 < i_2 < \cdots < i_{|S|}$. For an n-dimensional vector x, define
$x(S) = \sum_{i \in S} x_i$ to be the sum of the $x_i$'s for i ∈ S.
Definition 3.1. A bankruptcy rule C is consistent if, for any bankruptcy problem
(N, d, E) with C(d, E) = x, for all S ⊆ N and for all j ∈ S,
$C_j(d|_S, x(S)) = C_j(d, E)$.
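The definition can be tested numerically at a single problem by re-solving every
reduced problem. The sketch below does this for the Talmud rule; it is a spot
check under stated assumptions, not a proof of consistency, and the function
names (in particular is_consistent_at) are introduced here for illustration.

```python
from fractions import Fraction
from itertools import combinations


def cea(claims, estate):
    """Constrained Equal Awards: equal awards, capped at each claim."""
    remaining = Fraction(estate)
    awards = [Fraction(0)] * len(claims)
    order = sorted(range(len(claims)), key=lambda i: claims[i])
    for pos, i in enumerate(order):
        share = remaining / (len(claims) - pos)
        awards[i] = min(Fraction(claims[i]), share)
        remaining -= awards[i]
    return awards


def talmud(claims, estate):
    """Talmud rule: CEA on half-claims up to D/2, equal losses beyond."""
    half = [Fraction(c) / 2 for c in claims]
    estate = Fraction(estate)
    if estate <= sum(half):
        return cea(half, estate)
    # Beyond D/2 the losses d_i - x_i are equal awards on the half-claims.
    losses = cea(half, 2 * sum(half) - estate)
    return [2 * h - l for h, l in zip(half, losses)]


def is_consistent_at(rule, claims, estate):
    """Check C_j(d|S, x(S)) == C_j(d, E) for every nonempty S and j in S."""
    x = rule(claims, estate)
    n = len(claims)
    for size in range(1, n + 1):
        for S in combinations(range(n), size):
            sub = rule([claims[i] for i in S], sum(x[i] for i in S))
            if any(sub[pos] != x[i] for pos, i in enumerate(S)):
                return False
    return True
```

A call such as is_consistent_at(talmud, [100, 200, 300], 200) returns True;
passing in a different rule lets the same check expose an inconsistent one.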
Although the previously defined rules are all consistent, not all bankruptcy
rules are. The Minimal Overlap rule, originally introduced by O’Neill [16], fails to
satisfy consistency. Others have worked to characterize and to more easily compute
the Minimal Overlap rule. For example, Alcalde et al. [1] showed that the Minimal
Overlap rule is a composition of Ibn Ezra’s rule (originally considered by O’Neill
[16]) and the Constrained Equal Losses rule. They also provided an axiomatic
characterization of the rule. We consider an example and then give a more formal
way to determine a general allocation under the Minimal Overlap rule.
Example 3.2 (An inconsistent bankruptcy rule, part 1). Let d = (100, 200, 300)
and E = 300, as in the third column of Table 1. We apply the rule presented by
Ibn Ezra, as explained by O’Neill in [16]. (For this example, Ibn Ezra’s method
coincides with the Minimal Overlap rule.) As Ibn Ezra conceived it, each player
“lays claim” to a specific portion of the estate, as visualized by the overlapping
line segments on the interval E (see Figure 2). Player 3 lays claim to the entire
interval or estate, while player 2 lays claim to 2/3 of the estate and player 1 lays
claim to 1/3. As shown in the figure, the interval [0, 100] is claimed by all players,
so each receives 1/3 of the interval corresponding to an equal amount of 100/3.
Only players 2 and 3 claim [100, 200], so they divide it evenly, receiving 100/2
each. Finally, player 3 receives the remaining interval [200, 300], worth 100. It
follows that the allocation under Ibn Ezra’s method (or the Minimal Overlap rule)
is M (d, E) = (100/3, 250/3, 550/3).
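The interval-sharing computation in this example is easy to mechanize. The
sketch below is a minimal illustration; the function name ibn_ezra is an
assumption made here, and the code assumes (as in the example) that the largest
claim is at least the estate, so that larger claims are capped at E.

```python
from fractions import Fraction


def ibn_ezra(claims, estate):
    """Share each piece of [0, E] equally among the players who claim it.

    Assumes max(claims) >= estate; player i claims [0, min(d_i, E)].
    """
    estate = Fraction(estate)
    n = len(claims)
    capped = [min(Fraction(c), estate) for c in claims]
    order = sorted(range(n), key=lambda i: capped[i])
    awards = [Fraction(0)] * n
    previous = Fraction(0)
    for pos, i in enumerate(order):
        # [previous, capped[i]] is claimed by player i and by every player
        # with a larger claim: n - pos players in total.
        piece = (capped[i] - previous) / (n - pos)
        for j in order[pos:]:
            awards[j] += piece
        previous = capped[i]
    return awards
```

For d = (100, 200, 300) and E = 300 the function returns
(100/3, 250/3, 550/3), matching the allocation computed by hand.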
The Minimal Overlap rule specifies that each player receive an equal share of all
intervals to which the player lays claim (after distributing the intervals to minimize
the overlap). Each player’s allocation is equal to the sum of his or her shares. The
Minimal Overlap rule differs from Ibn Ezra’s method when no player i has a claim
di ≥ E. It also specifies that if a player i has di > E, then the claim is reduced to
the entire estate E. These assumptions are described in the following proposition
that simplifies the Minimal Overlap rule.
[Figure 2. The claims d1 = 100, d2 = 200 and d3 = 300 drawn as overlapping line
segments on the estate interval [0, 300].]
[Figure 3. Claims of 200 and 300 drawn as overlapping line segments on the estate
interval [0, 800/3].]
Proposition 3.3 (Chun and Thomson [4]). Up to relabeling parts of the
amount available, there is a unique arrangement of claims achieving minimal over-
lap. It is obtained as follows:
(1) If there is some player j such that dj ≥ E, then each player i ∈ N such
that di ≥ E claims [0, E] and each other player k claims [0, dk ].
(2) If E ≥ di for each player i, then there is a unique t ∈ [0, E] such that:
(a) each player i ∈ N such that di ≥ t claims [0, t] as well as part of
[t, E] of size di − t with no overlap between claims; and
(b) each player k such that t > dk claims [0, dk ].
Example 3.4 (An inconsistent bankruptcy rule, part 2). We use Proposition
3.3 to show that the Minimal Overlap rule is not consistent. Consider the bank-
ruptcy problem in Example 3.2, and suppose that players 2 and 3 are to divide
800/3 = 250/3 + 550/3 (the sum of the amounts awarded to players 2 and 3 under
the Minimal Overlap rule for the problem (d, E)). Player 3’s claim of 300 exceeds
800/3 (see Figure 3). Hence player 3 claims the entire interval [0, 800/3] while
player 2 claims [0, 200]. Player 2 receives 200/2 while player 3 receives 200/2 plus
all of the remaining estate 800/3−200 = 200/3 for a total of 200/2+200/3 = 500/3.
Since these amounts differ from the amounts players 2 and 3 received in Example
3.2, this demonstrates that the Minimal Overlap rule is not consistent.
In the following example, we compute the Minimal Overlap allocation for the
bankruptcy problem in Example 3.1. This provides an instance in which no player
lays claim to the entire estate.
[Figure 4. Minimally overlapping claims on the estate interval [0, 700] for
d = (100, 200, 300, 400): player 4 claims [0, 400], player 3 claims [0, 100] and
[400, 600], player 2 claims [0, 100] and [600, 700], and player 1 claims [0, 100].]
Example 3.5. For d = (100, 200, 300, 400) and E = 700, the Minimal Overlap
rule allocation is (100/4, 100/4+100, 100/4+200, 100/4+300) = (25, 125, 225, 325).
For this problem, t = 100, as t satisfies
t + (100 − t) + (200 − t) + (300 − t) + (400 − t) = 700.
The resulting minimally overlapping intervals appear in Figure 4. Each player
evenly divides [0, t] with players 2-4 each receiving uncontested intervals, too.
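Proposition 3.3 suggests a direct way to compute the rule: find t, share [0, t]
as in Ibn Ezra's method, and add the uncontested pieces. The sketch below is an
illustration only; the helper names find_t and minimal_overlap are assumptions
introduced here, and the code assumes at least one claimant and an estate no
larger than the sum of the claims.

```python
from fractions import Fraction


def share_intervals(caps):
    """Player i claims [0, caps[i]]; split each piece among its claimants."""
    n = len(caps)
    order = sorted(range(n), key=lambda i: caps[i])
    awards, previous = [Fraction(0)] * n, Fraction(0)
    for pos, i in enumerate(order):
        piece = (caps[i] - previous) / (n - pos)
        for j in order[pos:]:
            awards[j] += piece
        previous = caps[i]
    return awards


def find_t(claims, estate):
    """Solve t + sum(max(d_i - t, 0)) = E when every claim is at most E."""
    s = sorted(claims)
    lo = Fraction(0)
    for k in range(len(s) - 1):
        hi = s[k]
        m = len(s) - k  # players whose claim is at least t on [lo, hi]
        t = (sum(s[k:]) - estate) / (m - 1)
        if lo <= t <= hi:
            return t
        lo = hi
    raise ValueError("no t found; need max(claims) <= estate <= sum(claims)")


def minimal_overlap(claims, estate):
    estate = Fraction(estate)
    # Claims larger than the estate are reduced to the estate.
    claims = [min(Fraction(c), estate) for c in claims]
    t = estate if max(claims) >= estate else find_t(claims, estate)
    base = share_intervals([min(c, t) for c in claims])
    # Players with d_i >= t also keep an uncontested piece of size d_i - t.
    return [b + max(c - t, 0) for b, c in zip(base, claims)]
```

For d = (100, 200, 300, 400) and E = 700 this finds t = 100 and returns
(25, 125, 225, 325), as in the example above.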
where $R_i(d|_S, x^m(S))$ is player i’s allocation when the bankruptcy rule R is
applied to the players in S according to their initial claims and with estate size
$x^m(S) = \sum_{j \in S} x_j^m$.
A bankruptcy rule requires that D = d1 + · · · + dn ≥ E. However, so that the
k-averaging dynamic process is defined on the entire n-simplex, R must be extended
for situations in which xm (S) > d(S). To do so, we extend R so that each player
is allotted his or her claim and the excess is divided among the players in a fixed
way which may depend on the original rule.
Proposition 4.1. The consistent solution C(d, E) for a consistent bankruptcy
rule C is a fixed point of the k-averaging dynamic.
Proof. Let the n-player bankruptcy problem be defined by d and E. The
solution to the bankruptcy problem under a consistent rule C is C(d, E). The
Since C is consistent, Ci (d|S , C(d, E)(S)) = Ci (d, E) for all players i ∈ S. Thus,
substituting xm = C(d, E) into the equation is enough to show that the n-player
allocation under C is a fixed point of the dynamic process:
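\[
\begin{aligned}
F_C(C(d, E)) &= \frac{1}{\binom{n-1}{k-1}} \Bigl( \sum_{\substack{S \in P_k \\ 1 \in S}} C_1(d|_S, C(d, E)(S)),\ \ldots,\ \sum_{\substack{S \in P_k \\ n \in S}} C_n(d|_S, C(d, E)(S)) \Bigr) \\
&= \frac{1}{\binom{n-1}{k-1}} \Bigl( \sum_{\substack{S \in P_k \\ 1 \in S}} C_1(d, E),\ \ldots,\ \sum_{\substack{S \in P_k \\ n \in S}} C_n(d, E) \Bigr) \\
&= C(d, E).
\end{aligned}
\]
The last equality holds because each player belongs to exactly
$\binom{n-1}{k-1}$ of the k-player subsets in $P_k$.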
The following example demonstrates that the k-averaging dynamics for an in-
consistent rule may not have the fixed point property.
Example 4.2 (Example 3.2 continued). Recall that for the Minimal Overlap
rule, if d = (100, 200, 300) and E = 300, then M (d, E) = (100/3, 250/3, 550/3).
Consider k = 2, so that the dynamics average over all pairs. From Example
3.4, the Minimal Overlap rule is not consistent and if players 2 and 3 re-allocate
250/3 + 550/3 = 800/3 using the Minimal Overlap rule, then the players receive
100 and 500/3, respectively. This is one-third of the calculations for the pairwise
averaging dynamics. The other two calculations involve allocating 100/3 + 250/3
among players 1 and 2 and allocating 100/3 + 550/3 among players 1 and 3. These
allocations are 50 and 200/3 to players 1 and 2, respectively, and 50 and 500/3
to players 1 and 3, respectively. It follows that the allocation under the Minimal
Overlap rule is not a fixed point of the pairwise averaging dynamics because
\[
\begin{aligned}
F_M(M(d, E)) &= \tfrac{1}{2} \bigl[ (0, 300/3, 500/3) + (150/3, 200/3, 0) + (150/3, 0, 500/3) \bigr] \\
&= (150/3, 250/3, 500/3) \\
&\ne (100/3, 250/3, 550/3) = M(d, E).
\end{aligned}
\]
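The pairwise computation in this example can be reproduced with a short script.
This is a sketch under stated assumptions: the names overlap_rule and
averaging_step are introduced here, and overlap_rule covers only the case used
in the example, where some claim in each pair is at least the pooled amount.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def overlap_rule(claims, estate):
    """Minimal Overlap rule, restricted to the case max(claims) >= estate."""
    estate = Fraction(estate)
    if max(claims) < estate:
        raise ValueError("this sketch only handles max(claims) >= estate")
    n = len(claims)
    capped = [min(Fraction(c), estate) for c in claims]
    order = sorted(range(n), key=lambda i: capped[i])
    awards, previous = [Fraction(0)] * n, Fraction(0)
    for pos, i in enumerate(order):
        piece = (capped[i] - previous) / (n - pos)
        for j in order[pos:]:
            awards[j] += piece
        previous = capped[i]
    return awards


def averaging_step(rule, claims, x, k):
    """One round of the k-averaging dynamic applied to the allocation x."""
    n = len(claims)
    total = [Fraction(0)] * n
    for S in combinations(range(n), k):
        # Players in S pool their current amounts and re-divide them.
        sub = rule([claims[i] for i in S], sum(x[i] for i in S))
        for i, amount in zip(S, sub):
            total[i] += amount
    # Each player belongs to C(n-1, k-1) of the k-player subsets.
    return [t / comb(n - 1, k - 1) for t in total]
```

Starting from M = overlap_rule([100, 200, 300], 300), one call to
averaging_step(overlap_rule, [100, 200, 300], M, 2) returns an allocation that
still sums to 300 but gives player 1 the amount 50 rather than 100/3, so M is
not a fixed point.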
The motivation for studying the k-averaging dynamic process is to see if the
n-player allocation of a consistent rule can be viewed as an outcome of an evo-
lutionary process. The above proposition indicates that a consistent bankruptcy
rule’s solution is a fixed point of the dynamic process and, hence, once reached
is stable. The next question is whether or not the n-player allocation is also an
attractive fixed point, meaning that all initial allocations in the simplex converge
to the allocation under the dynamic process defined by the consistent rule. We
motivate a more general analysis by first considering the Proportional rule.
Under the k-player proportional rule, player i receives
$P_i(d|_S, x^m(S)) = \frac{d_i}{d(S)}\, x^m(S)$ at round m + 1 where S is a
k-player set containing i. This rule is still well-defined in the case
$x^m(S)$ exceeds d(S).
Proposition 4.3. P (d, E) is the unique attractive fixed point for the k-aver-
aging dynamic process FPk .
Proof. The k-averaging dynamic process FPk can be defined by matrix multi-
plication so that FPk (xm ) is equal to
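\[
\begin{aligned}
&\frac{1}{\binom{n-1}{k-1}} \Bigl( \sum_{\substack{S \in P_k \\ 1 \in S}} P_1(d|_S, x^m(S)),\ \sum_{\substack{S \in P_k \\ 2 \in S}} P_2(d|_S, x^m(S)),\ \ldots,\ \sum_{\substack{S \in P_k \\ n \in S}} P_n(d|_S, x^m(S)) \Bigr) \\
&\quad = \frac{1}{\binom{n-1}{k-1}}
\begin{bmatrix}
\sum_{1 \in S} \frac{d_1}{d(S)} & \sum_{1,2 \in S} \frac{d_1}{d(S)} & \sum_{1,3 \in S} \frac{d_1}{d(S)} & \cdots & \sum_{1,n \in S} \frac{d_1}{d(S)} \\
\sum_{1,2 \in S} \frac{d_2}{d(S)} & \sum_{2 \in S} \frac{d_2}{d(S)} & \sum_{2,3 \in S} \frac{d_2}{d(S)} & \cdots & \sum_{2,n \in S} \frac{d_2}{d(S)} \\
\sum_{1,3 \in S} \frac{d_3}{d(S)} & \sum_{2,3 \in S} \frac{d_3}{d(S)} & \sum_{3 \in S} \frac{d_3}{d(S)} & \cdots & \sum_{3,n \in S} \frac{d_3}{d(S)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sum_{1,n \in S} \frac{d_n}{d(S)} & \sum_{2,n \in S} \frac{d_n}{d(S)} & \sum_{3,n \in S} \frac{d_n}{d(S)} & \cdots & \sum_{n \in S} \frac{d_n}{d(S)}
\end{bmatrix} x^m \\
&\quad = \frac{1}{\binom{n-1}{k-1}} M x^m,
\end{aligned}
\]
where each sum in the matrix runs over the sets $S \in P_k$ that contain the
indicated players.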
Because each entry of the matrix $M/\binom{n-1}{k-1}$ is positive and the columns
of $M/\binom{n-1}{k-1}$ sum to 1, $M/\binom{n-1}{k-1}$ is a (column) stochastic
matrix. By Perron-Frobenius theory, $M/\binom{n-1}{k-1}$ has a unique largest (in
modulus) eigenvalue of 1. The associated eigenvector is the unique fixed point of
the map $F_P^k$; by Proposition 4.1, the eigenvector is P(d, E).
The eigenvector P(d, E) is an attractive fixed point because repeated
multiplication by $M/\binom{n-1}{k-1}$ can be written in terms of its (possibly
complex) eigenvalues $\lambda_1 = 1, \lambda_2, \ldots, \lambda_n$ and their
associated eigenvectors $v_1 = P(d, E), v_2, \ldots, v_n$ where
$\lambda_1 > |\lambda_2| \ge |\lambda_3| \ge \cdots \ge |\lambda_n| > 0$. Because
$\lim_{m \to \infty} \lambda_i^m = 0$ for $i \ne 1$, it follows that
$\lim_{m \to \infty} (F_P^k)^m(x^0) = P(d, E)$, converging to the dominant
eigenvalue/eigenvector pair, for any $x^0$ in the n-simplex.
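The convergence can also be observed numerically by iterating the map directly
instead of forming M. The sketch below is illustrative; the function names and
the starting allocation are assumptions chosen here, and floating-point
arithmetic is used because the limit is only reached asymptotically.

```python
from itertools import combinations
from math import comb


def proportional(claims, estate):
    """Proportional rule: player i receives d_i / d(S) of the estate."""
    total = sum(claims)
    return [c * estate / total for c in claims]


def averaging_step(rule, claims, x, k):
    """One round of the k-averaging dynamic applied to the allocation x."""
    n = len(claims)
    total = [0.0] * n
    for S in combinations(range(n), k):
        sub = rule([claims[i] for i in S], sum(x[i] for i in S))
        for i, amount in zip(S, sub):
            total[i] += amount
    # Each player belongs to C(n-1, k-1) of the k-player subsets.
    return [t / comb(n - 1, k - 1) for t in total]


def iterate(rule, claims, x0, k, rounds):
    """Apply the averaging step the given number of times."""
    x = list(x0)
    for _ in range(rounds):
        x = averaging_step(rule, claims, x, k)
    return x
```

For d = (100, 200, 300) and E = 300, starting from x0 = (300, 0, 0) with k = 2,
iterate(proportional, [100, 200, 300], [300.0, 0.0, 0.0], 2, 200) approaches the
proportional allocation (50, 100, 150).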
Moving from the Proportional rule, we now consider convergence for more
general, consistent rules. The following proposition and theorem are adapted from
Dagan and Volij [5], who consider a pairwise averaging dynamic where k = 2.
These results indicate that the n-player solution for a consistent bankruptcy rule
is the unique, attractive fixed point of the k-averaging process, as long as the rule
is monotonic in the estate size. Formally, a bankruptcy rule R is monotonic in the
estate size if $E \le E'$ implies $R_i(d, E) \le R_i(d, E')$ for all players i. This relatively
mild condition is met by all the bankruptcy rules discussed up to this point. In
particular, it is true for the Proportional rule and the set of TAL rules. Because
we want the dynamic process to be defined on the entire simplex, there may be
instances in which xm (S) exceeds d(S); these are cases in which the bankruptcy
rule is not defined. The convergence results are still applicable as long as the rule
is extended to these situations in such a way that the rule is still monotonic in the
estate size. One way to handle this is to award 1/|S| of the excess to each player
in S.
Proposition 4.4. If C is a monotonic bankruptcy rule, then given any two
allocations $x^m \ne y^m$, $|x^{m+1} - y^{m+1}| < |x^m - y^m|$ for all m, where
|z| is defined by $|z| = \sum_i |z_i|$.
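Proof. Note that
\[
x_i^{m+1} - y_i^{m+1} = \frac{1}{\binom{n-1}{k-1}} \sum_{\substack{S \in P_k \\ i \in S}} \bigl[ C_i(d|_S, x^m(S)) - C_i(d|_S, y^m(S)) \bigr].
\]
It follows that
\[
\begin{aligned}
|x^{m+1} - y^{m+1}| &\le \frac{1}{\binom{n-1}{k-1}} \sum_i \sum_{\substack{S \in P_k \\ i \in S}} \bigl| C_i(d|_S, x^m(S)) - C_i(d|_S, y^m(S)) \bigr| \\
&\le \frac{1}{\binom{n-1}{k-1}} \sum_{S \in P_k} \sum_{i \in S} \bigl| C_i(d|_S, x^m(S)) - C_i(d|_S, y^m(S)) \bigr|.
\end{aligned}
\]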
But for fixed S, the relative sizes of the estates xm (S) and ym (S) are fixed, so
by monotonicity, the signs of Ci (d|S , xm (S)) − Ci (d|S , ym (S)) are the same for all
i ∈ S. Thus the absolute value signs can be moved outside the sum, giving
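\[
\begin{aligned}
|x^{m+1} - y^{m+1}| &\le \frac{1}{\binom{n-1}{k-1}} \sum_{S \in P_k} \Bigl| \sum_{i \in S} \bigl[ C_i(d|_S, x^m(S)) - C_i(d|_S, y^m(S)) \bigr] \Bigr| \\
&= \frac{1}{\binom{n-1}{k-1}} \sum_{S \in P_k} |x^m(S) - y^m(S)| \\
&\le \frac{1}{\binom{n-1}{k-1}} \sum_{S \in P_k} \sum_{i \in S} |x_i^m - y_i^m| = \sum_i |x_i^m - y_i^m|.
\end{aligned}
\]
The inequality is strict: since $x^m \ne y^m$ have the same total, there are
players i and j whose differences $x_i^m - y_i^m$ and $x_j^m - y_j^m$ have
opposite signs, and for any S containing both,
\[
|x^m(S) - y^m(S)| < |x_i^m - y_i^m| + |x_j^m - y_j^m| + |x^m(S \setminus \{i, j\}) - y^m(S \setminus \{i, j\})| \le \sum_{l \in S} |x_l^m - y_l^m|.
\]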
A consequence of Proposition 4.4 is that for any given bankruptcy problem, FCk
has a unique fixed point, C(d, E). Further, the fixed point is attractive, as proved
below.
Theorem 4.5. C(d, E) is the unique attractive fixed point for the k-averaging
dynamic process FCk .
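A small simulation illustrates the theorem for the Talmud rule with k = 2. The
sketch below makes several assumptions that are not taken from the text: the
function names, the starting allocation, the use of floating-point arithmetic,
and the particular extension to pooled amounts exceeding the pooled claims
(each player in S receives the claim plus 1/|S| of the excess).

```python
from itertools import combinations
from math import comb


def cea(claims, estate):
    """Constrained Equal Awards: equal awards, capped at each claim."""
    n = len(claims)
    awards, remaining = [0.0] * n, float(estate)
    order = sorted(range(n), key=lambda i: claims[i])
    for pos, i in enumerate(order):
        awards[i] = min(claims[i], remaining / (n - pos))
        remaining -= awards[i]
    return awards


def talmud(claims, estate):
    """Talmud rule, extended monotonically when estate > sum(claims)."""
    excess = estate - sum(claims)
    if excess > 0:
        return [c + excess / len(claims) for c in claims]
    half = [c / 2 for c in claims]
    if estate <= sum(half):
        return cea(half, estate)
    losses = cea(half, sum(claims) - estate)
    return [c - l for c, l in zip(claims, losses)]


def averaging_step(rule, claims, x, k):
    """One round of the k-averaging dynamic applied to the allocation x."""
    n = len(claims)
    total = [0.0] * n
    for S in combinations(range(n), k):
        sub = rule([claims[i] for i in S], sum(x[i] for i in S))
        for i, amount in zip(S, sub):
            total[i] += amount
    return [t / comb(n - 1, k - 1) for t in total]


def run(rule, claims, x0, k, rounds):
    """Return the list of allocations x^0, x^1, ..., x^rounds."""
    path = [list(x0)]
    for _ in range(rounds):
        path.append(averaging_step(rule, claims, path[-1], k))
    return path
```

For d = (100, 200, 300) and E = 200, the path started at (0, 200, 0) moves
toward the Talmud allocation (50, 75, 75), with the remaining error roughly
halving from one round to the next once player 1 has reached 50.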
[Figure 5. Three stacked number lines of estate sizes from 0 to D. Top: intervals
$I_1$, $I_2$, $I_3$ separated by the breakpoints $n\theta d_1$ and
$D - n(1-\theta)d_1$. Middle: intervals $I_1^2$, $I_2^2$, $I_3^2$ separated by the
breakpoints α and δ. Bottom: intervals $I_1^3$, $I_2^3$, $I_3^3$ separated by the
breakpoints β and γ.]
where
\[
\begin{aligned}
\alpha &= a_1 \theta d_1 + (n - a_1)\theta d_{i_2}, \\
\beta &= a_1 \theta d_1 + a_2 \theta d_{i_2} + (n - a_1 - a_2)\theta d_{i_3}, \\
\gamma &= D - \bigl[ a_1 (1-\theta) d_1 + a_2 (1-\theta) d_{i_2} + (n - a_1 - a_2)(1-\theta) d_{i_3} \bigr], \\
\delta &= D - \bigl[ a_1 (1-\theta) d_1 + (n - a_1)(1-\theta) d_{i_2} \bigr].
\end{aligned}
\]
The relationship between Figure 5 and the θ-coalitional procedure is most easily
described when the claims are distinct and Ai = i for i = 1, . . . , n. Using the num-
bering system from the earlier discussion of this procedure, the top-most picture
corresponds to Step 1 in which E is allocated between player 1 and the coalition
{2, . . . , n}; the intervals I1 , I2 and I3 correspond to Cases (1), (2) and (3) respec-
tively. In I1 players share equal awards; in I3 players share equal losses; in I2 player
1 receives θd1 and the coalition {2, . . . , n} receives the remainder. The second pic-
ture corresponds to Step 2 in which the total allocation for players {2, . . . , n} in
Step 1 is reallocated between player 2 and the coalition {3, . . . , n}; again, the
intervals $I_1^2$, $I_2^2$ and $I_3^2$ correspond to Cases (1), (2) and (3). In
$I_1^2$ all players in the coalition share equal awards; in $I_3^2$ all players in
the coalition share equal losses; and in $I_2^2$ player 2 receives θd2 and the
coalition {3, . . . , n} receives the remainder.
The third picture corresponds to Step 3, and so on. If the claims are not distinct
then the top-most picture applies to the first a1 steps, in which the allocation of
all players l ∈ A1 is determined. In I1 all players share equal awards; in I3 all
players share equal losses; in I2 , all players l ∈ A1 receive θd1 and the coalition of
all players p ∉ A1 receives the remainder.
Returning to the question of finite versus asymptotic convergence, suppose the
claims are distinct. We claim that if E ∈ I2 , then the allocations $x_1^m$ converge
Before beginning the proof of the theorem, we note the following. Suppose that
$d_i \le d_j$ and $x_i^M \le x_j^M$ for some M. Then it is easy to show that
$x_i^{M+1} \le x_j^{M+1}$, and hence $x_i^m \le x_j^m$ for all m ≥ M. So as the
iteration proceeds, if the ordering of the players’ allocations changes, it only
changes in such a way as to decrease the “lexicographic” order of the indices.
(For instance if $x_1^m \le x_2^m \le x_4^m \le x_3^m$ then in the next iteration,
either the ordering stays the same or
$x_1^{m+1} \le x_2^{m+1} \le x_3^{m+1} \le x_4^{m+1}$.) Similar remarks apply to
the $d_i - x_i^m$’s.
Thus, whatever the order of the initial allocation, there exists an M such that
the order of both the $x_i^m$’s (the “wins”) and the $d_i - x_i^m$’s (the
“losses”) is fixed for all m ≥ M. Denote this order by
$x_{i_1}^m \le x_{i_2}^m \le \cdots \le x_{i_n}^m$ and
$d_{j_1} - x_{j_1}^m \le d_{j_2} - x_{j_2}^m \le \cdots \le d_{j_n} - x_{j_n}^m$
for all m ≥ M where the $i_k$ and $j_k$ are chosen inductively in the
$d_k - x_k^m$ and $x_j^m \ge \theta d_j$ then $T(j, k, m) \le x_j^m$.
Lemma 5.5. (a) Suppose that $x_q^M \le \theta d_1 - \delta$ for some q, M and
δ > 0. If $x_p^M \ge \theta d_1$ for some player p, then
$T(p, q, M) \le x_p^M - \delta/2$. (b) If $x_q^M = \theta d_1$ and
$x_p^M \le \theta d_1 - \lambda$ for some λ > 0, then
$T(p, q, M) \ge x_p^M + \lambda/2$. (c) If $x_q^M \ge \theta d_1 + \delta$ for
some δ > 0 and $x_p^M \le \theta d_1 + \lambda$ for some
$\lambda \le \beta = \min\{\delta/2,\ \theta(d_{i_2} - d_1)\}$ and player
$p \notin A_1$, then $T(p, q, M) \ge x_p^M + \beta/2$.
Moving to the proof of the Theorem, we note that the intent of the following
propositions is similar: to show that, except when E lies on a boundary of one of
the intervals listed in Figure 5, after a finite number of iterations, the rule that is
applied to determine the value of T (i, j, m) for each pair of players i and j remains
constant. That is, there exists an M such that for each i, j with di ≤ dj , either
$T(i, j, m) = (x_i^m + x_j^m)/2$ for all m ≥ M, $T(i, j, m) = x_i^m + x_j^m - \theta d_i$ for all m ≥ M
Stage 1
Proposition 5.6. If $E < n\theta d_1$ then $x_i^m$ converges to E/n for all players i.
respectively. Hence
\[
x_{j_1}^{M+1} \le \frac{1}{n-1} \Bigl[ \sum_{p \ne j_1, i_1} x_{j_1}^M + \bigl( x_{j_1}^M - \delta/2 \bigr) \Bigr] = x_{j_1}^M - \frac{\delta}{2(n-1)} = d_{j_1} - (1-\theta)d_1 + \lambda - \frac{\delta}{2(n-1)},
\]
or $d_{j_1} - x_{j_1}^{M+1} \ge (1-\theta)d_1 - \lambda + \frac{\delta}{2(n-1)}$.
Since the quantity $\frac{\delta}{2(n-1)}$ is fixed, we see that by repeating the
process, we can find an $M_1$ such that
$d_{j_1} - x_{j_1}^{M_1} \ge (1-\theta)d_1$, and hence
$d_p - x_p^{M_1} \ge (1-\theta)d_1$ for all players p. It is easy to show that this remains
then $T(p, l, m) \le x_p^m$.
Step 3: By repeating the argument in Steps 1 and 2, we can show that for
each k ≤ r there exists an $M_k \ge M_{k-1}$ such that
$x_p^m \le d_p - (1-\theta)d_{i_k}$ for all players
$p \notin (A_1 \cup \cdots \cup A_{k-1})$ and $m \ge M_k$. (This implies
$x_l^m \le \theta d_{i_k}$ for all $l \in A_k$ and $m \ge M_k$.) Thus, we obtain
finally, an $M_{r-1}$ such that $x_p^m \le d_r - (1-\theta)d_{r-1}$ for all
$p \in A_r$ and $x_l^m \le \theta d_l$ for all $l \in A_k$, k ≤ r − 1, and for all
$m \ge M_{r-1}$.
Step 4: We claim there exists an $M \ge M_{r-1}$ such that
$x_p^m \le \theta d_1$ for all players p and m ≥ M. Suppose that
$x_{i_n}^M > \theta d_1$ for some $M \ge M_{r-1}$, and let
$x_{i_n}^M = \theta d_1 + \lambda$. By Lemma 5.5,
$T(i_n, i_1, M) \le x_{i_n}^M - \delta/2$. Now consider the value of
$x_{i_n}^M$. If $d_{i_n} > d_p$ then
$x_{i_n}^M + x_p^M \le d_{i_n} + (2\theta - 1)d_p$, and so either $T(i_n, p, M) =$
\[
x_{i_n}^{M+1} \le \frac{1}{n-1} \bigl[ (n-2)x_{i_n}^M + (x_{i_n}^M - \delta/2) \bigr] \le x_{i_n}^M - \frac{\delta}{2(n-1)} = \theta d_1 + \lambda - \frac{\delta}{2(n-1)}.
\]
Since $\frac{\delta}{2(n-1)}$ is a fixed quantity, we see that by repeating the
process, we can find an M that satisfies the claim.
Step 5: By Step 4, $x_p^m \le \theta d_1$ and hence
$T(p, q, m) = (x_p^m + x_q^m)/2$ for all players p, q and m ≥ M. Thus
\[
x_p^{m+1} = \frac{1}{n-1} \sum_{q \ne p} \Bigl[ \frac{x_p^m + x_q^m}{2} \Bigr] = \frac{1}{2} x_p^m + \frac{1}{2(n-1)} \sum_{q \ne p} x_q^m.
\]
Converting these n equations into matrix format we see that the coefficient matrix
forms a stochastic matrix, like the one in Proposition 4.3 for the Proportional rule.
It has a unique attractive fixed point at $x_p = E/n$ for all players p.
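The linear map in Step 5 can be iterated directly to see the convergence to
E/n. The sketch below is illustrative; the function name equal_awards_step and
the starting allocation are assumptions chosen here. Subtracting E/n from both
sides of the update shows that each deviation $x_p^m - E/n$ is multiplied by
(n − 2)/(2(n − 1)) per round.

```python
def equal_awards_step(x):
    """x_p <- x_p / 2 + (sum of the other players' amounts) / (2 (n - 1))."""
    n, total = len(x), sum(x)
    return [xp / 2 + (total - xp) / (2 * (n - 1)) for xp in x]


def iterate(x0, rounds):
    """Apply the Step 5 update the given number of times."""
    x = list(x0)
    for _ in range(rounds):
        x = equal_awards_step(x)
    return x
```

With n = 4 and x0 = (120, 60, 30, 90), so that E = 300, one round moves player 1
from 120 to 90 (the deviation from 75 shrinks from 45 to 15), and repeated
rounds bring every coordinate to 75.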
Proposition 5.7. Suppose that $E > \sum_p [d_p - (1-\theta)d_1]$. Then $x_p^m$
converges to $d_p - (1-\theta)d_1 + \bigl[E - \sum_p (d_p - (1-\theta)d_1)\bigr]/n$
for all players p.
Proof. The proof is analogous to that of Proposition 5.6.
Proposition 5.8. Suppose that
$n\theta d_1 \le E \le \sum_p [d_p - (1-\theta)d_1]$. Then for every ε > 0 there
exists an M such that
$\theta d_1 - \epsilon \le x_p^m \le d_p - (1-\theta)d_1 + \epsilon$ for all
m ≥ M and players p.
Proof. Note that $E \ge n\theta d_1$ implies $x_{i_n}^m \ge \theta d_1$ for all m.
Step 1: We claim for every ε > 0 there exists an $M_1$ such that
$x_p^m \ge \theta d_1 - \epsilon$ for all players p and $m \ge M_1$. Suppose that
$x_{i_1}^M < \theta d_1$ for some M and let $x_{i_1}^M = \theta d_1 - \lambda$ for
some λ > 0. By Lemmas 5.4 and 5.5, $T(i_1, i_n, M) \ge x_{i_1}^M + \lambda/2$ and
$T(i_1, p, M) \ge x_{i_1}^M$ for all players $p \ne i_1$ or $i_n$. Hence
\[
x_{i_1}^{M+1} \ge \frac{1}{n-1} \bigl[ (n-1)x_{i_1}^M + \lambda/2 \bigr] = x_{i_1}^M + \frac{1}{2(n-1)} \lambda = \theta d_1 - \Bigl[ 1 - \frac{1}{2(n-1)} \Bigr] \lambda.
\]
By repeating the argument, we can show that
$x_{i_1}^{M+2} \ge \theta d_1 - c^2 \lambda$ and in general that
$x_{i_1}^{M+t} \ge \theta d_1 - c^t \lambda$ where $c = 1 - \frac{1}{2(n-1)} < 1$.
This proves the claim.
Step 2: A similar argument can be made to show that for every ε > 0 there
exists an $M_2 \ge M_1$ such that $x_p^m \le d_p - (1-\theta)d_1 + \epsilon$ for
all players p and $m \ge M_2$.
for all players p. Since ε was arbitrary, $x_p^m$ converges to $\theta d_1$ for
every player p.
(b) The proof is analogous to (a).
(c) Let δ > 0 be such that $E = n(\theta d_1 + \delta)$, so
$x_{i_n}^m \ge \theta d_1 + \delta$ for every m. We claim that there exists an M
such that $x_l^m + x_p^m \ge 2\theta d_1$ for all $l \in A_1$, $p \notin A_1$ and
m ≥ M. Note that this is trivial if θ = 0, so assume θ > 0. By the Corollary, we
can pick M such that $\theta d_1 - \epsilon \le x_l^m < \theta d_1 + \epsilon$ for
all $l \in A_1$ and m ≥ M, where ε > 0 is chosen below. It is easy to show that
this implies $x_p^m - 2\epsilon \le T(p, l, m) < x_p^m + 2\epsilon$
and $p \notin A_1$, for sufficiently large m.
Stage 2
Proposition 5.11. Suppose $n\theta d_1 < E < \sum_p [d_p - (1-\theta)d_1]$ and
$a_1 = |A_1|$. (a) If $n\theta d_1 < E < a_1 \theta d_1 + (n - a_1)\theta d_{i_2}$
then $x_p^m$ converges to $\theta d_1 + (E - n\theta d_1)/(n - a_1)$ for all
players $p \notin A_1$. (b) If
$a_1 \theta d_1 + \sum_{p \notin A_1} [d_p - (1-\theta)d_{i_2}] < E < \sum_p [d_p - (1-\theta)d_1]$
then $x_p^m$ converges to $d_p - \delta$ where
$\delta = \bigl[ \sum_{p \notin A_1} d_p - (E - a_1 \theta d_1) \bigr]/(n - a_1)$.
Proof. We outline the proof of (a). Let
$E = a_1 \theta d_1 + (n - a_1)(\theta d_{i_2} - \delta)$ for some δ > 0. By
Proposition 5.10, we can pick M such that
$\sum_{l \in A_1} |x_l^m - \theta d_1| < \delta$
$T(j, k, m) = (x_j^m + x_k^m + d_j - d_k)/2 \le x_j^m$.
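The inequality in this display is algebraically equivalent to $d_j - x_j^m \le d_k - x_k^m$: multiply both sides by 2 and subtract $x_j^m$. A minimal sketch checks this on hypothetical numbers. The names `pairwise_share` and `loss` are illustrative, and reading $T(j,k,m)$ as player $j$'s share after a pairwise re-division is an assumption, since the surrounding definition is not shown here.

```python
def pairwise_share(xj, xk, dj, dk):
    """Evaluate (x_j + x_k + d_j - d_k)/2, the expression in the display."""
    return (xj + xk + dj - dk) / 2


def loss(x, d):
    """Shortfall of a holding x relative to a claim d (illustrative helper)."""
    return d - x


# Hypothetical holdings and claims.
cases = [
    (6.0, 2.0, 8.0, 10.0),   # j's loss is 2, k's loss is 8
    (2.0, 6.0, 10.0, 8.0),   # j's loss is 8, k's loss is 2
]
for xj, xk, dj, dk in cases:
    share = pairwise_share(xj, xk, dj, dk)
    # The two comparisons agree: share <= x_j exactly when j's loss <= k's loss.
    print(share, share <= xj, loss(xj, dj) <= loss(xk, dk))
```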
Acknowledgments
We thank an anonymous, knowledgeable referee for providing helpful and
insightful comments.
References
[1] José Alcalde, Marı́a del Carmen Marco, and José A. Silva, The minimal overlap rule revisited,
Soc. Choice Welf. 31 (2008), no. 1, 109–128, DOI 10.1007/s00355-007-0269-5. MR2403232
(2009e:91144)
228 MICHAEL A. JONES AND JENNIFER M. WILSON
[2] Robert J. Aumann and Michael Maschler, Game theoretic analysis of a bankruptcy prob-
lem from the Talmud, J. Econom. Theory 36 (1985), no. 2, 195–213, DOI 10.1016/0022-
0531(85)90102-4. MR804893 (87b:90152)
[3] Youngsub Chun, James Schummer, and William Thomson, Constrained egalitarianism: a
new solution to bankruptcy problems, Seoul J of Econ. 14 (2001), no. 3, 269–297.
[4] Youngsub Chun and William Thomson, Convergence under replication of rules to ad-
judicate conflicting claims, Games Econom. Behav. 50 (2005), no. 2, 129–142, DOI
10.1016/j.geb.2004.01.006. MR2110665 (2005h:91135)
[5] Nir Dagan and Oscar Volij, Bilateral comparisons and consistent fair division rules in the
context of bankruptcy problems, Internat. J. Game Theory 26 (1997), no. 1, 11–25, DOI
10.1007/BF01262509. MR1437934
[6] Marc Fleurbaey and John E. Roemer, Judicial precedent as a dynamic rationale for axiomatic
bargaining theory, Theor. Econ. 6 (2011), no. 2, 289–310, DOI 10.3982/TE588. MR2945118
[7] Toru Hokari and William Thomson, Claims problems and weighted generalizations of the
Talmud rule, Econom. Theory 21 (2003), no. 2-3, 241–261, DOI 10.1007/s00199-002-0314-7.
Symposium in Honor of Mordecai Kurz (Stanford, CA, 2002). MR2013209 (2004i:91022)
[8] Toru Hokari and William Thomson, On properties of division rules lifted by bilateral consis-
tency, J. Math. Econom. 44 (2008), no. 11, 1057–1071, DOI 10.1016/j.jmateco.2008.01.001.
MR2456467 (2009j:91126)
[9] Jens Leth Hougaard, Juan D. Moreno-Ternero, and Lars Peter Østerdal, Rationing in the
presence of baselines, Soc. Choice Welf. 40 (2013), no. 4, 1047–1066, DOI 10.1007/s00355-
012-0664-4. MR3046850
[10] M. Justman, Iterative processes with “nucleolar” restrictions, Internat. J. Game Theory 6
(1977), no. 4, 189–212. MR0503918 (58 #20530)
[11] G. Kalai, M. Maschler, and G. Owen, Asymptotic stability and other properties of trajectories
and transfer sequences leading to the bargaining sets, Internat. J. Game Theory 4 (1975),
no. 4, 193–213. MR0416623 (54 #4693)
[12] M. Maschler and G. Owen, The consistent Shapley value for hyperplane games, Internat. J.
Game Theory 18 (1989), no. 4, 389–407, DOI 10.1007/BF01358800. MR1030404 (90m:90322)
[13] Juan D. Moreno-Ternero, A coalitional procedure leading to a family of bankruptcy rules,
Oper. Res. Lett. 39 (2011), no. 1, 1–3, DOI 10.1016/j.orl.2010.10.001. MR2748705
(2012g:91089)
[14] Juan D. Moreno-Ternero, Voting over piece-wise linear tax methods, J. Math. Econom. 47
(2011), no. 1, 29–36, DOI 10.1016/j.jmateco.2010.11.002. MR2780794 (2012c:91150)
[15] Juan D. Moreno-Ternero and Antonio Villar, The TAL-family of rules for bankruptcy
problems, Soc. Choice Welf. 27 (2006), no. 2, 231–249, DOI 10.1007/s00355-006-0121-3.
MR2257353 (2007k:91178)
[16] Barry O’Neill, A problem of rights arbitration from the Talmud, Math. Social Sci. 2 (1982),
no. 4, 345–371, DOI 10.1016/0165-4896(82)90029-4. MR662176 (83m:90006)
[17] David Schmeidler, The nucleolus of a characteristic function game, SIAM J. Appl. Math. 17
(1969), 1163–1170. MR0260432 (41 #5058)
[18] R. E. Stearns, Convergent transfer schemes for N -person games, Trans. Amer. Math. Soc.
134 (1968), 449–459. MR0230550 (37 #6112)
[19] William Thomson, Axiomatic and game-theoretic analysis of bankruptcy and taxation
problems: a survey, Math. Social Sci. 45 (2003), no. 3, 249–297, DOI 10.1016/S0165-
4896(02)00070-7. MR1979085 (2004c:91003)
[20] William Thomson, Two families of rules for the adjudication of conflicting claims, Soc.
Choice Welf. 31 (2008), no. 4, 667–692, DOI 10.1007/s00355-008-0302-3. MR2448979
(2009m:91115)
[21] William Thomson, Consistency and its converse: an introduction, Rev. Econ. Des. 15 (2011),
no. 4, 257–291, DOI 10.1007/s10058-011-0109-z. MR2860935 (2012k:91213)
[22] William Thomson, On the axiomatics of resource allocation: Interpreting the consistency
principle, Econ. and Phil. 28 (2013), no. 3, 385–421.
[23] Moshe Yarom, The lexicographic kernel of a cooperative game, Math. Oper. Res. 6 (1981),
no. 1, 88–100, DOI 10.1287/moor.6.1.88. MR618966 (83a:90200)
Department of Natural Science and Mathematics, Eugene Lang College The New
School for Liberal Arts, New York, New York 10011
E-mail address: [email protected]
624 CONM
This volume contains the proceedings of two AMS Special Sessions on The Mathematics
of Decisions, Elections, and Games, held January 4, 2012, in Boston, MA, and January
11–12, 2013, in San Diego, CA.
Decision theory, voting theory, and game theory are three intertwined areas of
mathematics that involve making optimal decisions under different contexts. Although these
areas include their own mathematical results, much of the recent research in these
areas involves developing and applying new perspectives from their intersection with other
branches of mathematics, such as algebra, representation theory, combinatorics, convex
geometry, dynamical systems, etc.
The papers in this volume highlight and exploit the mathematical structure of decisions,
elections, and games to model and to analyze problems from the social sciences.
ISBN 978-0-8218-9866-6