Lecture Notes in Algebraic Combinatorics - Jeremy L. Martin
Jeremy L. Martin
University of Kansas
[email protected]
Copyright ©2010–2020 by Jeremy L. Martin (last updated March 12, 2021). Licensed under a Creative
Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Contents
2 Poset Algebra 33
2.1 The incidence algebra of a poset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.2 The Möbius function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3 Möbius inversion and the characteristic polynomial . . . . . . . . . . . . . . . . . . . . . . . . 39
2.4 Möbius functions of lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3 Matroids 50
3.1 Closure operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.2 Matroids and geometric lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.3 Graphic matroids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.4 Matroid independence, basis and circuit systems . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.5 Representability and regularity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.6 Direct sum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.7 Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.8 Deletion and contraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5 Hyperplane Arrangements 87
5.1 Basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.2 Counting regions: examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.3 Zaslavsky’s Theorem(s) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.4 The finite field method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.5 Supersolvable lattices and arrangements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.6 A brief taste of arrangements over C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.7 Faces and the big face lattice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.8 Faces of the braid arrangement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.9 Oriented matroids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.9.1 Oriented matroid covectors from hyperplane arrangements . . . . . . . . . . . . . . . . 108
5.9.2 Oriented matroid circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.9.3 Oriented matroids from graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
9.13 What’s next . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
9.14 The Murnaghan-Nakayama Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
9.15 The Hook-Length Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
9.16 The Littlewood-Richardson Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
9.17 Knuth equivalence and jeu de taquin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
9.18 Yet another version of RSK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
9.19 Quasisymmetric functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
9.20 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
Foreword
These lecture notes began as my notes from Vic Reiner’s Algebraic Combinatorics course at the University
of Minnesota in Fall 2003. I currently use them for graduate courses at the University of Kansas. They
will always be a work in progress. Please use them and share them freely for any research purpose. I
have added and subtracted some material from Vic’s course to suit my tastes, but any mistakes are my
own; if you find one, please contact me at [email protected] so I can fix it. Thanks to those who have
suggested additions and pointed out errors, including but not limited to: Kevin Adams, Nitin Aggarwal,
Trevor Arrigoni, Dylan Beck, Jonah Berggren, Lucas Chaffee, Geoffrey Critzer, Mark Denker, Souvik Dey,
Joseph Doolittle, Ken Duna, Monalisa Dutta, Josh Fenton, Logan Godkin, Bennet Goeckner, Darij Grinberg
(especially!), Brent Holmes, Arturo Jaramillo, Alex Lazar, Kevin Marshall, George Nasr (especially!), Nick
Packauskas, Abraham Pascoe, Smita Praharaj, John Portin, Billy Sanders, Tony Se, and Amanda Wilkens.
Marge Bayer contributed the material on Ehrhart theory in §7.4.
Chapter 1
Posets and Lattices
1.1 Posets
Definition 1.1.1. A partially ordered set or poset is a set P equipped with a relation ≤ that is reflexive,
antisymmetric, and transitive. That is, for all x, y, z ∈ P :
1. x ≤ x (reflexivity).
2. If x ≤ y and y ≤ x, then x = y (antisymmetry).
3. If x ≤ y and y ≤ z, then x ≤ z (transitivity).
We say that x is covered by y, written x ⋖ y, if x < y and there exists no z such that x < z < y. Two
posets P, Q are isomorphic if there is a bijection φ : P → Q that is order-preserving; that is, x ≤ y in P iff
φ(x) ≤ φ(y) in Q. A subposet of P is a subset P′ ⊆ P equipped with the order relation given by restriction
from P .
We’ll usually assume that P is finite. Sometimes a weaker assumption suffices, such as that P is chain-finite
(every chain is finite) or locally finite (every interval is finite). (We’ll say what “chains” and “intervals”
are soon.)
Definition 1.1.2. A poset L is a lattice if every pair x, y ∈ L (i) has a unique largest common lower bound,
called their meet and written x ∧ y; (ii) has a unique smallest common upper bound, called their join and
written x ∨ y. That is, for all z ∈ L,
z ≤ x and z ≤ y ⇒ z ≤ x ∧ y,
z ≥ x and z ≥ y ⇒ z ≥ x ∨ y,
[Figure: the Hasse diagrams of Bool2 and Bool3 , and (right) Bool3 drawn with all order relations rather than just the covering relations.]
Note that 2^[n] is a lattice, with meet and join given by intersection and union respectively. J
The first two pictures are Hasse diagrams: graphs whose vertices are the elements of the poset and whose
edges represent the covering relations, which are enough to generate all the relations in the poset by
transitivity. (As you can see on the right, including all the relations would make the diagram unnecessarily
complicated.) By convention, bigger elements in P are at the top of the picture.
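For small examples it is easy to generate the covering relations of Booln — the edges of its Hasse diagram — by brute force. Here is a minimal Python sketch (the helper name boolean_covers is ours, not standard):

from itertools import combinations

def boolean_covers(n):
    """Covering pairs of Bool_n: S is covered by T iff S ⊂ T and |T| = |S| + 1."""
    elements = [frozenset(c) for k in range(n + 1)
                for c in combinations(range(1, n + 1), k)]
    return [(S, T) for S in elements for T in elements
            if S < T and len(T) == len(S) + 1]

# Each covering pair is one edge of the Hasse diagram; Bool_3 has 3 + 6 + 3 = 12.
print(len(boolean_covers(3)))  # 12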
The Boolean algebra 2^S has a unique minimum element (namely ∅) and a unique maximum element (namely
S). Not every poset has to have such elements, but if a poset does, we will call them 0̂ and 1̂ respectively
(or if necessary 0̂P and 1̂P ).
Definition 1.1.4. A poset that has both a 0̂ and a 1̂ is called bounded.1 An element that covers 0̂ is
called an atom, and an element that is covered by 1̂ is called a coatom. For example, the atoms in 2^S are
the singleton subsets of S, and the coatoms are the subsets of cardinality |S| − 1.
We can make a poset P bounded: define a new poset P̂ by adjoining new elements 0̂, 1̂ such that 0̂ < x < 1̂
for every x ∈ P . Meanwhile, sometimes we have a bounded poset and want to delete the bottom and top
elements.
Definition 1.1.5. Let x, y ∈ P with x ≤ y. The interval from x to y is
[x, y] := {z ∈ P : x ≤ z ≤ y}.
This formula makes sense if x ≰ y, when [x, y] = ∅, but typically we don’t want to think of the empty set as
a bona fide interval. Also, [x, y] is a singleton set if and only if x = y.
Definition 1.1.6. A subset C ⊆ P (or P itself) is called a chain if its elements are pairwise comparable.
Thus every chain is of the form C = {x0 , . . . , xn }, where x0 < · · · < xn . The number n is called the length
of the chain; notice that the length is one less than the cardinality of the chain. The chain C is called
saturated if x0 ⋖ · · · ⋖ xn ; equivalently, C is maximal among all chains with bottom element x0 and top
element xn . (Note that not all such chains necessarily have the same length — we will get back to that
soon.) An antichain is a subset of P (or, again, P itself) in which no two of its elements are comparable.2
For example, in the Boolean algebra Bool3 , the subset³ {∅, 3, 123} is a chain of length 2 (note that it is not
saturated), while {12, 3} and {12, 13, 23} are antichains. The subset {12, 13, 3} is neither a chain nor an
antichain: 13 is comparable to 3 but not to 12.
1 This has nothing to do with the more typical metric-space definition of “bounded”.
2 To set theorists, “antichain” means something stronger: a set of elements such that no two have a common lower bound.
This concept does not typically arise in combinatorics, where one frequently wants to talk about antichains in a bounded poset.
3 It is very common to drop the braces and commas from subsets of [n], since it is easier and cleaner to write {∅, 3, 123}
than {∅, {3}, {1, 2, 3}}.
[Figure: four copies of the Hasse diagram of Bool3 , highlighting (from left to right) a chain, an antichain, another antichain, and a subset that is neither.]
One of the many nice properties of the Boolean algebra Booln is that its elements fall into horizontal slices
(sorted by their cardinalities). Whenever S ⋖ T , it is the case that |T | = |S| + 1. A poset for which we can
do this is called a ranked poset. However, it would be tautological to define a ranked poset to be a poset
in which we can rank the elements! The actual definition of rankedness is a little more subtle, but makes
perfect sense after a little thought, particularly after looking at an example of how a poset might fail to be
ranked:
[Figure: a five-element poset with saturated chains 0̂ ⋖ x ⋖ z ⋖ 1̂ and 0̂ ⋖ y ⋖ 1̂.]
You can see what goes wrong — the chains 0̂ ⋖ x ⋖ z ⋖ 1̂ and 0̂ ⋖ y ⋖ 1̂ have the same bottom and top and
are both saturated, but have different lengths. So the “rank” of 1̂ is not well-defined; it could be either 2
or 3 more than the “rank” of 0̂. Saturated chains are thus a key element in defining what “ranked” means.
Definition 1.1.7. A poset P is ranked if for every x, y ∈ P , all saturated chains with bottom element x
and top element y have the same length. A poset is graded if it is ranked and bounded.
In practice, most ranked posets we will consider are graded, or at least have a bottom element. To define a
rank function r : P → Z, one can choose the rank of any single element arbitrarily, then assign the rest of
the ranks by ensuring that
x ⋖ y =⇒ r(y) = r(x) + 1; (1.1)
it is an exercise to prove that this definition results in no contradiction. It is standard to define r(0̂) = 0
so that all ranks are nonnegative; then r(x) is the length of any saturated chain from 0̂ to x. (Recall from
Definition 1.1.6 that “length” means the number of steps, not the number of elements — i.e., edges rather
than vertices in the Hasse diagram.)
Definition 1.1.8. Let P be a ranked poset with rank function r. The rank-generating function of P is
the formal power series
F_P(q) = \sum_{x \in P} q^{r(x)}.
Thus, for each k, the coefficient of q^k is the number of elements at rank k.
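As a quick sanity check of Definition 1.1.8, here is a small sketch computing the coefficients of F_P(q) for P = Booln , where the rank of a subset is its cardinality (the function name is illustrative):

from collections import Counter
from itertools import combinations

def boolean_rank_gf(n):
    """Coefficient list of F_{Bool_n}(q): the number of elements at each rank,
    where the rank of a subset is its cardinality."""
    ranks = Counter(k for k in range(n + 1) for _ in combinations(range(n), k))
    return [ranks[k] for k in range(n + 1)]

print(boolean_rank_gf(4))  # [1, 4, 6, 4, 1], i.e., F(q) = 1 + 4q + 6q^2 + 4q^3 + q^4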
[Figure: an order ideal with its generators, an order filter with its generators, and an interval with its endpoints.]
For example, F_{Bool_n}(q) = \sum_{k=0}^{n} \binom{n}{k} q^k = (1+q)^n . The expansion of this polynomial is palindromic, because the coefficients are a row of Pascal’s Triangle.
That is, Booln is rank-symmetric. Rank-symmetry also follows from the self-duality of Booln .
More generally, if P and Q are ranked, then P × Q is ranked, with r_{P×Q}(x, y) = r_P(x) + r_Q(y), and
F_{P×Q} = F_P F_Q .
Definition 1.1.9. A linear extension of a poset P is a total order ≺ on the set P that refines <P : that
is, if x <P y then x ≺ y. The set of all linear extensions is denoted L (P ) (and sometimes called the
Jordan-Hölder set of P ).
Definition 1.1.10. An order ideal of P is a subset A ⊆ P that is closed under going down: if x ∈ A and
y ≤ x, then y ∈ A. Dually, an order filter is a subset closed under going up.
Colloquially, an order ideal is a subset of P “closed under going down”. Note that a subset of P is an order
ideal if and only if its complement is an order filter. The order ideal generated by Q ⊆ P is the smallest
order ideal containing it, namely ⟨Q⟩ = {x ∈ P : x ≤ q for some q ∈ Q}. Conversely, every order ideal has
a unique minimal set of generators, namely its maximal elements (which form an antichain).
Example 1.1.11. Let {F1 , . . . , Fk } be a nonempty family of subsets of [n]. The order ideal they generate is
∆ = ⟨F1 , . . . , Fk⟩ = {A ⊆ [n] : A ⊆ Fi for some i}.
These order ideals are called abstract simplicial complexes, and are the standard combinatorial models
for topological spaces (at least well-behaved ones). If each Fi is regarded as a simplex (i.e., the convex hull
of a set of affinely independent points) then the order-ideal condition says that if ∆ contains a simplex, then
it contains all sub-simplices. For example, ∆ cannot contain a triangle without also containing its edges
and vertices. Simplicial complexes are the fundamental objects of topological combinatorics, and we’ll have
much more to say about them in Chapter 6. J
There are several ways to make new posets out of old ones. Here are some of the most basic.
Definition 1.1.12. Let P, Q be posets.
• The dual P* of P is obtained by reversing all the order relations: x ≤_{P*} y iff x ≥_P y. The Hasse
diagram of P* is the same as that of P , turned upside down. A poset is self-dual if P ≅ P*; the
map realizing the self-duality is called an anti-automorphism. For example, chains and antichains
are self-dual, as is Booln (via the anti-automorphism S ↦ [n] \ S).
• The disjoint union P + Q is the poset on P ∪· Q that inherits the relations from P and Q but no
others, so that elements of P are incomparable with elements of Q. The Hasse diagram of P + Q can
be obtained by drawing the Hasse diagrams of P and Q side by side.
• The Cartesian product P × Q has a poset structure as follows: (p, q) ≤ (p′, q′) if p ≤_P p′ and
q ≤_Q q′. This is a very natural and useful operation. For example, it is not hard to check that
Bool_k × Bool_ℓ ≅ Bool_{k+ℓ} .
• Assume that P has a 1̂ and Q has a 0̂. Then the ordinal sum P ⊕ Q is defined by identifying 1̂P = 0̂Q
and setting p ≤ q for all p ∈ P and q ∈ Q. Note that this operation is not in general commutative
(although it is associative).
[Figure: Hasse diagrams of P , Q, P × Q, and P ⊕ Q for small posets P and Q.]
1.2 Lattices
Definition 1.2.1. A poset L is a lattice if every pair x, y ∈ L has a unique meet x ∧ y and join x ∨ y.
That is, for all z ∈ L,
z ≤ x and z ≤ y ⟹ z ≤ x ∧ y,
z ≥ x and z ≥ y ⟹ z ≥ x ∨ y.
Note that, e.g., x∧y = x if and only if x ≤ y. Meet and join are easily seen to be commutative and associative,
so for any finite M ⊆ L, the meet ∧M and join ∨M are well-defined elements of L. In particular, every
finite lattice is bounded, with 0̂ = ∧L and 1̂ = ∨L. (In an infinite lattice, the join or meet of an infinite set
of elements may not be well-defined.) For convenience, we set ∧∅ = 1̂ and ∨∅ = 0̂.
As mentioned earlier, the Boolean algebra Booln is a lattice, with meet and join given by intersection and
union respectively (note that the symbols ∧ and ∨ resemble ∩ and ∪ respectively).
Example 1.2.2 (The partition lattice). An [unordered] set partition of S is a set of pairwise-disjoint,
non-empty sets (“blocks”) whose union is S. It is the same data as an equivalence relation on S, whose
equivalence classes are the blocks. It is important to keep in mind that neither the blocks, nor the elements
of each block, are ordered.
Let Πn be the poset of all set partitions of [n]. For example, two elements of Π5 are
π = {1, 3, 4}, {2, 5} (abbr.: 134|25)
σ = {1, 3}, {4}, {2, 5} (abbr.: 13|4|25)
We can impose a partial order on Πn as follows: σ ≤ π if every block of σ is contained in a block of π; for
short, σ refines π (as here). To put it another way, σ can be formed by further splitting up π, or equivalently
every block of σ is a subset of some block of π.
[Figure: the Hasse diagrams of Π3 and Π4 .]
Observe that Πn is bounded, with 0̂ = 1|2| · · · |n and 1̂ = 12 · · · n. For each set partition σ, the partitions
that cover σ in Πn are those obtained from σ by merging two of its blocks into a single block. Therefore,
Πn is ranked (hence graded), with rank function r(π) = n − |π|. The coefficients of the rank-generating
function of Πn are by definition the Stirling numbers of the second kind. Recall that S(n, k) is the number
of partitions of [n] into k blocks, so
F_{\Pi_n}(q) = \sum_{k=1}^{n} S(n, k)\, q^{n-k}.
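The Stirling numbers can be computed from the standard recurrence S(n, k) = k S(n−1, k) + S(n−1, k−1); here is a small sketch assembling the coefficients of F_{Π4}(q):

def stirling2(n, k):
    """Stirling number of the second kind, via S(n,k) = k*S(n-1,k) + S(n-1,k-1)."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

# Coefficients of F_{Pi_4}(q) = sum_k S(4,k) q^{4-k}, listed by rank 0, 1, 2, 3:
print([stirling2(4, 4 - r) for r in range(4)])  # [1, 6, 7, 1]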
Furthermore, Πn is a lattice: any two set partitions π, σ have a unique coarsest common refinement
π ∧ σ = {A ∩ B : A ∈ π, B ∈ σ, A ∩ B 6= ∅}.
Meanwhile, π ∨ σ is defined as the transitive closure of the union of the equivalence relations corresponding
to π and σ.
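Both lattice operations are easy to carry out by machine; here is an illustrative sketch (blocks are modeled as frozensets, and the helper names are ours):

def meet(pi, sigma):
    """Meet in Pi_n: pairwise intersections of blocks (discarding empties)."""
    return {frozenset(A & B) for A in pi for B in sigma if A & B}

def join(pi, sigma):
    """Join in Pi_n: merge blocks sharing an element until stable (the transitive
    closure of the union of the two equivalence relations)."""
    blocks = [set(B) for B in pi | sigma]
    merged = True
    while merged:
        merged = False
        for i in range(len(blocks)):
            for j in range(i + 1, len(blocks)):
                if blocks[i] & blocks[j]:
                    blocks[i] |= blocks.pop(j)
                    merged = True
                    break
            if merged:
                break
    return {frozenset(B) for B in blocks}

pi = {frozenset({1, 3, 4}), frozenset({2, 5})}                  # 134|25
sigma = {frozenset({1, 3}), frozenset({4}), frozenset({2, 5})}  # 13|4|25
print(meet(pi, sigma) == sigma, join(pi, sigma) == pi)  # True True (since sigma <= pi)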
Finally, for any finite set, we can define ΠX to be the poset of set partitions of X, ordered by reverse
refinement; evidently ΠX ≅ Π|X| . J
Example 1.2.3 (The connectivity lattice of a graph). Let G = (V, E) be a graph. Recall that for
X ⊆ V , the induced subgraph G|X is the graph on vertex set X, with two vertices adjacent in G|X if and only
if they are adjacent in G. The connectivity lattice of G is the subposet K(G) of ΠV consisting of the set
partitions in which every block induces a connected subgraph of G.
[Figure 1.3: a graph G on vertex set [4] (left) and the Hasse diagram of its connectivity lattice K(G) (right), with 0̂ = 1|2|3|4, coatoms 123|4, 124|3, 1|234, 13|24, and 1̂ = 1234.]
For an example, see Figure 1.3. It is not hard to see that K(G) = ΠV if and only if G is the complete graph
KV , and K(G) is Boolean if and only if G is acyclic. Also, if H is a subgraph of G then K(H) is a subposet
of K(G). The proof that K(G) is in fact a lattice (justifying the terminology) is left as an exercise.
J
Example 1.2.4 (Partitions, tableaux, and Young’s lattice). An (integer) partition is a sequence
λ = (λ1 , . . . , λℓ) of weakly decreasing positive integers: i.e., λ1 ≥ · · · ≥ λℓ > 0. If n = λ1 + · · · + λℓ , we write
λ ⊢ n and/or n = |λ|. For convenience, we often set λi = 0 for all i > ℓ.
Partitions are fundamental objects that will come up in many contexts. Let Y be the set of all partitions,
partially ordered by λ ≥ µ if λi ≥ µi for all i = 1, 2, . . . . Then Y is a ranked lattice, with rank function
r(λ) = |λ|. Join and meet are given by component-wise max and min — we’ll shortly see another description
of the lattice operations. J
This is an infinite poset, but the number of partitions at any given rank is finite. So in particular Y is locally
finite (if X is any adjective, then “poset P is locally X” means “every interval in P is X”). Moreover, the
rank-generating function
\sum_{\lambda} q^{|\lambda|} = \sum_{n \ge 0} \sum_{\lambda \vdash n} q^n
is a well-defined formal power series, and it is given by the justly celebrated formula
\prod_{k=1}^{\infty} \frac{1}{1 - q^k}.
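The coefficients of this product can be computed by multiplying in one factor 1/(1 − q^k) at a time; a minimal sketch:

def partition_counts(N):
    """Number of partitions of each n <= N, via the truncated Euler product
    1/((1-q)(1-q^2)...), computed by repeated geometric-series multiplication."""
    coeffs = [1] + [0] * N              # start with the series 1
    for k in range(1, N + 1):           # multiply by 1/(1 - q^k)
        for n in range(k, N + 1):
            coeffs[n] += coeffs[n - k]
    return coeffs

print(partition_counts(8))  # [1, 1, 2, 3, 5, 7, 11, 15, 22]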
There is a nice pictorial way to look at Young’s lattice. Instead of thinking about partitions as sequences of
numbers, view them as their corresponding Ferrers diagrams (or Young diagrams): northwest-justified
piles of boxes whose ith row contains λi boxes. The northwest-justification convention is called “English
notation”, and I will use that throughout, but a significant minority of combinatorialists prefer “French
notation”, in which the vertical axis is reversed. For example, the partition (5, 5, 4, 2) is represented by the
Ferrers diagram shown below, in English and in French notation respectively.
[Figure: the Ferrers diagram of (5, 5, 4, 2) in English notation (rows northwest-justified) and in French notation (reflected vertically).]
Now the order relation in Young’s lattice is as follows: λ ≥ µ if and only if the Ferrers diagram of λ contains
that of µ. The bottom part of the Hasse diagram of Y looks like this:
[Figure: the bottom ranks of the Hasse diagram of Young’s lattice.]
In terms of Ferrers diagrams, join and meet are simply union and intersection respectively.
Young’s lattice Y has a nontrivial automorphism λ ↦ λ̃ called conjugation. This is most easily described
in terms of Ferrers diagrams: reflect across the line x + y = 0 so as to swap rows and columns. It is easy to
check that if λ ≥ µ, then λ̃ ≥ µ̃.
A maximal chain from ∅ to λ in Young’s lattice can be represented by a standard tableau: a filling of
λ with the numbers 1, 2, . . . , |λ|, using each number once, with every row increasing to the right and every
column increasing downward. The kth element in the chain is the Ferrers diagram containing the numbers
1, . . . , k. For example:
∅ ⋖ (1) ⋖ (2) ⋖ (2, 1) ⋖ (3, 1) ⋖ (3, 2) ⟷ the standard tableau with rows 1 2 4 and 3 5.
Example 1.2.5 (Subspace lattices). Let q be a prime power, let Fq be the field of order q, and let
V = Fnq (a vector space of dimension n over Fq ). The subspace lattice LV (q) = Ln (q) is the set of all
vector subspaces of V , ordered by inclusion. (We could replace Fq with an infinite field. The resulting poset
is infinite, although chain-finite.)
The meet and join operations on Ln (q) are given by W ∧ W′ = W ∩ W′ and W ∨ W′ = W + W′. We could
construct analogous posets by ordering the (normal) subgroups of a group, or the prime ideals of a ring, or
the submodules of a module, by inclusion. (However, these posets are not necessarily ranked, while Ln (q) is
ranked, by dimension.)
The simplest example is when q = 2 and n = 2, so that V = {(0, 0), (0, 1), (1, 0), (1, 1)}. Of course V has one
subspace of dimension 2 (itself) and one of dimension 0 (the zero space). Meanwhile, it has three subspaces
of dimension 1; each consists of the zero vector and one nonzero vector. Therefore, L2 (2) ≅ M5 .
[Figure: the Hasse diagram of M5 : a bottom element, three pairwise incomparable elements, and a top element.]
Note that Ln (q) is self-dual, under the anti-automorphism W ↦ W⊥ (the orthogonal complement with
respect to any non-degenerate bilinear form). J
Example 1.2.6 (The lattice of ordered set partitions). An ordered set partition (OSP) of S is an ordered
list of pairwise-disjoint, non-empty sets (“blocks”) whose union is S. Note the difference from unordered
set partitions (Example 1.2.2). We use the same notation for OSPs as for their unordered cousins, but
now, for example, 14|235 and 235|14 represent different OSPs. The set On of OSPs of [n] is a poset under
refinement: σ refines π if π can be obtained from σ by removing zero or more separator bars. For example,
16|247|389|5 ≤ 16|2|4|7|38|9|5, but 1|23|45 and 12|345 are incomparable. The Hasse diagram for O3 is as
follows.
[Figure: the Hasse diagram of O3 , with bottom element 123; middle rank 12|3, 13|2, 23|1, 1|23, 2|13, 3|12; and top rank the six permutations 1|2|3, 2|1|3, 1|3|2, 2|3|1, 3|1|2, 3|2|1.]
This poset is ranked, with rank function r(π) = |π| − 1 (i.e., the number of bars, or one less than the number
of blocks, just like Πn ). Technically On is not a lattice but only a meet-semilattice, since join is not always
well-defined. However, we can make it into a true lattice by appending an artificial 1̂ at rank n.
Note that every interval [π, σ] is a Boolean algebra, whose atoms correspond to the bars that appear in σ
but not in π.
There is a nice geometric way to picture On . Every point x = (x1 , . . . , xn ) ∈ Rn gives rise to an OSP
φ(x) that describes which coordinates are less than, equal to, or greater than others. For example, if
x = (6, 6, 0, 4, 7) ∈ R5 , then φ(x) = 3|4|12|5, since x3 < x4 < x1 = x2 < x5 . Let Cπ = φ−1 (π) ⊂ Rn ; that
is, Cπ is the set of points whose relative order of coordinates is given by π. The Cπ decompose Rn , so they
give a good picture of On . For example, the picture for n = 3 looks like this. (I am just looking at the plane
x1 + x2 + x3 = 0, which gives the full structure.)
[Figure: the decomposition of the plane x1 + x2 + x3 = 0 into the cells Cπ for π ∈ O3 : six 2-dimensional sectors labeled by the permutations 1|2|3, . . . , 3|2|1, separated by six rays labeled 12|3, 13|2, 23|1, 1|23, 2|13, 3|12, all meeting at the point labeled 123.]
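Computing φ(x) is straightforward: group the coordinate indices by value and sort the groups by value. A small sketch reproducing the example above (osp_of_point is an illustrative name):

def osp_of_point(x):
    """The ordered set partition phi(x): group coordinate indices by value,
    listed from smallest value to largest."""
    groups = {}
    for i, xi in enumerate(x, start=1):
        groups.setdefault(xi, set()).add(i)
    return [sorted(groups[v]) for v in sorted(groups)]

print(osp_of_point((6, 6, 0, 4, 7)))  # [[3], [4], [1, 2], [5]], i.e., 3|4|12|5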
The topology matches the combinatorics: for example, each Cπ is a |π|-dimensional space, and π ≤ σ in On
if and only if Cπ ⊆ C̄σ (where the bar means closure). We’ll come back to this in more detail when we study
hyperplane arrangements in Chapter 5; see especially §5.8. J
Example 1.2.7. Lattices don’t have to be ranked. For example, the poset N5 shown below is a perfectly
good lattice.
[Figure: the Hasse diagram of N5 : a five-element lattice with saturated chains of lengths 2 and 3 between 0̂ and 1̂.]
J
Proposition 1.2.8 (Absorption laws). Let L be a lattice and x, y ∈ L. Then x ∨ (x ∧ y) = x and
x ∧ (x ∨ y) = x. (Proof left to the reader.)
The following result is a very common way of proving that a poset is a lattice.
Proposition 1.2.9. Let P be a bounded poset that is a meet-semilattice (i.e., every nonempty B ⊆ P has a
well-defined meet ∧B). Then every nonempty subset of P has a well-defined join, and consequently P is a
lattice. Similarly, every bounded join-semilattice is a lattice.
Proof. Let P be a bounded meet-semilattice. Let A ⊆ P , and let B = {b ∈ P : b ≥ a for all a ∈ A}. Note
that B 6= ∅ because 1̂ ∈ B. Then ∧B is the unique least upper bound for A, for the following reasons. First,
∧B ≥ a for all a ∈ A by definition of B and of meet. Second, if x ≥ a for all a ∈ A, then x ∈ B and so
x ≥ ∧B. So every bounded meet-semilattice is a lattice, and the dual argument shows that every bounded
join-semilattice is a lattice.
This statement can be weakened slightly: any poset that has a unique top element and a well-defined meet
operation is a lattice (the bottom element comes free as the meet of the entire set), as is any poset with a
unique bottom element and a well-defined join.
Definition 1.2.10. Let L be a lattice. A sublattice of L is a subposet L′ ⊆ L that (a) is a lattice and (b)
inherits its meet and join operations from L. That is,
x ∧_{L′} y = x ∧_L y and x ∨_{L′} y = x ∨_L y for all x, y ∈ L′.
Note that the maximum and minimum elements of a sublattice of L need not be the same as those of L. As
an important example, every interval L′ = [x, z] ⊆ L (i.e., L′ = {y ∈ L : x ≤ y ≤ z}) is a sublattice with
minimum element x and maximum element z. (We might write 0̂_{L′} = x and 1̂_{L′} = z.)
Example 1.2.11. Young’s lattice Y is an infinite lattice. Meets of arbitrary sets are well-defined, as are
finite joins. There is a 0̂ element (the empty Ferrers diagram), but no 1̂. On the other hand, Y is locally
finite — every interval [λ, µ] ⊆ Y is finite. Similarly, the set of natural numbers, partially ordered by
divisibility, is an infinite, locally finite lattice with a 0̂. J
Example 1.2.12. Consider the set M = {A ⊆ [4] : A has even size}. This is a lattice, but it is not a
sublattice of Bool4 , because for example 12 ∧M 13 = ∅ while 12 ∧Bool4 13 = 1. J
Example 1.2.13. [Weak Bruhat order] Let Sn be the set of permutations of [n] (i.e., the symmetric
group).4 Write elements w ∈ Sn as strings w1 w2 · · · wn of distinct digits, e.g., 47182635 ∈ S8 . (This is
called one-line notation.) The weak Bruhat order ≤W on Sn is defined as follows: w lW v if v can be
obtained from w by swapping wi with wi+1 , where wi < wi+1 . For example, 213 ⋖W 231.
In other words, v = wsi , where si is the transposition that swaps i with i + 1. The weak order actually is a
lattice, though this is not so easy to prove.
The Bruhat order ≤B on permutations is a related partial order with more relations (i.e., “stronger”) than
the weak order. The simplest way of describing Bruhat order is that w ⋖B v if inv(v) = inv(w) + 1 and v = wt
for some transposition t. For example,
47162835 ⋖B 47182635
in Bruhat order (because this transposition has introduced exactly one more inversion), but not in weak
order (since the positions transposed, namely 4 and 6, are not adjacent). On the other hand, 47162835 is
not covered by 47862135 because this transposition increases the inversion number by 5, not by 1.
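The covering relations of both orders are easy to generate by brute force from these descriptions; a small sketch (permutations in one-line notation as tuples; the helper names are ours):

from itertools import combinations

def inv(w):
    """Number of inversions of w, given in one-line notation as a tuple."""
    return sum(1 for i, j in combinations(range(len(w)), 2) if w[i] > w[j])

def weak_covers(w):
    """Covers of w in weak order: swap an adjacent ascent w_i < w_{i+1}."""
    return [w[:i] + (w[i + 1], w[i]) + w[i + 2:]
            for i in range(len(w) - 1) if w[i] < w[i + 1]]

def bruhat_covers(w):
    """Covers of w in Bruhat order: v = wt for a transposition t, inv(v) = inv(w) + 1."""
    out = []
    for i, j in combinations(range(len(w)), 2):
        v = list(w)
        v[i], v[j] = v[j], v[i]
        v = tuple(v)
        if inv(v) == inv(w) + 1:
            out.append(v)
    return out

w = (2, 1, 3)
print(weak_covers(w))    # [(2, 3, 1)]
print(bruhat_covers(w))  # [(3, 1, 2), (2, 3, 1)]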
The Bruhat and weak orders on S3 are shown below. You should be able to see from the picture that Bruhat
order is not a lattice.
[Figure: the Hasse diagrams of the Bruhat order (left) and the weak order (right) on S3 , from 123 at the bottom to 321 at the top.]
A Coxeter group is a group generated by elements s1 , . . . , sn , called simple reflections, satisfying s_i^2 = 1
and (s_i s_j)^{m_ij} = 1 for all i ≠ j and some integers m_ij ≥ 2. For example, setting m_ij = 3 if |i − j| = 1 and
m_ij = 2 if |i − j| > 1, we obtain the symmetric group Sn+1 . Coxeter groups are fantastically important in
geometric combinatorics and we could spend at least a semester on them. The standard resources are the
books by Brenti and Björner [BB05], which has a more combinatorial approach, and Humphreys [Hum90],
which has a more geometric flavor. For now, it’s enough to mention that every Coxeter group has associated
Bruhat and weak orders, whose definitions generalize those for the symmetric group.
4 That’s a Fraktur S, obtainable in LaTeX as \mathfrak{S}. The letter S has many other standard uses in combinatorics:
Stirling numbers, symmetric functions, etc. The symmetric group is important enough to merit an ornate symbol!
The Bruhat and weak orders give graded, self-dual poset structures on Sn , both ranked by number of
inversions:
r(w) = \#\{\{i, j\} : i < j \text{ and } w_i > w_j\}.
(For a general Coxeter group, the rank of an element w is the minimum number r such that w is the product
of r simple reflections.) The rank-generating function of Sn is a very nice polynomial called the q-factorial:
F_{S_n}(q) = 1 \cdot (1+q)(1+q+q^2) \cdots (1+q+\cdots+q^{n-1}) = \prod_{i=1}^{n} \frac{1-q^i}{1-q}.
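One can verify this identity for small n by brute force, tallying q^{inv(w)} over all permutations; a minimal sketch:

from itertools import permutations

def inv(w):
    """Number of inversions of w in one-line notation."""
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w)) if w[i] > w[j])

def rank_gen_coeffs(n):
    """Coefficients of F_{S_n}(q) = sum over w in S_n of q^{inv(w)}."""
    coeffs = [0] * (n * (n - 1) // 2 + 1)
    for w in permutations(range(1, n + 1)):
        coeffs[inv(w)] += 1
    return coeffs

# Compare with the q-factorial (1)(1+q)(1+q+q^2)(1+q+q^2+q^3):
print(rank_gen_coeffs(4))  # [1, 3, 5, 6, 5, 3, 1]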
1.3 Distributive lattices
Definition 1.3.1. A lattice L is distributive if the following two equivalent conditions hold:
x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z) ∀x, y, z ∈ L, (1.2a)
x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z) ∀x, y, z ∈ L. (1.2b)
Proving that the two conditions (1.2a) and (1.2b) are equivalent is not too hard, but is not trivial (Exer-
cise 1.9). Note that replacing the equalities with ≥ and ≤ respectively gives statements that are true for all
lattices.
The condition of distributivity seems natural, but in fact distributive lattices are quite special.
1. The Boolean algebra 2^[n] is a distributive lattice, because the set-theoretic operations of union and
intersection are distributive over each other.
2. Every sublattice of a distributive lattice is distributive. In particular, Young’s lattice Y is distributive
because it is a sublattice of a Boolean lattice (recall that meet and join in Y are given by intersection
and union on Ferrers diagrams).
3. The lattices M5 and N5 are not distributive:
[Figure: N5 with chain x < z and incomparable element y; M5 with atoms a, b, c.]
(x ∨ y) ∧ z = 1̂ ∧ z = z                    (a ∨ b) ∧ c = c
(x ∧ z) ∨ (y ∧ z) = x ∨ 0̂ = x            (a ∧ c) ∨ (b ∧ c) = 0̂.
4. The partition lattice Πn is not distributive for n ≥ 3, because Π3 ≅ M5 , and for n ≥ 4 every Πn
contains a sublattice isomorphic to Π3 (see Exercise 1.1). Likewise, if n ≥ 2 then the subspace lattice
Ln (q) contains a copy of M5 (take any plane together with three distinct lines in it), hence is not
distributive.
5. The set Dn of all positive integer divisors of a fixed integer n, ordered by divisibility, is a distributive
lattice (Exercise 1.4).
Every poset P gives rise to a distributive lattice in the following way. The set J(P ) of order ideals of P (see
Definition 1.1.10) is itself a bounded poset, ordered by containment. In fact J(P ) is a distributive lattice:
the union or intersection of order ideals is an order ideal (this is easy to check) which means that J(P ) is a
sublattice of the distributive lattice BoolP . (See Figure 1.4 for an example.)
[Figure 1.4: a poset P on {a, b, c, d} (with a < b, c < b, c < d) and its lattice of order ideals J(P ) = {∅, a, c, ac, cd, abc, acd, abcd}.]
For example, if P is an antichain, then every subset is an order ideal, so J(P ) = BoolP , while if P is a chain
with n elements, then J(P ) is a chain with n + 1 elements. As an infinite example, if P = N² with the
product ordering (i.e., (x, y) ≤ (x′, y′) if x ≤ x′ and y ≤ y′), then J(P ) is Young’s lattice Y .
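For small posets, J(P ) can be generated by brute force directly from the definition of an order ideal; an illustrative sketch (the poset is given by a strict comparison function):

from itertools import chain, combinations

def order_ideals(elements, less_than):
    """All order ideals of a finite poset: a subset I is an ideal iff
    y in I and x < y together imply x in I."""
    subsets = chain.from_iterable(combinations(elements, k)
                                  for k in range(len(elements) + 1))
    return [set(I) for I in subsets
            if all(x in I for y in I for x in elements if less_than(x, y))]

# P = antichain {a, b}: J(P) is Bool_2, with 4 ideals.
print(len(order_ideals(['a', 'b'], lambda x, y: False)))  # 4
# P = chain a < b < c: J(P) is a chain with 4 elements.
print(len(order_ideals(['a', 'b', 'c'], lambda x, y: x < y)))  # 4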
Remark 1.3.2. There is a natural bijection between J(P ) and the set of antichains of P , since the maximal
elements of any order ideal form an antichain that generates it. (Recall that an antichain is a set of elements
that are pairwise incomparable.) Moreover, for each order ideal I, the order ideals covered by I in J(P ) are
precisely those of the form I′ = I \ {x}, where x is a maximal element of I. In particular |I′| = |I| − 1 for
all such I′, and it follows by induction that J(P ) is ranked by cardinality.
We will shortly prove Birkhoff’s theorem (Theorem 1.3.7), a.k.a. the Fundamental Theorem of Finite Dis-
tributive Lattices: the finite distributive lattices are exactly the lattices of the form J(P ), where P is a finite
poset.
Definition 1.3.3. Let L be a lattice. An element x ∈ L is join-irreducible if it cannot be written as the
join of two other elements. That is, if x = y ∨ z then either x = y or x = z. The subposet (not sublattice!)
of L consisting of all join-irreducible elements is denoted Irr(L). Here is an example.
[Figure: a lattice L and the subposet Irr(L) of its join-irreducible elements (here a, b, c, d).]
If L is finite, then an element of L is join-irreducible if it covers exactly one other element. (This is not true
in a lattice such as R under the natural order, in which there are no covering relations!) The condition of
finiteness can be relaxed; see Exercise 1.11.
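Using the finite-lattice criterion just stated, one can extract Irr(L) from a list of covering relations; a minimal sketch on Bool_2 (the helper name is ours):

def join_irreducibles(elements, covers):
    """In a finite lattice, the join-irreducible elements are exactly those
    covering a single element. `covers` is the list of pairs (x, y) with x covered by y."""
    return [y for y in elements
            if sum(1 for (x, z) in covers if z == y) == 1]

# Bool_2 on subsets of {1,2}: only the atoms '1' and '2' are join-irreducible.
elements = ['', '1', '2', '12']
covers = [('', '1'), ('', '2'), ('1', '12'), ('2', '12')]
print(join_irreducibles(elements, covers))  # ['1', '2']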
Definition 1.3.4. A factorization of x ∈ L is an equation of the form
x = p1 ∨ · · · ∨ pn
where p1 , . . . , pn ∈ Irr(L). The factorization is irredundant if the pi form an antichain.
In analogy with ring theory, call a lattice Artinian if it has no infinite descending chains. (For example, L
is Artinian if it is finite, or chain-finite, or locally finite and has a 0̂.) If L is Artinian, then every element
x ∈ L has a factorization — if x itself is not join-irreducible, express it as a join of two smaller elements,
then repeat. Moreover, every factorization can be reduced to an irredundant factorization by deleting each
factor strictly less than another (which does not change the join of the factors). Throughout the rest of
the section, we will assume that L is Artinian.
For general lattices, irredundant factorizations need not be unique. For example, the 1̂ element of M5 can
be factored irredundantly as the join of any two atoms. On the other hand, distributive lattices do exhibit
unique factorization, as we will soon prove (Proposition 1.3.6).
Proposition 1.3.5. Let L be a distributive lattice and let p ∈ Irr(L). Suppose that p ≤ q1 ∨ · · · ∨ qn . Then
p ≤ qi for some i.
Proof. By distributivity,
p = p ∧ (q1 ∨ · · · ∨ qn ) = (p ∧ q1 ) ∨ · · · ∨ (p ∧ qn )
and since p is join-irreducible, it must equal p ∧ qi for some i, whence p ≤ qi .
Proposition 1.3.5 is a lattice-theoretic analogue of the statement that if a prime p divides a product of
positive numbers, then it divides at least one of them. (This is in fact exactly what the result says when
applied to the divisor lattice Dn .)
Proposition 1.3.6 (Unique factorization for distributive lattices). Let L be a distributive lattice. Then
every x ∈ L can be written uniquely as an irredundant join of join-irreducible elements.
Corollary 1.3.8. Every finite distributive lattice L is graded.
Proof. The FTFDL says that L ≅ J(P ) for some finite poset P . Then L is ranked by Remark 1.3.2, and it
is bounded with 0̂ = ∅ and 1̂ = P .
Corollary 1.3.9. Let L be a finite distributive lattice. The following are equivalent:
1. L is a Boolean algebra.
2. Irr(L) is an antichain.
3. L is atomic (i.e., every element in L is the join of atoms). Equivalently, every join-irreducible element
is an atom.
4. L is complemented. That is, for each x ∈ L, there exists a unique element x̄ ∈ L such that x ∨ x̄ = 1̂
and x ∧ x̄ = 0̂.
5. L is relatively complemented. That is, for every interval [y, z] ⊆ L and every x ∈ [y, z], there exists
a unique element u ∈ [y, z] such that x ∨ u = z and x ∧ u = y.
(4) =⇒ (3): Suppose that L is complemented, and suppose that y ∈ Irr(L) is not an atom. Let x be an atom
in [0̂, y]. Then
(x ∨ x̄) ∧ y = 1̂ ∧ y = y,
(x ∨ x̄) ∧ y = (x ∧ y) ∨ (x̄ ∧ y) = x ∨ (x̄ ∧ y),
so y = x ∨ (x̄ ∧ y). Since y is join-irreducible, either y = x (impossible, as y is not an atom) or y = x̄ ∧ y;
but the latter gives x ≤ y ≤ x̄, contradicting x ∧ x̄ = 0̂.
(3) =⇒ (2): This follows from the observation that no two atoms are comparable.
Join and meet could have been interchanged throughout this section. For example, the dual of Proposi-
tion 1.3.6 says that every element in a distributive lattice L has a unique “cofactorization” as an irredundant
meet of meet-irreducible elements, and L is Boolean iff every element is the meet of coatoms. (In this case
we would require L to be Noetherian instead of Artinian — i.e., to contain no infinite increasing chains. For
example, Young’s lattice is Artinian but not Noetherian.)
1.4 Modular lattices
Definition 1.4.1. A lattice L is modular if every x, y, z ∈ L with x ≤ z satisfy the modular equation:
x ∨ (y ∧ z) = (x ∨ y) ∧ z. (1.4)
Note that for all lattices, if x ≤ z, then x ∨ (y ∧ z) ≤ (x ∨ y) ∧ z. Modularity says that, in fact, equality holds.
[Figure: two schematic Hasse diagrams, labeled “Modular” and “Non-modular”, showing x, z, x ∨ y, and y ∧ z.]
The term “modularity” arises in algebra: a canonical example of a modular lattice is the poset of modules
over any ring, ordered by inclusion (Corollary 1.4.3).
2. Every distributive lattice is modular: if x ≤ z, then x ∨ z = z, so distributivity gives
x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z) = (x ∨ y) ∧ z.
3. The lattice L is modular if and only if its dual L∗ is modular. Unlike the corresponding statement for
distributivity, this is immediate, because the modular equation is invariant under dualization.
4. The nonranked lattice N5 is not modular.
[Figure: N5 , with chain x < z and incomparable element y.]
Here x ≤ z, but
x ∨ (y ∧ z) = x ∨ 0̂ = x,
(x ∨ y) ∧ z = 1̂ ∧ z = z.
In fact, N5 is the unique obstruction to modularity, as we will soon see (Thm. 1.4.5).
5. The nondistributive lattice M5 ≅ Π3 is modular. However, Π4 is not modular (exercise).
Theorem 1.4.2. [Characterizations of modularity] Let L be a lattice. Then the following are equivalent:
(a) L is modular.
(b) For all x, y, z ∈ L, if x ∈ [y ∧ z, z], then x = (x ∨ y) ∧ z.
(c) For all x, y, z ∈ L, if x ∈ [y, y ∨ z], then x = (x ∧ z) ∨ y.
(d) For all y, z ∈ L, the lattices L′ = [y ∧ z, z] and L″ = [y, y ∨ z] are isomorphic, via the maps
α : L′ → L″, q ↦ q ∨ y;   β : L″ → L′, p ↦ p ∧ z.
(b) =⇒ (a): Suppose that (b) holds. Let a, b, c ∈ L with a ≤ c. Then
b ∧ c ≤ a ∨ (b ∧ c) ≤ c ∨ c = c,
so a ∨ (b ∧ c) ∈ [b ∧ c, c], and applying (b) to it gives a ∨ (b ∧ c) = (a ∨ (b ∧ c) ∨ b) ∧ c = (a ∨ b) ∧ c.
(b) ⇐⇒ (c): These two conditions are duals of each other (i.e., L satisfies (b) iff L∗ satisfies (c)), and
modularity is a self-dual condition.
(b)+(c) ⇐⇒ (d): The functions α and β are always order-preserving functions with the stated domains and
ranges. Conditions (b) and (c) say respectively that β ◦ α and α ◦ β are the identities on L′ and L″; together,
these conditions are equivalent to condition (d).
Corollary 1.4.3. Let R be a (not necessarily commutative) ring and M a (left) R-module. Then the
(possibly infinite) poset L(M ) of (left) R-submodules of M , ordered by inclusion, is a modular lattice with
operations Y ∨ Z = Y + Z and Y ∧ Z = Y ∩ Z.
Proof. Checking the modular equation reduces, via Theorem 1.4.2(d), to the isomorphisms
[Y ∩ Z, Z] ≅ L(Z/(Y ∩ Z)) ≅ L((Y + Z)/Y ) ≅ [Y, Y + Z],
which follow from the Second Isomorphism Theorem Z/(Y ∩ Z) ≅ (Y + Z)/Y .
In particular, the subspace lattices Ln (q) are modular (see Example 1.2.5).
Example 1.4.4. For a (finite) group G, let L(G) denote the lattice of subgroups of G, with operations
H ∧ K = H ∩ K and H ∨ K = HK (i.e., the group generated by H ∪ K). If G is abelian then L(G) is always
modular, but if G is non-abelian then modularity can fail.
For example, let G = S4 , let X and Y be the cyclic subgroups generated by the cycles (1 2 3) and (3 4)
respectively, and let Z = A4 (the alternating group). Then (X ∨ Y ) ∧ Z = Z but X ∨ (Y ∧ Z) = X. Indeed,
these groups generate a sublattice of L(S4 ) isomorphic to N5 :
[Figure: the sublattice of L(S4 ) consisting of {Id} < ⟨(1 2 3)⟩ < A4 < S4 together with ⟨(3 4)⟩, isomorphic to N5 .]
Proof. Both =⇒ directions are easy, because distributivity and modularity are conditions inherited by
sublattices, and N5 is not modular and M5 is not distributive.
Suppose that x, y, z is a triple for which modularity fails. One can check that these elements generate a
sublattice isomorphic to N5 :
[Figure: a sublattice isomorphic to N5 , with bottom element x ∧ y, top element x ∨ y, and (x ∨ y) ∧ z among its middle elements.]
Suppose that L is not distributive. If it isn’t modular then it contains an N5 , so there is nothing to prove.
If it is modular, then choose x, y, z such that
x ∧ (y ∨ z) > (x ∧ y) ∨ (x ∧ z).
You can then show that these elements generate a sublattice isomorphic to M5 :
[Figure: a sublattice isomorphic to M5 , with elements built from x, y, z (e.g., x ∧ y ∧ z).]
A corollary is that every modular lattice is graded, because a non-graded lattice must contain a sublattice
isomorphic to N5 . The details are left to the reader; we will eventually prove the stronger statement that
every semimodular lattice is graded.
1.5 Semimodular lattices
Recall that the notation x ⋖ y means that x is covered by y, i.e., x < y and there exists no z strictly between
x and y (i.e., such that x < z < y).
Definition 1.5.1. A lattice L is (upper) semimodular if for all incomparable x, y ∈ L,
x ∧ y ⋖ y =⇒ x ⋖ x ∨ y. (1.5)
Dually, L is lower semimodular if the reverse implication holds.
Note that both upper and lower semimodularity are inherited by sublattices, and that L is upper semimodular
if and only if its dual L∗ is lower semimodular. Also, the implication (1.5) is trivially true if x and y are
comparable. If they are incomparable (as we will often assume), then there are several useful colloquial
rephrasings of semimodularity:
• “If meeting with x merely nudges y down, then joining with y merely nudges x up.”
• In the interval [x ∧ y, x ∨ y] ⊆ L pictured below, if the southeast relation is a cover, then so is the
northwest relation.
[Figure (1.6): the interval [x ∧ y, x ∨ y] drawn as a diamond with x on the left and y on the right; the southeast cover x ∧ y ⋖ y forces the northwest cover x ⋖ x ∨ y.]
• This condition is often used symmetrically: if x, y are incomparable and they both cover x ∧ y, then
they are both covered by x ∨ y.
• Contrapositively, “If there is other stuff between x and x ∨ y, then there is also other stuff between
x ∧ y and y.”
Example 1.5.2. The partition lattice Πn is an important example of an upper semimodular lattice. To see
that it is USM, let π and σ be incomparable set partitions of [n], and suppose that σ ⋗ σ ∧ π. Recall that this
means that σ ∧ π can be obtained from σ by splitting some block B ∈ σ into two sub-blocks B′, B″. More
specifically, we can write σ = A1 | · · · |Ak |B and σ ∧ π = A1 | · · · |Ak |B′|B″, where B is the disjoint union of
B′ and B″. Since σ ∧ π refines π but σ does not, we know that A1 , . . . , Ak , B′, B″ are all subsets of blocks
of π but B is not; in particular B′ and B″ are subsets of different blocks of π, say C′ and C″ respectively.
But then merging C′ and C″ produces a partition τ that covers π and is refined by σ, so it must be the case
that τ = σ ∨ π, and we have proved that Πn is USM. J
Lemma 1.5.3. If a lattice L is modular, then it is both upper and lower semimodular.
Proof. If x ∧ y ⋖ y, then the sublattice [x ∧ y, y] has only two elements. If L is modular, then condition (d)
of the characterization of modularity (Theorem 1.4.2) implies that [x ∧ y, y] ≅ [x, x ∨ y], so x ⋖ x ∨ y. Hence
L is upper semimodular. The dual argument proves that L is lower semimodular.
In fact, upper and lower semimodularity together imply modularity. We will show that any of these three
conditions on a lattice L implies that it is graded, and that its rank function r satisfies the corresponding
inequality or equality ((1.7) or (1.9) below).
Lemma 1.5.4. Let L be an upper semimodular lattice and let q, r, s ∈ L. Then
q ⋖ r =⇒ q ∨ s = r ∨ s or q ∨ s ⋖ r ∨ s. (1.8)
In other words, if it only takes one step to walk up from q to r, then it takes at most one step to walk from
q ∨ s to r ∨ s.
Proof. Let p = (q ∨ s) ∧ r. Then q ≤ p ≤ r, so p = q or p = r because q ⋖ r.
• If p = r, then q ∨ s ≥ r. So q ∨ s = r ∨ (q ∨ s) = (r ∨ q) ∨ s = r ∨ s.
• If p = q, then p = (q ∨ s) ∧ r = q ⋖ r. Applying semimodularity to the pair q ∨ s and r (whose meet
is p = q), we obtain (q ∨ s) ⋖ (q ∨ s) ∨ r = r ∨ s.
[Figure: the diamond with bottom p = (q ∨ s) ∧ r = q, sides q ∨ s and r, and top r ∨ s.]
Theorem 1.5.5. Let L be a finite lattice. Then L is USM if and only if it is ranked, with rank function r
satisfying the submodular inequality or semimodular inequality
r(x ∨ y) + r(x ∧ y) ≤ r(x) + r(y) ∀x, y ∈ L. (1.7)
Proof. ( ⇐= ) Suppose that L is a ranked lattice with rank function r satisfying (1.7). Suppose that x, y are
incomparable and x ∧ y ⋖ y, so that r(y) = r(x ∧ y) + 1. Incomparability implies x ∨ y > x, so r(x ∨ y) − r(x) > 0.
On the other hand, rearranging (1.7) gives
0 < r(x ∨ y) − r(x) ≤ r(y) − r(x ∧ y) = 1
so r(x ∨ y) − r(x) = 1, i.e., x ∨ y ⋗ x.
( =⇒ ) Now suppose that L is USM; we first show that L is ranked, by induction on the length5 of the
longest saturated chain in L. Let X : 0̂ = x0 ⋖ x1 ⋖ · · · ⋖ xn = 1̂ be a saturated chain of maximum length n,
and let Y : 0̂ = y0 ⋖ y1 ⋖ · · · ⋖ ym = 1̂ be any other saturated chain; we must show that m = n.
Let L′ = [x1 , 1̂] and L″ = [y1 , 1̂]. (See Figure 1.5.) By induction, these sublattices are both ranked.
Moreover, the longest saturated chain in L′ has length n − 1. If x1 = y1 then Y and X are both saturated
chains in the ranked lattice L′ and we are done, so suppose that x1 ≠ y1 . Let z2 = x1 ∨ y1 . By (1.8), z2
covers both x1 and y1 . Let z2 , z3 , . . . , 1̂ be a saturated chain in L (thus, in L′ ∩ L″).
Since L′ is ranked and z2 ⋗ x1 , the chain z2 , . . . , 1̂ has length n − 2. So the chain y1 , z2 , . . . , 1̂ has length n − 1.
On the other hand, L″ is ranked and y1 , y2 , . . . , 1̂ is a saturated chain, so it also has length n − 1. Therefore
the chain 0̂, y1 , . . . , 1̂ has length n, as desired.
Second, we show that the rank function r of L satisfies (1.7). Let x, y ∈ L and take a saturated chain
x ∧ y = c0 ⋖ c1 ⋖ · · · ⋖ cn−1 ⋖ cn = x.
5 Recall that the length of a saturated chain is the number of covering relations in it, which is one less than its cardinality as a set.
[Figure 1.5: the saturated chains x1 , x2 , . . . , 1̂ and y1 , y2 , . . . , 1̂ from the atoms x1 ≠ y1 up to 1̂, together with the chain z2 , z3 , . . . , 1̂ starting at z2 = x1 ∨ y1 ; the sublattices L′ = [x1 , 1̂] and L″ = [y1 , 1̂] are indicated.]
Joining each element of this chain with y gives the (multi)chain
y = c0 ∨ y ≤ c1 ∨ y ≤ · · · ≤ cn ∨ y = x ∨ y.
By Lemma 1.5.4, each ≤ in this chain is either an equality or a covering relation. Therefore, the distinct
elements ci ∨ y form a saturated chain from y to x ∨ y, whose length must be ≤ n. Hence
r(x ∨ y) − r(y) ≤ n = r(x) − r(x ∧ y),
which rearranges to (1.7).
The same argument shows that L is lower semimodular if and only if it is ranked, with a rank function
satisfying the reverse inequality of (1.7).
Theorem 1.5.6. L is modular if and only if it is ranked, with rank function r satisfying the modular
equality
r(x ∨ y) + r(x ∧ y) = r(x) + r(y) ∀x, y ∈ L. (1.9)
Proof. If L is modular, then it is both upper and lower semimodular, so the conclusion follows by Theo-
rem 1.5.5. On the other hand, suppose that L is a lattice whose rank function r satisfies (1.9). Let x ≤ z ∈ L.
We already know that x ∨ (y ∧ z) ≤ (x ∨ y) ∧ z, so it suffices to show that these two elements have the same
rank. Indeed, since x ∧ (y ∧ z) = x ∧ y and (x ∨ y) ∨ z = y ∨ z (both because x ≤ z), two applications of (1.9) give
r(x ∨ (y ∧ z)) = r(x) + r(y ∧ z) − r(x ∧ y) = r(x) + r(y) + r(z) − r(y ∨ z) − r(x ∧ y)
and
r((x ∨ y) ∧ z) = r(x ∨ y) + r(z) − r(y ∨ z) = r(x) + r(y) − r(x ∧ y) + r(z) − r(y ∨ z),
which are equal.
1.6 Geometric lattices
The following construction gives the prototype of a geometric lattice. Let k be a field, let V be a vector
space over k, and let E be a finite subset of V (with repeated elements allowed). Say that a flat is a subset
of E of the form W ∩ E, where W ⊆ V is a vector subspace. Define the vector lattice of E as
L(E) = {W ∩ E : W ⊆ V is a vector subspace}, (1.11)
ordered by inclusion; this poset is isomorphic to the family of vector subspaces of V generated by subsets
of E. (Of course, different subspaces of V can have the same intersection with E, and different subsets of E
can span the same vector space.) The poset L(E) is easily checked to be a lattice under the operations
(W ∩ E) ∧ (X ∩ E) = (W ∩ X) ∩ E,   (W ∩ E) ∨ (X ∩ E) = (W + X) ∩ E.
The elements of L(E) are called flats. Certainly E = V ∩ E is a flat, hence the top element of L(E). The
bottom element is O ∩ E, where O ⊆ V is the zero subspace; thus O ∩ E consists of the copies of the zero
vector in E.
The tricky thing about the isomorphism (1.11) is that it is not so obvious which subsets of E are flats. For
every A ⊆ E, there is a unique minimal flat containing A, namely Ā := kA ∩ E — that is, the set of elements
of E in the linear span of A. On the other hand, if v, w, x ∈ E with v + w = x, then {v, w} is not a flat,
because any vector subspace that contains both v and w must also contain x. So, an equivalent definition
of “flat” is that A ⊆ E is a flat if no vector in E \ A is in the linear span of the vectors in A.
The lattice L(E) is submodular, with rank function r(A) = dim kA. (Exercise: Check that r satisfies the
submodular inequality.) It is not in general modular; e.g., see Example 1.6.3 below. On the other hand,
L(E) is always an atomic lattice: every element is the join of atoms. This is a consequence of the simple
fact that k⟨v1 , . . . , vk⟩ = kv1 + · · · + kvk . This motivates the following definition:
Definition 1.6.1. A lattice L is geometric if it is (upper) semimodular and atomic. If L ≅ L(E) for some
set of vectors E, we say that E is a (linear) representation of L.
For example, the set E = {(0, 1), (1, 0), (1, 1)} ⊆ F_2^2 is a linear representation of the geometric lattice M5 .
(For that matter, so is any set of three nonzero vectors in a two-dimensional space over any field, provided
none is a scalar multiple of another.)
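For a representation over F_2 , the rank function r(A) = dim kA can be computed by Gaussian elimination on bitmasks, and the submodular inequality checked directly; a small sketch using the three vectors above (the helper name is ours):

def f2_rank(vectors):
    """Rank over F_2 of vectors encoded as bitmasks (incremental xor-basis)."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)   # reduce v against the current basis
        if v:
            basis.append(v)
            basis.sort(reverse=True)
    return len(basis)

# E = {(0,1), (1,0), (1,1)} in F_2^2, encoded as the bitmasks 0b01, 0b10, 0b11.
A, B = [0b01], [0b10, 0b11]
r = f2_rank
# Submodular inequality for r(A) = dim kA:  r(A ∪ B) + r(A ∩ B) <= r(A) + r(B).
print(r(A + B) + r([]) <= r(A) + r(B))  # True: 2 + 0 <= 1 + 2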
Similarly, for a finite set of points E in a vector space V , define
Laff(E) = {W ∩ E : W ⊆ V is an affine subspace}.
(An affine subspace of V is a translate of a vector subspace; for example, a line or plane not necessarily
containing the origin.) In fact, any lattice of the form Laff (E) can be expressed in the form L(Ê), where Ê is
a certain point set constructed from E (homework problem). However, the dimension of the affine span of a
set A ⊆ E is one less than its rank — which means that we can draw geometric lattices of rank 3 conveniently
as planar point configurations. If L ≅ Laff (E), we could say that E is an (affine) representation of L.
Example 1.6.2. Let E = {a, b, c, d}, where a, b, c are collinear but no other set of three points is. Then
Laff (E) is the lattice shown below (which happens to be modular).
[Figure: the affine point configuration E = {a, b, c, d}, with a, b, c collinear (left), and the Hasse diagram of Laff (E) (right): ∅; a, b, c, d; abc, ad, bd, cd; abcd.]
Example 1.6.3. If E is the point configuration on the left with the only collinear triples {a, b, c} and
{a, d, e}, then Laff (E) is the lattice on the right.
[Figure: a point configuration E = {a, b, c, d, e} whose only collinear triples are {a, b, c} and {a, d, e} (left), and the Hasse diagram of Laff (E) (right): ∅; a, b, c, d, e; abc, bd, be, cd, ce, ade; abcde.]
This lattice is not modular: consider the two elements bd and ce. J
Example 1.6.4. Recall from Example 1.5.2 that the partition lattice Πn is USM for all n. In fact it is
geometric. To see that it is atomic, observe that the atoms are the set partitions with n−1 blocks, necessarily
one doubleton block and n − 2 singletons; let πij denote the atom whose doubleton block is {i, j}. Then
every set partition σ is the join of the set {πij : i ∼σ j}.
In fact, Πn is a vector lattice. Let k be any field, let {e1 , . . . , en } be the standard basis of V = k^n , let
p_ij = e_i − e_j for all 1 ≤ i < j ≤ n, and let E be the set of all such vectors p_ij . Then in fact Πn ≅ L(E). The
atoms π_ij of Πn correspond to the atoms k⟨p_ij⟩ of L(E); the rest of the isomorphism is left as Exercise 1.17.
Note that this construction works over any field k.
More generally, if G is any simple graph on vertex set [n] then the connectivity lattice K(G) is isomorphic
to L(EG ), where EG = {p_ij : ij is an edge of G}. J
1.7 Exercises
Posets
Exercise 1.1. (a) Prove that every nonempty interval in a Boolean algebra is itself isomorphic to a
Boolean algebra.
(b) Prove that every interval in the subspace lattice Ln (q) is isomorphic to a subspace lattice.
(c) Prove that every interval in the partition lattice Πn is isomorphic to a product of partition lattices.
(The product of posets P1 , . . . , Pk is the Cartesian product P1 × · · · × Pk , equipped with the partial
order (x1 , . . . , xk ) ≤ (y1 , . . . , yk ) if xi ≤Pi yi for all i ∈ [k].)
Exercise 1.2. A directed acyclic graph (or DAG) is a pair G = (V, E), where V is a set of vertices; E
is a set of edges, each of which is an ordered pair of distinct vertices; and E contains no directed cycles, i.e.,
no subsets of the form {(v1 , v2 ), (v2 , v3 ), . . . , (vn−1 , vn ), (vn , v1 )} for any v1 , . . . , vn ∈ V .
(a) Let P be a poset with order relation <. Let E = {(v, w) : v, w ∈ P, v < w}. Prove that the pair
(P, E) is a DAG.
(b) Let G = (V, E) be a DAG. Define a relation < on V by setting v < w iff there is some directed path
from v to w in G, i.e., iff E has a subset of the form {(v = v1 , v2 ), (v2 , v3 ), . . . , (vn−1 , vn = w)} with
all vi distinct. Prove that this relation makes V into a poset.
(This problem is purely a technical exercise and is almost tautological, but it does show that posets and
DAGs are essentially the same thing.)
Exercise 1.3. Recall from Definition 1.1.9 that L (P ) means the set of linear extensions of a poset P .
(a) Let P and Q be posets. Describe L (P + Q) and L (P ⊕ Q) in terms of L (P ) and L (Q). (Hint:
Start by working out some small examples explicitly. The problem is nontrivial even when P and Q
are both chains of length 1.)
(b) Give a concrete combinatorial description of L (Booln ).
Exercise 1.4. Let n be a positive integer. Let Dn be the set of all positive-integer divisors of n (including n
itself), partially ordered by divisibility.
(a) Prove that Dn is a ranked poset, and describe the rank function.
(b) For which values of n is Dn (i) a chain; (ii) a Boolean algebra? For which values of n, m is it the case
that Dn ≅ Dm ?
(c) Prove that Dn is a distributive lattice. Describe its meet and join operations and its join-irreducible
elements.
(d) Prove that Dn is self-dual, i.e., there is a bijection f : Dn → Dn such that f (x) ≤ f (y) if and only if
x ≥ y.
Exercise 1.5. Let G be a graph on vertex set V = [n]. Recall from Example 1.2.3 that the connectivity
lattice of a graph is the subposet K(G) of Πn consisting of set partitions in which every block induces a
connected subgraph of G. Prove that K(G) is a lattice. Is it a sublattice of Πn ?
Exercise 1.6. Let A be a finite family of sets. For A′ ⊆ A, define ∪A′ = ⋃_{A ∈ A′} A. Let
U(A) = {∪A′ : A′ ⊆ A}, considered as a poset ordered by inclusion.
(a) Prove that U (A) is a lattice. (Hint: Don’t try to specify the meet operation explicitly.)
(b) Construct a set family A such that U (A) is isomorphic to the weak Bruhat order on S3 (see Example 1.2.13).
(c) Construct a set family A such that U (A) is not ranked.
(d) Is every finite lattice of this form?
Exercise 1.7. For 1 ≤ i ≤ n − 1, let si be the transposition in Sn that swaps i with i + 1. (The si are
called elementary transpositions.) You probably know that {s1 , . . . , sn−1 } is a generating set for Sn (and if
you don’t, you will shortly prove it). For w ∈ Sn , an expression w = si1 · · · sik is called a reduced word if
there is no way to express w as a product of fewer than k generators.
(a) Show that every reduced word for w has length equal to inv(w). (For the definition of inv(w), see
Example 1.2.13.)
(b) Define a partial order ≺ on Sn as follows: w ≺ v if there exists a reduced word s_{i_1} · · · s_{i_k} for v such
that w is the product of some subword, w = s_{i_{j_1}} · · · s_{i_{j_ℓ}} with 1 ≤ j_1 < · · · < j_ℓ ≤ k. (Sorry about the
triple subscripts; this just means that w is obtained by deleting some of the letters from the reduced
word for v.) Prove that ≺ is precisely Bruhat order on Sn .
Exercise 1.8. Prove that the rank-generating functions of weak order and Bruhat order on Sn are both
\sum_{w \in S_n} q^{r(w)} = \prod_{i=1}^{n} \frac{1-q^i}{1-q},
where r(w) = #{{i, j} : i < j and wi > wj }. (Hint: Induct on n, and use one-line notation for permutations,
not cycle notation.)
Distributive lattices
Exercise 1.9. Prove that the two formulations (1.2a) and (1.2b) of distributivity of a lattice L are equivalent,
i.e.,
x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z) ∀x, y, z ∈ L ⇐⇒ x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z) ∀x, y, z ∈ L.
Exercise 1.10. In Exercise 1.4 you proved that the divisor lattice Dn is distributive. Characterize all posets
P such that J(P ) ≅ Dn for some n ∈ N. (In other words, prove a statement of the form “A distributive
lattice L = J(P ) is isomorphic to a divisor lattice if and only if the poset P = Irr(L) is ________.”)
Exercise 1.11. Let L be a finite lattice and x ∈ L. Prove that x is join-irreducible if it covers exactly one
other element. What weaker conditions than “finite” suffice?
Exercise 1.12. Let Y be Young’s lattice (which we know is distributive).
Exercise 1.13.
(a) For a finite distributive lattice L, show that the map φ : L → J(Irr(L)) given by
φ(x) = ⟨p : p ∈ Irr(L), p ≤ x⟩
is an isomorphism.
(b) For a finite poset P , show that an order ideal in P is join-irreducible in J(P ) if and only if it is principal
(i.e., generated by a single element).
Exercise 1.14. Let L be a sublattice of Booln that is accessible: if S ∈ L \ {∅} then there exists some
x ∈ S such that S \ {x} ∈ L. Construct a poset P on [n] such that J(P ) = L. (Notice that I wrote “= L”,
not “≅ L.” It is not enough to invoke Birkhoff’s theorem to say that such a P must exist! The point is to
explicitly construct a poset P on [n] whose order ideals are the sets in L.)
Modular lattices
Exercise 1.15. Let Ln (q) be the poset of subspaces of an n-dimensional vector space over the finite field
Fq (so Ln (q) is a modular lattice by Corollary 1.4.3).
(a) Prove directly from the definition of modularity that Ln (q) is modular. (I.e., verify algebraically that
the join and meet operations obey the modular equation (1.4).)
(b) Calculate the rank-generating function
\sum_{V \in L_n(q)} x^{\dim V} = \sum_{k=0}^{n} x^k \, \#\{V \in L_n(q) : \dim V = k\}.
Hint: Every vector space of dimension k is determined by an ordered basis v1 , . . . , vk . How many
ordered bases does each k-dimensional vector space V ∈ Ln (q) have? How many sequences of vectors
in Fnq are ordered bases for some k-dimensional subspace?
(c) Count the maximal chains in Ln (q).
Exercise 1.16. Verify that the lattice Π4 is not modular.
Exercise 1.17. Prove that the lattices Πn and L(E) are isomorphic, where E is the vector set described in
Example 1.6.4. To do this, you need to characterize the vector spaces spanned by subsets A ⊆ E and show
that they are in bijection with set partitions. Hint: It may be useful to look at the orthogonal complements
of those vector spaces under the standard inner product on kn .
Exercise 1.18. The purpose of this exercise is to show that the constructions L and Laff produce the same
class of lattices. Let k be a field and let E = {e1 , . . . , en } ⊆ kd .
(a) The augmentation of a vector ei = (ei1 , . . . , eid ) is the vector ẽi = (1, ei1 , . . . , eid ) ∈ kd+1 . Prove that
Laff (E) = L(Ẽ), where Ẽ = {ẽ1 , . . . , ẽn }.
(b) Let H ⊆ k^d be a generic affine hyperplane, let êi be the projection of ei onto H, and let Ê = {ê1 , . . . , ên }.
Prove that L(E) = Laff (Ê).
(The first part is figuring out what “generic” means. A generic hyperplane might not exist for all fields,
but if k is infinite then almost all hyperplanes are generic.)
Exercise 1.19. Recall from Corollary 1.3.9 that a lattice L is relatively complemented if, whenever y ∈
[x, z] ⊆ L, there exists u ∈ [x, z] such that y ∧ u = x and y ∨ u = z. Prove that a finite semimodular lattice
is atomic (hence geometric) if and only if it is relatively complemented.
(Here is the geometric interpretation of being relatively complemented. Suppose that V is a vector space,
L = L(E) for some point set E ⊆ V , and that X ⊆ Y ⊆ Z ⊆ V are vector subspaces spanned by flats of
31
L(E). For starters, consider the case that X = O. Then we can choose a basis B of the space Y and extend
it to a basis B′ of Z, and the vector set B′ \ B spans a subspace of Z that is complementary to Y . More
generally, if X is any subspace, we can choose a basis B for X, extend it to a basis B′ of Y , and extend B′
to a basis B″ of Z. Then B ∪ (B″ \ B′) spans a subspace U ⊆ Z that is relatively complementary to Y , i.e.,
U ∩ Y = X and U + Y = Z.)
Chapter 2
Poset Algebra
Throughout this chapter, every poset we consider will be assumed to be locally finite, i.e., every interval
is finite.
2.1 The incidence algebra of a poset

Let P be a poset and let Int(P) denote the set of (nonempty) intervals of P. Recall that an interval is a
subset of P of the form [x, y] := {z ∈ P : x ≤ z ≤ y}; if x ≰ y then [x, y] = ∅.
Definition 2.1.1. The incidence algebra I(P ) is the set of functions α : Int(P ) → C (“incidence func-
tions”)1 , made into a C-vector space with pointwise addition, subtraction and scalar multiplication, and
equipped with the convolution product:

    (α ∗ β)(x, y) = Σ_{z ∈ [x,y]} α(x, z) β(z, y).

Here we abbreviate α([x, y]) by α(x, y), and it is often convenient to set α(x, y) = 0 if x ≰ y. Note that
the assumption of local finiteness is both necessary and sufficient for convolution to be well-defined for all
incidence functions.
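To make the convolution product concrete, here is a minimal Python sketch (an editorial illustration; the example poset and all names are my own choices) of the incidence algebra of a small explicit poset.

```python
from itertools import combinations

# A finite poset: Bool_2, the subsets of {1, 2} ordered by containment.
elements = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]
leq = lambda x, y: x <= y  # containment order on frozensets

def interval(x, y):
    """The interval [x, y] = {z : x <= z <= y}."""
    return [z for z in elements if leq(x, z) and leq(z, y)]

def convolve(alpha, beta):
    """(alpha * beta)(x, y) = sum over z in [x, y] of alpha(x, z) * beta(z, y)."""
    def gamma(x, y):
        return sum(alpha(x, z) * beta(z, y) for z in interval(x, y))
    return gamma

delta = lambda x, y: 1 if x == y else 0    # Kronecker delta
zeta = lambda x, y: 1 if leq(x, y) else 0  # zeta function (see below)

# zeta * zeta counts the elements of [x, y]: 4 on the whole of Bool_2.
zeta2 = convolve(zeta, zeta)
print(zeta2(elements[0], elements[3]))  # 4
```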
Proposition 2.1.2. Convolution is associative, making I(P) an associative C-algebra.

Proof. The basic idea is to reverse the order of summation:
    [(α ∗ β) ∗ γ](x, y) = Σ_{z ∈ [x,y]} (α ∗ β)(x, z) · γ(z, y)
                        = Σ_{z ∈ [x,y]} ( Σ_{w ∈ [x,z]} α(x, w) β(w, z) ) γ(z, y)
                        = Σ_{w,z: x ≤ w ≤ z ≤ y} α(x, w) β(w, z) γ(z, y)
                        = Σ_{w ∈ [x,y]} α(x, w) ( Σ_{z ∈ [w,y]} β(w, z) γ(z, y) )
                        = Σ_{w ∈ [x,y]} α(x, w) · (β ∗ γ)(w, y)
                        = [α ∗ (β ∗ γ)](x, y).
The multiplicative identity of I(P) is the Kronecker delta function, regarded as an incidence function:

    δ(x, y) = 1 if x = y, and δ(x, y) = 0 if x ≠ y.
Proposition 2.1.3. An incidence function α ∈ I(P) has a (two-sided) convolution inverse β = α^{−1} if and
only if α(x, x) ≠ 0 for all x ∈ P. In that case the inverse is given recursively by

    β(x, x) = 1/α(x, x),   and   β(x, y) = −α(y, y)^{−1} Σ_{z: x ≤ z < y} β(x, z) α(z, y)  for x < y.   (2.1)

This formula is well-defined by induction on the size of [x, y], with the cases x = y and x ≠ y serving as the
base case and inductive step respectively.
Proof. Let β be a left convolution inverse of α. In particular, α(x, x) = β(x, x)^{−1} for all x (use the equation
(β ∗ α)(x, x) = δ(x, x) = 1), so the nonzero condition is necessary. On the other hand, if x < y, then

    (β ∗ α)(x, y) = Σ_{z ∈ [x,y]} β(x, z) α(z, y) = δ(x, y) = 0,

and solving for β(x, y) gives the formula (2.1) (pull the term β(x, y)α(y, y) out of the sum), which is well-
defined provided that α(y, y) ≠ 0. So the nonzero condition is also sufficient. A similar argument shows that
the nonzero condition is necessary and sufficient for α to have a right convolution inverse. Moreover, the left
and right inverses coincide: if β ∗ α = δ = α ∗ γ, then β = β ∗ δ = β ∗ α ∗ γ = δ ∗ γ = γ by associativity.
Two incidence functions deserve special names. The zeta function of P is defined by ζ(x, y) = 1 for every
interval [x, y] ∈ Int(P), and the eta function is η = ζ − δ, i.e., η(x, y) = 1 if x < y and η(x, y) = 0 if x = y.
These trivial-looking incidence functions are useful because their convolution powers count important things,
namely multichains and chains in P. In other words, enumerative questions about posets can be expressed
algebraically. Specifically,
    ζ²(x, y) = Σ_{z ∈ [x,y]} ζ(x, z) ζ(z, y) = Σ_{z ∈ [x,y]} 1 = #{z : x ≤ z ≤ y},
    ζ³(x, y) = Σ_{z ∈ [x,y]} Σ_{w ∈ [z,y]} ζ(x, z) ζ(z, w) ζ(w, y) = Σ_{x ≤ z ≤ w ≤ y} 1 = #{(z, w) : x ≤ z ≤ w ≤ y},

and in general

    ζ^k(x, y) = #{(x_1, . . . , x_{k−1}) : x ≤ x_1 ≤ x_2 ≤ · · · ≤ x_{k−1} ≤ y}.
That is, ζ^k(x, y) counts the number of multichains of length k between x and y (chains with possible
repeats). If we replace ζ with η, then the calculations all work the same way, except that all the ≤'s are
replaced with <'s, so we get

    η^k(x, y) = #{(x_1, . . . , x_{k−1}) : x < x_1 < x_2 < · · · < x_{k−1} < y},

the number of chains of length k (not necessarily saturated) between x and y. In particular, if the chains
of P are bounded in length, then η^n = 0 for n ≫ 0.
Direct products of posets play nicely with the incidence algebra construction. Specifically, let P, Q be
bounded finite posets. For α ∈ I(P) and φ ∈ I(Q), define αφ ∈ I(P × Q) by

    αφ[(x, x′), (y, y′)] = α(x, y) φ(x′, y′).

This defines a linear transformation F : I(P) ⊗ I(Q) → I(P × Q). In other words, (α + β)φ = αφ + βφ, and
α(φ + ψ) = αφ + αψ, and α(cφ) = (cα)φ = c(αφ) for all c ∈ C. It is actually a vector space isomorphism,
because there is a bijection Int(P) × Int(Q) → Int(P × Q) given by (I, J) ↦ I × J, and F(χ_I ⊗ χ_J) = χ_{I×J}
(where χ_I is the characteristic function of I, i.e., the incidence function that is 1 on I and zero on other
intervals). In fact, more is true:
Proposition 2.1.4. The map F just defined is a ring isomorphism. That is, for all α, β ∈ I(P ) and
φ, ψ ∈ I(Q),
αφ ∗ βψ = (α ∗ β)(φ ∗ ψ).
Furthermore, the incidence functions δ and ζ are multiplicative on direct products, i.e.,
δP ×Q = δP δQ and ζP ×Q = ζP ζQ .
2.2 The Möbius function
The Möbius function µ_P of a poset P is defined as the convolution inverse of its zeta function: µ_P = ζ_P^{−1}.
This turns out to be one of the most important incidence functions on a poset. For a bounded poset, we
abbreviate µ_P(x) = µ_P(0̂, x) and µ(P) = µ_P(0̂, 1̂). Proposition 2.1.3 provides a recursive formula for µ:

    µ(x, y) = 0                                if y ≱ x (i.e., if [x, y] = ∅),
    µ(x, y) = 1                                if y = x,                         (2.2)
    µ(x, y) = −Σ_{z: x ≤ z < y} µ(x, z)        if x < y.
This is equivalent to the familiar recursive formula: to find µP (x), add up the values of µP at all elements
< x, then change the sign.
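The recursion (2.2) translates directly into code. Here is a short editorial sketch (not from the original notes; the names are ad hoc) computing µ(0̂, x) in Bool_3; the answer agrees with the formula µ(A, B) = (−1)^{|B\A|} derived later in this section.

```python
from itertools import combinations

# Elements of Bool_3 as frozensets, ordered by containment.
S = {1, 2, 3}
elements = [frozenset(c) for k in range(4) for c in combinations(S, k)]

def mobius(x, y, memo={}):
    """mu(x, y) via the recursion (2.2): mu(x, x) = 1 and, for x < y,
    mu(x, y) = -sum of mu(x, z) over all z with x <= z < y."""
    if not x <= y:
        return 0
    if x == y:
        return 1
    if (x, y) not in memo:
        memo[(x, y)] = -sum(mobius(x, z) for z in elements if x <= z and z < y)
    return memo[(x, y)]

print(mobius(frozenset(), frozenset(S)))  # -1, i.e. (-1)^{|S|}
```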
Example 2.2.1. If P = {0 < 1 < 2 < · · · } is a chain, then its Möbius function is given by µ(x, x) = 1,
µ(x, x + 1) = −1, and µ(x, y) = 0 otherwise. J
Example 2.2.2. Here are the Möbius functions µ_P(x) = µ_P(0̂, x) for the lattices N5 and M5:

    [Hasse diagrams: in N5, with chains 0̂ < a < b < 1̂ and 0̂ < c < 1̂, the values are µ(0̂) = 1,
    µ(a) = µ(c) = −1, µ(b) = 0, and µ(1̂) = 1; in M5 the values are µ(0̂) = 1, µ = −1 at each of
    the three atoms, and µ(1̂) = 2.]

And here are the Boolean lattice Bool3 and the divisor lattice D24:

    [Hasse diagrams: in Bool3 the values are 1 at 0̂, −1 at each atom, 1 at each coatom, and −1
    at 1̂; in D24 the values are µ(1) = 1, µ(2) = µ(3) = −1, µ(6) = 1, and µ(4) = µ(8) = µ(12) =
    µ(24) = 0.]
Example 2.2.3 (Möbius functions of partition lattices). What is µ(Π_n) in terms of n? Clearly µ(Π_1) = 1
and µ(Π_2) = −1, and µ(Π_3) = µ(M5) = 2. For n = 4, we calculate µ(Π_4) from (2.2). The value of µ_{Π_4}(0̂, π)
depends only on the block sizes of π; in fact, if π has blocks of sizes π_1, . . . , π_k, then [0̂, π] ≅ Π_{π_1} × · · · × Π_{π_k}.
We will use the fact that the Möbius function is multiplicative on direct products; we will prove this shortly
(Prop. 2.2.5).
    Block sizes | Number of π's | Isomorphism type of [0̂, π] | µ(0̂, π)
    1,1,1,1     | 1             | Π_1                        | 1
    2,1,1       | 6             | Π_2                        | −1
    2,2         | 3             | Π_2 × Π_2                  | 1
    3,1         | 4             | Π_3                        | 2
Summing the last column with multiplicities given by the second column, and multiplying by −1, gives
µ(Π_4) = −(1 − 6 + 3 + 8) = −6. At this point you might guess that µ(Π_n) = (−1)^{n−1}(n − 1)!, and you
would be right. We will prove this soon. J
The Möbius function is useful in many ways. It can be used to formulate a more general version of inclusion-
exclusion called Möbius inversion. It behaves nicely under poset operations such as product, and has
geometric and topological applications. Even just the single number µ(P ) = µP (0̂, 1̂) tells you a lot about a
bounded poset P . Confusingly, this number itself is sometimes called the “Möbius function” of P (a better
term would be “Möbius number”). Here is the reason.
Definition 2.2.4. A family F of posets is hereditary if, for each P ∈ F , every interval in P is isomorphic
to some [other] poset in F . It is semi-hereditary if every interval in a member of F is isomorphic to a
product of members of F .
For example, the families of Boolean lattices, divisor lattices, and subspace lattices are all hereditary, and the
family of partition lattices is semi-hereditary. Knowing the Möbius number for every poset in a hereditary
family is equivalent to knowing their full Möbius functions. The same is true for semi-hereditary families,
for the following reason.
Proposition 2.2.5. The Möbius function is multiplicative on direct products, i.e., µP ×Q = µP µQ (in the
notation of Proposition 2.1.4).
Proof.

    ζ_{P×Q} ∗ µ_P µ_Q = ζ_P ζ_Q ∗ µ_P µ_Q = (ζ_P ∗ µ_P)(ζ_Q ∗ µ_Q) = δ_P δ_Q = δ_{P×Q},

which says that µ_P µ_Q = ζ_{P×Q}^{−1} = µ_{P×Q}. (It is also possible to prove that µ_P µ_Q = µ_{P×Q} directly from the
definition.)
Since µ(Bool1 ) = −1 and Booln is a product of n copies of Bool1 , an immediate consequence of Proposi-
tion 2.2.5 is the formula
µ(Booln ) = (−1)n .
This can also be proved by induction on n (with the cases n = 0 and n = 1 easy). If n > 0, then

    µ(Bool_n) = −Σ_{A ⊊ [n]} (−1)^{|A|} = −Σ_{k=0}^{n−1} \binom{n}{k} (−1)^k   (by induction)
              = (−1)^n − Σ_{k=0}^{n} \binom{n}{k} (−1)^k
              = (−1)^n − (1 − 1)^n = (−1)^n.
In particular, the full Möbius function of the Boolean algebra Bool_S is given by µ(A, B) = µ(Bool_{|B\A|}) =
(−1)^{|B\A|} for all A ⊆ B ⊆ S.
Example 2.2.6. Let P be a product of k chains of lengths a_1, . . . , a_k. Equivalently,

    P = {x = (x_1, . . . , x_k) ∈ N^k : 0 ≤ x_i ≤ a_i for all i},

ordered by x ≤ y iff x_i ≤ y_i for all i. (Recall that the length of a chain is the number of covering relations,
which is one less than the number of elements; see Definition 1.1.6.) Then Prop. 2.2.5 together with the
formula for the Möbius function of a chain (above) gives

    µ(0̂, x) = 0 if x_i ≥ 2 for at least one i;  µ(0̂, x) = (−1)^s if x consists of s 1's and k − s 0's.
(The Boolean algebra is the special case that a_i = 1 for every i.) This conforms to the definition of Möbius
function that you may have seen in enumerative combinatorics or number theory, since products of chains
are precisely divisor lattices. As mentioned above, the family of divisor lattices is hereditary: [a, b] ≅ D_{b/a}
for all a, b ∈ D_n with a|b. J
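As an editorial sketch of this last point (not part of the original notes), the following Python computes µ(1, n) in D_n straight from the recursion (2.2) and reproduces the number-theoretic Möbius function.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def mu_divisor_lattice(a, b):
    """mu(a, b) in the divisor lattice, via the recursion (2.2); note [a, b] = D_{b/a}."""
    if b % a != 0:
        return 0
    if a == b:
        return 1
    return -sum(mu_divisor_lattice(a, d) for d in divisors(b) if d % a == 0 and d < b)

# Number-theoretic Mobius: (-1)^s if n is a product of s distinct primes, else 0.
for n in [1, 2, 6, 12, 30]:
    print(n, mu_divisor_lattice(1, n))
# Output: 1 1 / 2 -1 / 6 1 / 12 0 / 30 -1
```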
Proposition 2.2.7 (Philip Hall's theorem). Let P be a finite bounded poset, and for k ≥ 0 let c_k = c_k(P)
denote the number of chains 0̂ = x_0 < x_1 < · · · < x_k = 1̂ of length k. Then µ(P) = Σ_k (−1)^k c_k.

Proof. Recall that c_k = η^k(0̂, 1̂) = (ζ − δ)^k(0̂, 1̂). The trick is to use the geometric series expansion
1/(1 + h) = 1 − h + h² − h³ + h⁴ − · · · . Replacing h with η and 1 with δ, we get

    (δ + η)^{−1} = Σ_{k=0}^{∞} (−1)^k η^k.
Despite looking like an infinite power series, this is actually a valid polynomial equation in I(P ), because
η^k = 0 for k sufficiently large. Evaluating both sides on [0̂, 1̂] gives

    Σ_{k=0}^{∞} (−1)^k c_k = Σ_{k=0}^{∞} (−1)^k η^k(0̂, 1̂) = (δ + η)^{−1}(0̂, 1̂) = ζ^{−1}(0̂, 1̂) = µ(0̂, 1̂).
This alternating sum looks like an Euler characteristic (see (6.2) below). In fact it is.
Corollary 2.2.8. Let P be a finite bounded poset with at least two elements, and let ∆(P) be its order
complex, i.e., the simplicial complex (see Example 1.1.11) whose vertices are the elements of P \ {0̂, 1̂} and
whose simplices are chains. Each chain 0̂ = x_0 < x_1 < · · · < x_k = 1̂ gives rise to a simplex {x_1, . . . , x_{k−1}}
of ∆(P) of dimension k − 2. Hence f_{k−2}(∆(P)) = c_k(P) for all k ≥ 1, and the reduced Euler characteristic
of ∆(P) is

    χ̃(∆(P)) := Σ_{k ≥ −1} (−1)^k f_k(∆(P)) = Σ_{k ≥ 1} (−1)^{k−2} c_k(P) = µ_P(0̂, 1̂).
Example 2.2.9. For the Boolean algebra P = Bool3 (see Example 2.2.2), we have c0 = 0, c1 = 1, c2 = 6,
c3 = 6, and ck = 0 for k > 3. So c0 − c1 + c2 − c3 = −1 = µP (0̂, 1̂). J
Corollary 2.2.10. If P is a finite bounded poset, then µ(P) = µ(P*), where P* denotes the dual poset.

Proof. This is immediate from Philip Hall's theorem, since c_k(P) = c_k(P*) for all k. (One can also prove
this fact by comparing the algebras I(P) and I(P*); see Exercise 2.3.)

2.3 Möbius inversion and the characteristic polynomial

The following result is one of the most frequent applications of the Möbius function.
Theorem 2.3.1 (Möbius inversion formula). Let P be a locally finite³ poset, let V be any C-vector
space (usually, but not always, C itself) and let f, g : P → V. Then

    g(x) = Σ_{y: y ≤ x} f(y)  ∀x ∈ P   ⟺   f(x) = Σ_{y: y ≤ x} µ(y, x) g(y)  ∀x ∈ P,   (2.3a)
    g(x) = Σ_{y: y ≥ x} f(y)  ∀x ∈ P   ⟺   f(x) = Σ_{y: y ≥ x} µ(x, y) g(y)  ∀x ∈ P.   (2.3b)
Proof. Stanley calls the proof "a trivial observation in linear algebra." Let V be the vector space of
functions f : P → C. Consider the right action and the left action of I(P) on V given by

    (f • α)(x) = Σ_{y: y ≤ x} α(y, x) f(y),
    (α • f)(x) = Σ_{y: y ≥ x} α(x, y) f(y).

In terms of these actions, formulas (2.3a) and (2.3b)
are respectively just the “trivial” observations
g = f •ζ ⇐⇒ f = g • µ, (2.4a)
g = ζ •f ⇐⇒ f = µ • g. (2.4b)
We just have to prove that these actions are indeed actions, i.e.,
f • (α ∗ β) = (f • α) • β and (α ∗ β) • f = α • (β • f ).
3 In fact (2.3a) requires only that every principal order ideal is finite (for (2.3b), every principal order filter).
We prove the first identity:

    (f • (α ∗ β))(y) = Σ_{x: x ≤ y} (α ∗ β)(x, y) f(x)
                     = Σ_{x: x ≤ y} Σ_{z ∈ [x,y]} α(x, z) β(z, y) f(x)
                     = Σ_{z: z ≤ y} ( Σ_{x: x ≤ z} α(x, z) f(x) ) β(z, y)
                     = Σ_{z: z ≤ y} (f • α)(z) β(z, y) = ((f • α) • β)(y).

The second identity is analogous.
In particular, take P = Bool_n, so that µ(A, B) = (−1)^{|B\A|}. Then (2.3a) says that

    g(B) = Σ_{A ⊆ B} f(A) for all B   ⟺   f(B) = Σ_{A ⊆ B} (−1)^{|B\A|} g(A) for all B,

which is nothing more or less than the inclusion-exclusion formula. So Möbius inversion can be thought of
as a generalized form of inclusion-exclusion in which the Boolean algebra is replaced by an arbitrary locally
finite poset P. If we know the Möbius function of P, then knowing a combinatorial formula for either f
or g allows us to write down a formula for the other one. This is frequently useful when we can express an
enumerative problem in terms of a function on a poset.
Remark 2.3.2. The proof of Möbius inversion goes through more generally for functions f, g : P → X,
where X is any C-vector space (for example, polynomials over C).
Example 2.3.3. Here’s an oldie-but-goodie. A derangement is a permutation σ ∈ Sn with no fixed
points. If Dn is the set of derangements in Sn , then
|D1 | = 0,
|D2 | = 1 = |{21}|,
|D3 | = 2 = |{231, 312}|,
|D4 | = 9 = |{2341, 2314, 2413, 3142, 3412, 3421, 4123, 4312, 4321}|,
...
For S ⊆ [n], let f(S) be the number of permutations whose fixed-point set is exactly S, and let g(S) be the
number of permutations fixing every element of S (and possibly more), so that g(S) = Σ_{R ⊇ S} f(R). Thus
|D_n| = f(∅).
It is easy to calculate g(S) directly. If s = |S|, then a permutation fixing the elements of S is equivalent to
a permutation on [n] \ S, so g(S) = (n − s)!.
Rewritten in the incidence algebra I(2^{[n]}), this is just g = ζ • f. Thus f = µ • g, or in terms of the Möbius
inversion formula (2.3b),

    f(S) = Σ_{R ⊇ S} µ(S, R) g(R) = Σ_{R ⊇ S} (−1)^{|R|−|S|} (n − |R|)! = Σ_{r=s}^{n} \binom{n−s}{r−s} (−1)^{r−s} (n − r)!.
The number of derangements is then f(∅), which is given by the well-known formula

    Σ_{r=0}^{n} (−1)^r \binom{n}{r} (n − r)!.

J
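Here is a quick editorial sketch in Python (not from the original notes) comparing brute-force enumeration of derangements with the formula just derived.

```python
from itertools import permutations
from math import comb, factorial

def derangements_brute(n):
    """Count permutations of [n] with no fixed points directly."""
    return sum(all(p[i] != i for i in range(n)) for p in permutations(range(n)))

def derangements_mobius(n):
    """f(emptyset) = sum_{r=0}^{n} (-1)^r * C(n, r) * (n - r)!"""
    return sum((-1) ** r * comb(n, r) * factorial(n - r) for r in range(n + 1))

for n in range(1, 7):
    assert derangements_brute(n) == derangements_mobius(n)
print([derangements_mobius(n) for n in range(1, 7)])  # [0, 1, 2, 9, 44, 265]
```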
Example 2.3.4. As a number-theoretic application, we will use Möbius inversion to compute the closed
formula for Euler's totient function φ(n) = #{a ∈ [n] : gcd(a, n) = 1}.
Let n = p_1^{a_1} · · · p_s^{a_s} be the prime factorization of n, and let P = {p_1, . . . , p_s}. We work in the lattice
D_n ≅ C_{a_1} × · · · × C_{a_s}. Warning: To avoid confusion with the cardinality symbol, we will use the symbol
≤ to mean the order relation in D_n; i.e., x ≤ y means that x divides y. For x ∈ D_n, define

    f(x) = #{a ∈ [n] : gcd(a, n) = x},   g(x) = #{a ∈ [n] : x ≤ gcd(a, n)} = Σ_{y: y ≥ x} f(y).

On the other hand g(x) = n/x, since x ≤ gcd(a, n) iff a is a multiple of x. Moreover, φ(n) = f(1), and

    µ(1, y) = (−1)^{|Q|} if y = Π_{p_i ∈ Q} p_i for some Q ⊆ P, and µ(1, y) = 0 otherwise (i.e., if p_i² ≤ y for some i).
Therefore, by Möbius inversion,

    φ(n) = f(1) = Σ_{y ∈ D_n} µ(1, y) (n/y)
         = n Σ_{Q ⊆ P} (−1)^{|Q|} / Π_{p_i ∈ Q} p_i
         = (n / (p_1 · · · p_s)) Σ_{Q ⊆ P} (−1)^{|Q|} Π_{p_i ∉ Q} p_i
         = (n / (p_1 · · · p_s)) Σ_{S = P\Q ⊆ P} (−1)^{s−|S|} Π_{p_i ∈ S} p_i
         = (n / (p_1 · · · p_s)) (−1)^s Π_{i=1}^{s} (1 − p_i)
         = p_1^{a_1−1} · · · p_s^{a_s−1} (p_1 − 1) · · · (p_s − 1),

as is well known. J
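The whole computation can be checked mechanically; the following editorial Python sketch (names my own) compares the Möbius-inversion sum over D_n with a direct count.

```python
from math import gcd

def phi_direct(n):
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def phi_mobius(n):
    """phi(n) = sum over divisors y of n of mu(1, y) * (n / y), where
    mu(1, y) = (-1)^{number of prime factors} if y is squarefree, else 0."""
    total = 0
    for y in range(1, n + 1):
        if n % y:
            continue
        m, s, squarefree = y, 0, True
        p = 2
        while p * p <= m:
            if m % p == 0:
                m //= p
                s += 1
                if m % p == 0:      # p^2 divides y: not squarefree
                    squarefree = False
                    break
            else:
                p += 1
        if m > 1:
            s += 1
        if squarefree:
            total += (-1) ** s * (n // y)
    return total

assert all(phi_direct(n) == phi_mobius(n) for n in range(1, 200))
print(phi_mobius(360))  # 96 = 2^2 * 3 * (2-1)(3-1)(5-1)
```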
Example 2.3.5. Let G = (V, E) be a finite graph with V = [n]. We may as well assume that G is simple
(no loops or parallel edges) and connected. A coloring of G with t colors, or for short a t-coloring is
just a function κ : V (G) → [t]. An edge xy is monochromatic with respect to κ if κ(x) = κ(y), and a
coloring is proper if it has no monochromatic edges. What can we say about the number pG (t) of proper
t-colorings?
This question can be expressed in terms of the connectivity lattice K(G) (see Example 1.2.3 and Exercise 1.5).
For each t-coloring κ, let Gκ be the subgraph of G induced by the monochromatic edges, and let P (κ) be
the set partition of V (G) whose blocks are the components of Gκ ; then P (κ) is an element of K(G). The
coloring κ is proper if and only if P (κ) = 0̂K(G) , the partition of V (G) into singleton blocks. Accordingly,
if we define f : K(G) → N≥0 by

    f(π) = #{t-colorings κ : P(κ) = π},

then the number of proper t-colorings is f(0̂). We can find another expression for this number by Möbius
inversion. Let

    g(π) = #{κ : P(κ) ≥ π} = Σ_{σ ≥ π} f(σ).
The condition P(κ) ≥ π is equivalent to saying that the vertices in each block of π are colored the same. The
number of such colorings is just t^{|π|} (choosing a color for each block, not necessarily different). Therefore,
Möbius inversion (version (2.3b)) says that

    p_G(t) = f(0̂) = Σ_{π ∈ K(G)} µ(0̂, π) g(π) = Σ_{π ∈ K(G)} µ(0̂, π) t^{|π|}.
While this formula is not necessarily easy to calculate, it does show that pG (t) is a polynomial in t; it is
called the chromatic polynomial. (This fact can be proved in other ways.)
If G = Kn is the complete graph, then the connectivity lattice K(Kn ) is just the full partition lattice Πn . On
the other hand, we can calculate the chromatic polynomial of Kn directly: it is pG (t) = t(t − 1)(t − 2) · · · (t −
n + 1) (since a proper coloring must assign different colors to all vertices). Equating our two expressions for
p_G(t) gives

    Σ_{π ∈ K(G)} µ(0̂, π) t^{|π|} = t(t − 1)(t − 2) · · · (t − n + 1).

This is an identity of polynomials in t. Extracting the coefficient of the lowest-degree (t¹) term on each
side — on the left, only π = 1̂ contributes a t¹ term — gives

    µ(0̂, 1̂) = (−1)^{n−1} (n − 1)!,
so we have calculated the Möbius number of the partition lattice! There are many other ways to obtain this
result. J
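The identity p_{K_n}(t) = t(t−1)⋯(t−n+1) is easy to verify by machine for small n; here is an editorial Python sketch (not from the original notes) that brute-forces proper colorings.

```python
from itertools import product

def proper_colorings(n, edges, t):
    """Count colorings kappa: [n] -> [t] with no monochromatic edge, by brute force."""
    return sum(
        all(kappa[u] != kappa[v] for (u, v) in edges)
        for kappa in product(range(t), repeat=n)
    )

# K_4: p(t) should equal t(t-1)(t-2)(t-3).
n = 4
edges = [(u, v) for u in range(n) for v in range(u + 1, n)]
for t in range(1, 7):
    falling = 1
    for i in range(n):
        falling *= t - i
    assert proper_colorings(n, edges, t) == falling
print("p_{K_4}(t) agrees with t(t-1)(t-2)(t-3) for t = 1..6")
```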
Example 2.3.6. Here is another way to use Möbius inversion to compute the Möbius function itself. In
this example, we will do this for the lattice Ln (q).
For small n, it is possible to work out the Möbius function of L_n(q) by hand. For instance, µ(L_1(q)) =
µ(Bool_1) = −1, and L_2(q) is a poset of rank 2 with q + 1 elements in the middle (since each line in F_q² is
defined by a nonzero vector up to scalar multiples, so there are (q² − 1)/(q − 1) = q + 1 lines), so µ(L_2(q)) =
−(1 − (q + 1)) = q. With a moderate amount of effort, one can check that µ(L_3(q)) = −q³ and µ(L_4(q)) = q⁶.
Here is a way to calculate µ(L_n(q)) for general n, which will lead into the discussion of the characteristic
polynomial of a ranked poset.
Let V = F_q^n, let L = L_n(q) (ranked by dimension), and let X be an F_q-vector space of cardinality t (yes,
cardinality, not dimension!). For W ∈ L, let

    f(W) = #{linear maps φ : V → X with ker φ = W},
    g(W) = #{linear maps φ : V → X with ker φ ⊇ W},

so that

    g(W) = Σ_{U ⊇ W} f(U).

A linear map φ with ker φ ⊇ W is the same thing as a linear map V/W → X, so g(W) = t^{n − dim W}. On
the other hand, f(0̂) is the number of one-to-one linear transformations V → X. For this last count, choose
an ordered basis {v_1, . . . , v_n} for V, and send each v_i to a vector in X not in the linear span of
{φ(v_1), . . . , φ(v_{i−1})}; there are t − q^{i−1} such vectors. Therefore Möbius inversion gives

    Σ_{W ∈ L} µ(0̂, W) t^{n − dim W} = f(0̂) = Π_{i=1}^{n} (t − q^{i−1}).   (2.5)

The identity (2.5) holds for infinitely many integer values of t and is thus an identity of polynomials in the
ring Q[t]. Therefore, it remains true upon setting t to 0 (even though no vector space can have cardinality
zero!), whereupon the two sides of (2.5) become

    µ_{L_n(q)}(0̂, 1̂) = (−1)^n q^{\binom{n}{2}},

which is consistent with the n ≤ 4 cases given at the start of the example. J
The two previous examples suggest that in order to understand a finite graded poset P , one should study
the following polynomial.
Definition 2.3.7. Let P be a graded poset with rank function r. Its characteristic polynomial is

    χ(P; t) = Σ_{x ∈ P} µ(0̂, x) t^{r(1̂) − r(x)}.
In particular,
χ(P, 0) = µ(P ). (2.6)
Moreover, the characteristic polynomial of the Boolean lattice Bool_n is

    χ(Bool_n; t) = Σ_{j=0}^{n} (−1)^j \binom{n}{j} t^{n−j} = (t − 1)^n.
In fact, since the Möbius function is multiplicative on direct products of posets (Proposition 2.2.5), so is the
characteristic polynomial.
The characteristic polynomial generalizes the Möbius number of a poset and contains additional information
as well. For example, let A be a hyperplane arrangement in R^n: a finite collection of affine linear spaces of
dimension n − 1. The arrangement separates R^n into regions, the connected components of X = R^n \ ∪_{H ∈ A} H.
Let P be the poset of intersections of hyperplanes in A, ordered by reverse refinement. A famous result of
Zaslavsky, which we will prove in Chapter 5, is that |χ_P(−1)| and |χ_P(1)| count the number of regions and
bounded regions of X, respectively.
2.4 Möbius functions of lattices

There are additional techniques we can use for computing Möbius functions and characteristic polynomials
of lattices, particularly lattices with good structural properties (e.g., semimodular). The main algebraic
object is the following ring.
Definition 2.4.1. Let L be a lattice. The Möbius algebra Möb(L) is the vector space of formal C-linear
combinations of elements of L, with multiplication given by the meet operation and extended linearly. (In
particular, 1̂ is the multiplicative unit of Möb(L).)
The elements of L form a vector space basis of Möb(L) consisting of idempotents (elements that are their
own squares), since x ∧ x = x for all x ∈ L. For example, if L = 2^{[n]} then Möb(L) ≅ C[x_1, . . . , x_n]/(x_1² −
x_1, . . . , x_n² − x_n), with a natural vector space basis given by squarefree monomials.
It seems as though Möb(L) could have a complicated ring structure, but actually it is quite simple.
Proposition 2.4.2. Let L be a finite lattice. For x ∈ L, define

    ε_x = Σ_{y ≤ x} µ(y, x) y ∈ Möb(L).

Then the set B = {ε_x : x ∈ L} is a C-vector space basis for Möb(L), with ε_x ε_y = δ_{xy} ε_x. In particular,
Möb(L) ≅ C^{|L|} as rings.
Proof. By Möbius inversion⁴,

    x = Σ_{y ≤ x} ε_y   for every x ∈ L,   (2.7)

so B spans Möb(L); since |B| = |L| = dim Möb(L), it follows that B is a vector space basis for Möb(L) as
claimed. Let C_x be a copy of C with unit 1_x, so that we can identify C^{|L|} with Π_{x ∈ L} C_x. This is the direct
product of rings, with multiplication 1_x 1_y = δ_{xy} 1_x. We claim that the C-linear map φ : Möb(L) → C^{|L|}
given by

    φ(ε_x) = 1_x
4 Here the vector space V of Theorem 2.3.1 is not C itself, but rather another vector space, namely Möb(L).
is a ring isomorphism. It is certainly a vector space isomorphism, and (2.7) implies that

    φ(x) φ(y) = φ(Σ_{w ≤ x} ε_w) φ(Σ_{z ≤ y} ε_z) = (Σ_{w ≤ x} 1_w)(Σ_{z ≤ y} 1_z) = Σ_{v ≤ x∧y} 1_v = φ(x ∧ y).
The Möbius algebra leads to useful identities that rely on translating between the “combinatorial” basis L and
the "algebraic" basis B. Some of these identities permit computation of µ(x, y) by summing over a cleverly
chosen subset of [x, y], rather than the entire interval. Of course we know that µ(P) = −Σ_{x ≠ 1̂} µ(0̂, x) for
any bounded poset P, but calculating µ(P) explicitly using this formula requires a recursive computation that can
be quite inefficient. The special structure of a lattice L leads to much more streamlined expressions for
µ(L). The first of these, Weisner’s theorem (Prop. 2.4.4), reduces the number of summands substantially;
it is easy to prove and has useful consequences, but is still recursive. The second, Rota’s crosscut theorem
(Thm. 2.4.9), requires more setup but is non-recursive, which makes it a more versatile tool.
Proposition 2.4.4 (Weisner's theorem). Let L be a finite lattice with at least two elements. Then for
every a ∈ L \ {1̂} we have the equation

    Σ_{x ∈ L: x∧a = 0̂} µ(x, 1̂) = 0.   (2.8)

Equivalently, pulling the x = 0̂ term out of the sum,

    µ(L) = µ(0̂, 1̂) = −Σ_{x ≠ 0̂: x∧a = 0̂} µ(x, 1̂).   (2.9)
Proof. We work in Möb(L) and calculate aε_1̂ in two ways. On the one hand, ε_b ε_1̂ = 0 for every b ≤ a
(since b ≠ 1̂), so by (2.7)

    aε_1̂ = (Σ_{b ≤ a} ε_b) ε_1̂ = 0.

On the other hand, by the definition of ε_1̂ and of multiplication in Möb(L),

    aε_1̂ = Σ_{y ∈ L} µ(y, 1̂) (a ∧ y).

Now taking the coefficient of 0̂ on both sides gives (2.8), and (2.9) follows immediately.
Example 2.4.5 (The Möbius function of the partition lattice Πn ). Let a = 1|23 · · · n ∈ Πn . Then
the partitions x that show up in the sum of (2.9) are just the atoms whose non-singleton block is {1, i} for
some i > 1. For each such x, the interval [x, 1̂] ⊆ Πn is isomorphic to Πn−1 , so (2.9) gives
µ(Πn ) = − (n − 1)µ(Πn−1 )
from which it follows by induction that
µ(Πn ) = (−1)n−1 (n − 1)!.
(Wasn’t that easy?) J
Example 2.4.6 (The Möbius function of the subspace lattice L_n(q)). Let L = L_n(q), and let
A = {(v_1, . . . , v_n) ∈ F_q^n : v_n = 0}. Then dim A = n − 1, i.e., A is a coatom in L. If X is a nonzero subspace
such that X ∩ A = 0, then X must be a line spanned by some vector (u_1, . . . , u_n) with u_n ≠ 0. We may as
well assume u_n = 1 and choose u_1, . . . , u_{n−1} arbitrarily, so there are q^{n−1} such lines. Moreover, the interval
[X, 1̂] ⊆ L is isomorphic to L_{n−1}(q). Therefore (2.9) gives

    µ(L_n(q)) = −q^{n−1} µ(L_{n−1}(q)),

and by induction

    µ(L_n(q)) = (−1)^n q^{\binom{n}{2}}.

J
Theorem 2.4.7. The Möbius function of a finite upper semimodular (USM) lattice L weakly alternates in
sign: (−1)^{r(y)−r(x)} µ(x, y) ≥ 0 for all x ≤ y in L.

Proof. It is sufficient to prove that (−1)^{r(L)} µ(L) ≥ 0, since every interval in a USM lattice is USM.
Let a ∈ L \ {0̂}. Applying Weisner's theorem to L* and using the fact that µ(P) = µ(P*) (Corollary 2.2.10),
we see that

    Σ_{x ∈ L: x∨a = 1̂} µ(0̂, x) = 0.   (2.10)
Now, suppose L is USM of rank n. The theorem is certainly true if n ≤ 1, so we proceed by induction.
Take a to be an atom. If x ∨ a = 1̂, then by the semimodular inequality,

    n + r(x ∧ a) = r(x ∨ a) + r(x ∧ a) ≤ r(x) + r(a) = r(x) + 1,

so either x = 1̂, or else x is a coatom whose meet with a is 0̂. Therefore, we can solve for µ(0̂, 1̂) in (2.10)
to get

    µ(0̂, 1̂) = −Σ_{coatoms x: x∧a = 0̂} µ(0̂, x).

But each interval [0̂, x] is itself a USM lattice of rank n − 1, so by induction each summand has sign (−1)^{n−1},
which completes the proof.
A drawback of Weisner’s theorem is that it is still recursive; the right-hand side of (2.9) involves other values
of the Möbius function. This is not a problem for integer-indexed families of lattices {Ln } such that every
rank-k element x ∈ L_n has [0̂, x] ≅ L_k (as we have just seen), but this is too much to hope for in general.
The next result, Rota’s crosscut theorem, gives a non-recursive way of computing the Möbius function.
Definition 2.4.8. Let L be a lattice. An upper crosscut of L is a set X ⊆ L\{1̂} such that if y ∈ L\X\{1̂},
then y < x for some x ∈ X. A lower crosscut of L is a set X ⊆ L \ {0̂} such that if y ∈ L \ X \ {0̂}, then
y > x for some x ∈ X.
It would be simpler to define an upper (resp., lower) crosscut as a set that contains all coatoms (resp.,
atoms), but in practice the formulation in the previous definition is typically a convenient way to show that
a particular set is a crosscut.
Theorem 2.4.9 (Rota's crosscut theorem). Let L be a finite lattice and X ⊆ L an upper crosscut. Then

    µ(L) = Σ_{A ⊆ X: ⋀A = 0̂} (−1)^{|A|}.   (2.11a)

Similarly, if X is a lower crosscut, then

    µ(L) = Σ_{A ⊆ X: ⋁A = 1̂} (−1)^{|A|}.   (2.11b)
Proof. We prove only (2.11a); the proof of (2.11b) is dual. Fix x ∈ L and start with the following equation
in Möb(L) (recalling (2.7)):

    1̂ − x = Σ_{y ∈ L} ε_y − Σ_{y ≤ x} ε_y = Σ_{y ≰ x} ε_y.

Taking the product of these equations over all x ∈ X gives

    Π_{x ∈ X} (1̂ − x) = Σ_{y ∈ Y} ε_y,

where Y = {y ∈ L : y ≰ x for all x ∈ X}. (Expand the sum and recall that ε_y ε_{y′} = δ_{yy′} ε_y.) But if X is an
upper crosscut, then Y = {1̂}, and this last equation becomes

    Π_{x ∈ X} (1̂ − x) = ε_1̂ = Σ_{y ∈ L} µ(y, 1̂) y.   (2.12)

On the other hand, expanding the product directly gives

    Π_{x ∈ X} (1̂ − x) = Σ_{A ⊆ X} (−1)^{|A|} ⋀A.   (2.13)

Now equating the coefficients of 0̂ on the right-hand sides of (2.12) and (2.13) yields (2.11a).
Corollary 2.4.10 (Some Möbius functions are boring). Let L be a lattice in which 1̂ is not a join of
atoms (for example, a distributive lattice that is not Boolean, or almost any principal order ideal in Young’s
lattice). Then µ(L) = 0.
The crosscut theorem will be useful in studying hyperplane arrangements. Another topological application
is the following result due to J. Folkman (1966), whose proof (omitted) uses the crosscut theorem.
Theorem 2.4.11. Let L be a geometric lattice of rank r, and let P = L \ {0̂, 1̂}. Then

    H̃_i(∆(P), Z) ≅ Z^{|µ(L)|} if i = r − 2, and H̃_i(∆(P), Z) = 0 otherwise,

where H̃_i denotes reduced simplicial homology. That is, ∆(P) has the homology type of a wedge of |µ(L)|
spheres of dimension r − 2.
2.5 Exercises
Exercise 2.1. Let P be a locally finite poset. Consider the incidence function κ ∈ I(P) defined by

    κ(x, y) = 1 if x ⋖ y, and κ(x, y) = 0 otherwise.
(a) Give a combinatorial interpretation of κn (x, y) for all x, y ∈ P and n ∈ N.
(b) How can you tell from κ and its convolution powers whether P is ranked?
(c) Give combinatorial interpretations of κ ∗ ζ(x, y) and ζ ∗ κ(x, y).
Exercise 2.2. Prove that the Möbius function is multiplicative on direct products (i.e., µP ×Q = µP µQ in
the notation of Proposition 2.1.4) directly from the definition of µ.
Exercise 2.3. Let P be a finite bounded poset and let P* be its dual; recall that this means that x ≤_P y
if and only if y ≤_{P*} x. Consider the vector space map F : I(P) → I(P*) given by F(α)(y, x) = α(x, y).
(a) Show that F is an anti-isomorphism of algebras, i.e., it is a vector space isomorphism and F (α ∗ β) =
F (β) ∗ F (α).
(b) Show that F (δP ) = δP ∗ and F (ζP ) = ζP ∗ . Conclude that F (µP ) = µP ∗ and therefore that µ(P ) =
µ(P ∗ ).
Exercise 2.4. A set partition in Πn is a noncrossing partition (NCP) if its associated equivalence relation
∼ satisfies the following condition: for all i < j < k < ℓ, if i ∼ k and j ∼ ℓ then i ∼ j ∼ k ∼ ℓ. The set of all
NCPs of order n is denoted NC_n. Ordering by reverse refinement makes NC_n into a subposet of the partition
lattice Πn . Note that NCn = Πn for n ≤ 3 (the smallest partition that is not noncrossing is 13|24 ∈ Π4 ).
NCPs can be represented pictorially by chord diagrams. The chord diagram of ξ = 1|2 5|3|4|6 8 12|7|9|10 11 ∈
NC12 is shown in Figure 2.1(a).
    [Figure 2.1: (a) the chord diagram of ξ on the points 1, . . . , 12; (b) the chord diagram of its
    Kreweras complement K(ξ), drawn on the interleaved points 1′, . . . , 12′.]
therefore, nc_n is the nth Catalan number C_n = \frac{1}{n+1} \binom{2n}{n}.
(c) Prove that the operation of Kreweras complementation is an anti-automorphism of NC_n. To define
the Kreweras complement K(π) of π ∈ NC_n, start with the chord diagram of π and insert a point
labeled i′ between the points i and i + 1 (mod n) for i = 1, 2, . . . , n. Then a, b lie in the same block
of K(π) if it is possible to walk from a′ to b′ without crossing an arc of π. For instance, the Kreweras
complement of the noncrossing partition ξ ∈ NC_12 shown above is K(ξ) = 1 5 12|2 3 4|6 7|8 9 11|10
(see Figure 2.1(b)).
(d) Use Weisner’s theorem to prove that µ(NCn ) = (−1)n−1 Cn−1 for all n ≥ 1.
The characteristic polynomial of NCn satisfies a version of the Catalan recurrence. For details see [LS00]
(this might make a good end-of-semester project).
Exercise 2.5. This problem is about how far Proposition 2.4.2 can be extended. Suppose that R is a
commutative C-algebra of finite dimension n as a C-vector space, and that x1 , . . . , xn ∈ R are linearly
independent idempotents (i.e., x_i² = x_i for all i). Prove that R ≅ C^n as rings.
Exercise 2.6. The q-binomial coefficient is the rational function

    \binom{n}{k}_q = \frac{(q^n − 1)(q^n − q) · · · (q^n − q^{k−1})}{(q^k − 1)(q^k − q) · · · (q^k − q^{k−1})}.

(a) Check that setting q = 1 (after canceling out common terms), or equivalently applying lim_{q→1}, recovers
the ordinary binomial coefficient \binom{n}{k}. (A computational sanity check appears after this exercise.)
(c) (Stanley, EC1, 2nd ed., 3.119) Prove the q-binomial theorem:

    Π_{k=0}^{n−1} (x − q^k) = Σ_{k=0}^{n} \binom{n}{k}_q (−1)^k q^{\binom{k}{2}} x^{n−k}.
(Hint: Let V = F_q^n and let X be a vector space over F_q with x elements. Count the number of
one-to-one linear transformations V → X in two ways.)
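Here is the sketch promised in part (a) (an editorial illustration, not from the original notes): it builds \binom{n}{k}_q as an honest polynomial in q via the q-Pascal recurrence \binom{n}{k}_q = \binom{n−1}{k−1}_q + q^k \binom{n−1}{k}_q, then evaluates at q = 1.

```python
def q_binomial(n, k):
    """The Gaussian binomial [n choose k]_q as a list of coefficients in q
    (index = exponent), via the recurrence [n,k]_q = [n-1,k-1]_q + q^k [n-1,k]_q."""
    if k < 0 or k > n:
        return [0]
    if k == 0 or k == n:
        return [1]
    left = q_binomial(n - 1, k - 1)
    right = q_binomial(n - 1, k)
    out = [0] * max(len(left), len(right) + k)
    for i, c in enumerate(left):
        out[i] += c
    for i, c in enumerate(right):   # shift by k = multiply by q^k
        out[i + k] += c
    return out

print(q_binomial(4, 2))       # [1, 1, 2, 1, 1], i.e. 1 + q + 2q^2 + q^3 + q^4
print(sum(q_binomial(4, 2)))  # 6 = C(4,2), the q -> 1 specialization
```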
Exercise 2.7. (Stanley, EC1, 3.129) Here is a cute application of combinatorics to elementary number
theory. Let P be a finite poset, and let P̂ = P ∪ {0̂, 1̂}. Suppose that P has a fixed-point-free automorphism
σ : P → P of prime order p; that is, σ(x) ≠ x and σ^p(x) = x for all x ∈ P. Prove that µ_{P̂}(0̂, 1̂) ≡ −1
(mod p). What does this say in the case that P̂ = Π_p?
Chapter 3
Matroids
The motivating example of a geometric lattice is the lattice of flats of a finite set E of vectors. The underlying
combinatorial data of this lattice can be expressed in terms of the rank function, which says the dimension
of the space spanned by every subset of E. However, there are many other equivalent ways to describe
the “combinatorial linear algebra” of a set of vectors: the family of linearly independent sets; the family of
sets that form bases; which vectors lie in the span of which sets; etc. Each of these data sets defines the
structure of a matroid on E. Matroids can also be regarded as generalizations of graphs, and are important
in combinatorial optimization as well. A standard reference on matroid theory is [Oxl92], although I first
learned the basics of the subject from an unusual (but very good) source, namely chapter 3 of [GSS93].
Conventions: Unless otherwise specified, E always denotes a finite set. We will be doing a lot of adding
elements to and removing elements e from sets A, so for convenience we define A + e = A ∪ {e} and
A − e = A \ {e}.
Definition 3.1.1. Let E be a finite set. A closure operator on E is a map 2E → 2E , written A 7→ Ā,
such that
(i) A ⊆ Ā;
(ii) \overline{Ā} = Ā; and
(iii) if A ⊆ B, then Ā ⊆ B̄
for all A, B ⊆ E. A set A is called closed or a flat if Ā = A. A matroid closure operator is a closure
operator that satisfies in addition the exchange condition:

    if e ∉ Ā but e ∈ \overline{A + e′}, then e′ ∈ \overline{A + e}.   (3.1)
A matroid M is a set E (the “ground set”) together with a matroid closure operator on E. A matroid is
simple if the empty set and all singleton sets are closed.
For any closure operator (not necessarily matroidal), any two subsets A, B ⊆ E satisfy \overline{A ∩ B} ⊆ Ā and
\overline{A ∩ B} ⊆ B̄ (by (iii)), hence \overline{A ∩ B} ⊆ Ā ∩ B̄. In particular, if F and G are flats, then

    F ∩ G ⊆ \overline{F ∩ G} ⊆ F̄ ∩ Ḡ = F ∩ G,   (3.2)
so equality holds. That is, the intersection of flats is a flat.
Example 3.1.2. Vector matroids. Let V be a vector space over a field k, and let E ⊆ V be a finite set.
Then

    A ↦ Ā := kA ∩ E,

where kA denotes the linear span of A, is a matroid closure operator on E. It is easy to check the conditions
for a closure operator. To check condition (3.1), suppose e ∈ \overline{A + e′}; then there is a linear equation

    e = c_{e′} e′ + Σ_{a ∈ A} c_a a,

where c_{e′} and all the c_a are scalars in k. The condition e ∉ Ā implies that c_{e′} ≠ 0 in any equation of this
form. Therefore, the equation can be rewritten to express e′ as a linear combination of the vectors in A + e,
obtaining (3.1). A matroid arising in this way (or, more generally, isomorphic to such a matroid) is called a
vector matroid, linear matroid or representable matroid.¹ J
A vector matroid records information about linear dependence (i.e., which vectors belong to the linear
spans of other sets of vectors) without having to worry about the actual coordinates of the vectors. More
generally, a matroid can be thought of as a combinatorial, coordinate-free abstraction of linear dependence
and independence. Note that a vector matroid is simple if none of the vectors is zero (so that ∅̄ = ∅) and if
no vector is a scalar multiple of another (so that all singleton sets are closed).
The following theorem says that simple matroids and geometric lattices are essentially the same things. In
Rota’s language, they are “cryptomorphic”: their definitions look very different, but they carry the same
information. We will see many more ways to axiomatize the same information: rank functions, indepen-
dence systems, basis systems, etc. Working with matroid theory requires a solid level of comfort with the
cryptomorphisms between these various definitions of a matroid.
Theorem 3.2.1. 1. Let M be a simple matroid with finite ground set E. Let L(M) be the poset of flats of M,
ordered by inclusion. Then L(M) is a geometric lattice, under the operations F ∧ G = F ∩ G, F ∨ G = \overline{F ∪ G}.
2. Let L be a geometric lattice and let E be its set of atoms. Then the function A ↦ Ā = {e ∈ E : e ≤ ⋁A}
is a matroid closure operator for a simple matroid on E.
Proof. First, we show that L(M) is an atomic lattice. The intersection of flats is a flat by (3.2), so the operation
F ∧ G = F ∩ G makes L(M) into a meet-semilattice. It's bounded (with 0̂ = ∅ and 1̂ = E), so it's a lattice
by Proposition 1.2.9. By the way, the join operation is F ∨ G = \overline{F ∪ G}, which is by definition the smallest
flat containing F ∪ G, so it is the meet of all flats containing both F and G. (Note that this argument shows
that any closure operator, not necessarily matroidal, gives rise to a lattice.)
By definition of a simple matroid, the singleton subsets of E are atoms in L(M ). Every flat is the join of
the atoms corresponding to its elements, so L(M ) is atomic.
1 Usually one of the first two terms is used for a matroid defined by a specific set of vectors; "representable" suggests that the matroid is merely abstractly isomorphic to the matroid of some set of vectors.
At this point we prove a useful lemma about covering relations in L(M).

Lemma 3.2.2. If F ∈ L(M) and e ∈ E \ F (so that F < F ∨ e), then in fact F ⋖ F ∨ e.

Proof. Suppose that G is a flat with

    F ⊊ G ⊆ F ∨ e = \overline{F + e}.   (3.3)

Let e′ ∈ G \ F. Then e′ ∈ \overline{F + e}, so the exchange axiom (3.1) implies e ∈ \overline{F + e′}, which in turn implies that
F ∨ e ⊆ F ∨ e′ ⊆ G. Hence the ⊆ in (3.3) is actually an equality. We have shown that there are no flats
strictly between F and F ∨ e, proving the claim.
Of course, if F l G then G = F ∨ e for any e ∈ G \ F . So we have essentially characterized all the covering
relations in L(M ), which is very useful.
Suppose now that F and G are incomparable and that G m F ∧ G. Then G is of the form (F ∧ G) ∨ e, and
we can take e to be any element of G \ F . In particular F < F ∨ e, so by Lemma 3.2.2, F l F ∨ e. Moreover,
F ∨ G = F ∨ ((F ∧ G) ∨ e) = (F ∨ e) ∨ (F ∧ G) = F ∨ e.
We have just proved that L(M ) is semimodular. Here is the diamond picture (cf. (1.6)):
    [Diamond diagram: F ∧ G ⋖ G = (F ∧ G) ∨ e and F ⋖ F ∨ G = F ∨ e.]
Recall that if L is semimodular and x, e ∈ L with e an atom and x 6≥ e (so that x < x ∨ e), then in fact
x l x ∨ e, because
r(x ∨ e) − r(x) ≤ r(e) − r(x ∧ e) = 1 − 0 = 1.
Accordingly, let A ⊆ E and let e, f ∈ E \ A. Suppose that e ∈ \overline{A + f}; we must show that f ∈ \overline{A + e}. Let
x = ⋁A ∈ L. Then e ≤ x ∨ f, so x < x ∨ e ≤ x ∨ f. Since f is an atom, x ⋖ x ∨ f by the observation above,
which forces x ∨ e = x ∨ f. Hence f ≤ x ∨ e, i.e., f ∈ \overline{A + e}, as desired.
In view of Theorem 3.2.1, we can describe a matroid on ground set E by the function A 7→ r(Ā), where r
is the rank function of the associated geometric lattice. It is standard to abuse notation by calling this
function r as well. Formally:
Definition 3.2.3. A matroid rank function on E is a function r : 2^E → N satisfying the following conditions
for all A, B ⊆ E:
(R1) r(A) ≤ |A|;
(R2) if A ⊆ B, then r(A) ≤ r(B); and
(R3) (submodularity) r(A ∪ B) + r(A ∩ B) ≤ r(A) + r(B).
If r is a matroid rank function on E, then the corresponding matroid closure operator is given by

    Ā = {e ∈ E : r(A + e) = r(A)}.

Moreover, this matroid is simple if and only if r(A) = |A| whenever |A| ≤ 2.
Conversely, if A ↦ Ā is a matroid closure operator on E, then the corresponding matroid rank function r is

    r(A) = min{|B| : B ⊆ A, B̄ = Ā}.

Example 3.2.4 (Uniform matroids). Fix integers 0 ≤ k ≤ n, let E = [n], and let r(A) = min(|A|, k) for
A ⊆ E. It is easy to check that this satisfies the conditions of Definition 3.2.3. The corresponding matroid is called
the uniform matroid Uk (n). Its closure operator is
    Ā = A if |A| < k, and Ā = E if |A| ≥ k.
So the flats of M are the sets of cardinality < k, as well as E itself. Therefore, the lattice of flats looks like
a Boolean algebra 2[n] that has been truncated at the kth rank: that is, all elements of rank ≥ k have been
deleted and replaced with a single 1̂. For n = 3 and k = 2, this lattice is M5 . For n = 4 and k = 3, the
Hasse diagram is as shown below.
    [Hasse diagram of the lattice of flats of U_3(4): 0̂ = ∅; atoms 1, 2, 3, 4; rank-two flats 12, 13,
    14, 23, 24, 34; and 1̂ = 1234.]
If E is a set of n vectors in general position in k^k, then the corresponding linear matroid is isomorphic to
U_k(n). This sentence is tautological, in the sense that it can be taken as a definition of "general position".
If k is infinite and the points are chosen randomly (in some reasonable measure-theoretic sense), then L(E)
will be isomorphic to U_k(n) with probability 1. On the other hand, k must be sufficiently large (in terms
of n) in order for k^k to have n points in general position: for instance, U_2(4) cannot be represented as a
matroid over F_2 simply because F_2² contains only three nonzero vectors. J
At this point, let us formalize what isomorphism of matroids means.
Definition 3.2.5. Let M, M′ be matroids on ground sets E, E′ respectively. We say that M and M′ are
isomorphic, written M ≅ M′, if there is a bijection f : E → E′ meeting any (hence all) of the following
conditions:
• \overline{f(A)} = f(Ā) for all A ⊆ E (f preserves closure);
• r_{M′}(f(A)) = r_M(A) for all A ⊆ E (f preserves rank).
In general, every definition of “matroid” (and there are several more coming) will induce a corresponding
notion of “isomorphic”.
Let G be a finite graph with vertices V and edges E. For convenience, we will write e = xy to mean “e is
an edge with endpoints x, y”. This notation does not exclude the possibility that e is a loop (i.e., x = y) or
that some other edge might have the same pair of endpoints.
Definition 3.3.1. For each subset A ⊆ E, the corresponding induced subgraph of G is the graph G|_A
with vertices V and edges A. The graphic matroid or complete connectivity matroid M(G) on E is
defined by the closure operator

    Ā = {e = xy ∈ E : x and y belong to the same component of G|_A}.

Equivalently, an edge e = xy belongs to Ā if there is a path between x and y consisting of edges in A (for
short, an A-path). For example, in the graph below, 14 ∈ Ā because {12, 24} ⊆ A.
    [Figure: a graph G = (V, E) on vertices 1–5, an edge set A, and its closure Ā.]
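As an editorial sketch (not from the original notes; names are ad hoc), the graphic closure operator is easy to compute with a union-find structure over the components of (V, A).

```python
def graphic_closure(vertices, edges, A):
    """Closure of an edge set A in the graphic matroid M(G): an edge e = xy lies
    in the closure iff x and y are in the same component of (V, A)."""
    parent = {v: v for v in vertices}

    def find(v):  # union-find with path compression
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (x, y) in A:
        parent[find(x)] = find(y)
    return {(x, y) for (x, y) in edges if find(x) == find(y)}

# Example in the spirit of the figure: edge 14 lies in the closure of {12, 24}.
V = [1, 2, 3, 4, 5]
E = [(1, 2), (2, 4), (1, 4), (2, 3), (4, 5)]
print(graphic_closure(V, E, {(1, 2), (2, 4)}))  # {(1, 2), (2, 4), (1, 4)}
```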
Proof. It is easy to check that A ⊆ Ā for all A, and that A ⊆ B =⇒ Ā ⊆ B̄. If e = xy ∈ , then x, y can
be joined by an Ā-path P , and each edge in P can be replaced with an A-path, giving an A-path between
x and y.
    [Figure 3.1: The matroid closure axiom for a graphic matroid, showing edges e, f and a path P.
    Note that A consists of all edges shown except e and f.]
Every edge set A ⊆ E has a maximal acyclic subset B, and B̄ = Ā for any such B. Such a subset B is called
a spanning forest² of A (or of G|_A). They are the bases of the graphic matroid M(G). (I haven't yet said
what a basis is — see the next section.)
Theorem 3.3.3. Let B ⊆ A. Then any two of the following conditions imply the third (and characterize
spanning forests of A):
1. r(B) = r(A);
2. B is acyclic;
3. |B| = |V | − c, where c is the number of connected components of A.
The flats of M (G) correspond to the subgraphs of G in which every component is an induced subgraph
of G. In other words, the geometric lattice corresponding to the graphic matroid M (G) is precisely the
connectivity lattice K(G) introduced in Example 1.2.3.
Example 3.3.4. If G is a forest (a graph with no cycles), then no two vertices are joined by more than one
path. Therefore, every edge set is a flat, and M(G) ≅ U_n(n), where n = |E|. J
Example 3.3.5. If G is a cycle of length n, then every edge set of size < n − 1 is a flat, but the closure of
a set of size n − 1 is the entire edge set. Therefore, M(G) ≅ U_{n−1}(n). J
Example 3.3.6. If G = K_n (the complete graph on n vertices), then a flat of M(G) is the same thing as
an equivalence relation on [n]. Therefore, the lattice of flats of M(K_n) is naturally isomorphic to the partition
lattice Π_n. J
In addition to rank functions, lattices of flats, and closure operators, there are many other equivalent ways
to define a matroid on a finite ground set E. In the fundamental example of a linear matroid M , some of
these definitions correspond to linear-algebraic notions such as linear independence and bases.
Definition 3.4.1. A (matroid) independence system I is a family of subsets of E such that
(I1) ∅ ∈ I ;
(I2) if I ∈ I and I 0 ⊆ I, then I 0 ∈ I ;
(I3) (“Donation”) if I, J ∈ I and |I| < |J|, then there is some x ∈ J \ I such that I ∪ x ∈ I .
2 This terminology can cause confusion. By definition a subgraph H of G is spanning if V (H) = V (G), but not every acyclic
spanning subgraph is a spanning forest. A more accurate term would be “maximal forest”.
Note that conditions (I1) and (I2) say that I is an abstract simplicial complex on E (see Example 1.1.11).
If E is a finite subset of a vector space, then the linearly independent subsets of E form a matroid indepen-
dence system. Conditions (I1) and (I2) are clear. For (I3), the span of J has greater dimension than that
of I, so there must be some x ∈ J outside the span of I, and then I ∪ x is linearly independent.
The next lemma generalizes the statement that any linearly independent set of vectors can be extended to
a basis of any space containing it.
Lemma 3.4.2. Let I be a matroid independence system on E. Suppose that I ∈ I and I ⊆ X ⊆ E. Then
I can be extended to a maximum independent subset of X.
Proof. If I already has maximum cardinality then we are done. Otherwise, let J be a maximum independent
subset of X. Then |J| > |I|, so by (I3) there is some x ∈ J \ I with I ∪ x independent. Replace I with I ∪ x
and repeat.
The argument shows also that for every X ⊆ E, all maximal independent subsets (or bases) of X have
the same cardinality (so there is no irksome difference between “maximal” and “maximum”). In simplicial
complex terms, every induced subcomplex of I is pure — an induced subcomplex is something of the form
I |X = {I ∈ I : I ⊆ X}, for X ⊆ E, and “pure” means that all maximal faces have the same cardinality.
This condition actually characterizes matroid independence complexes; we will take this up again in §6.5.
A matroid independence system records the same combinatorial structure on E as a matroid rank function:
Proposition 3.4.3. Let E be a finite set.
1. If r is a matroid rank function on E, then

    I = {A ⊆ E : r(A) = |A|}   (3.5a)

is an independence system.
2. If I is an independence system on E, then

    r(A) = max{|I| : I ⊆ A, I ∈ I}   (3.5b)

is a matroid rank function.
3. These constructions are mutual inverses.
Proof. Part 1: Let r be a matroid rank function on E and define I as in (3.5a). First, r(I) ≤ |I| for
all I ⊆ E, so (I1) follows immediately. Second, suppose I ∈ I and I 0 ⊆ I; say I 0 = {x1 , . . . , xk } and
I = {x1 , . . . , xn }. Consider the “flag” (nested family of subsets)
∅ ( {x1 } ( {x1 , x2 } ( · · · ( I 0 ( · · · ( I.
The rank starts at 0 and increases at most 1 each time by submodularity. But since r(I) = |I|, it must
increase by exactly 1 each time. In particular r(I 0 ) = k = |I 0 | and so I 0 ∈ I , establishing (I2).
To show (I3), let I, J ∈ I with |I| < |J| and let J \ I = {x1 , . . . , xn }. If n = 1 then J = I + x1 and there
is nothing to show. Now suppose that n ≥ 2 and r(I + xk ) = r(I) for every k ∈ [n]. By submodularity,
r(I + x1 + x2 ) ≤ r(I + x1 ) + r(I + x2 ) − r(I) = r(I),
r(I + x1 + x2 + x3 ) ≤ r(I + x1 + x2 ) + r(I + x3 ) − r(I) = r(I),
···
r(I + x1 + x2 + · · · + xn ) ≤ r(I + x1 + · · · + xn−1 ) + r(I + xn ) − r(I) = r(I),
and equality must hold throughout. But then r(I ∪ J) = r(I) < r(J), which is a contradiction.
Part 2: Now suppose that I is an independence system on E, and define a function r : 2E → Z as in (3.5b).
It is immediate from the definition that r(A) ≤ |A| and that A ⊆ B implies r(A) ≤ r(B) for all A, B ∈ I .
To prove submodularity, let A, B ⊆ E and let I be a basis of A ∩ B. By Lemma 3.4.2, we can extend I to
a basis J of A ∪ B. Note that no element of J \ I can belong to both A and B, otherwise I would not be a
maximal independent set in A ∩ B. So we have the following Venn diagram:
    [Venn diagram: I ⊆ A ∩ B, with the elements of J \ I lying in (A \ B) ∪ (B \ A).]
Moreover, J ∩ A and J ∩ B are independent subsets of A and B respectively, but not necessarily maximal,
so
r(A ∪ B) + r(A ∩ B) = |I| + |J| = |J ∩ A| + |J ∩ B| ≤ r(A) + r(B).
If M = M (G) is a graphic matroid, the associated independence system I is the family of acyclic edge sets
in G. To see this, notice that if A is a set of edges and e ∈ A, then r(A − e) < r(A) if and only if deleting e
breaks a component of G|A into two smaller components (so that in fact r(A − e) = r(A) − 1). This is
equivalent to the condition that e belongs to no cycle in A. Therefore, if A is acyclic, then deleting its edges
one by one gets you down to ∅ and decrements the rank each time, so r(A) = |A|. On the other hand, if A
contains a cycle, then deleting any of its edges won’t change the rank, so r(A) < |A|.
Here's what the "donation" condition (I3) means in the graphic setting. Suppose that |V| = n, and let c(H)
denote the number of components of a graph H. If I, J are acyclic edge sets with |I| < |J|, then

    c(G|_I) = n − |I| > n − |J| = c(G|_J),

and there must be some edge e ∈ J whose endpoints belong to different components of G|_I; that is, I + e is
acyclic.
The bases of M (the maximal independent sets) provide another way of defining a matroid.
Definition 3.4.4. A (matroid) basis system on E is a nonempty family B ⊆ 2E such that for all
B, B 0 ∈ B,
(B1) |B| = |B′|;
(B2) for all e ∈ B \ B′, there exists e′ ∈ B′ \ B such that (B − e) + e′ ∈ B;
(B2′) for all e ∈ B \ B′, there exists e′ ∈ B′ \ B such that (B′ + e) − e′ ∈ B.
In fact, given (B1), the conditions (B2) and (B2′) are equivalent, although this requires some proof (Exercise 3.2).
For example, if S is a finite set of vectors spanning a vector space V , then the subsets of S that are bases
for V all have the same cardinality (namely dim V ) and satisfy the basis exchange condition (B2).
If G is a graph, then the bases of M (G) are its spanning forests, i.e., its maximal acyclic edge sets. If G
is connected (which, as we will see, we may as well assume when studying graphic matroids) then the bases
of M (G) are its spanning trees.
Here is the graph-theoretic interpretation of (B2). Let G be a connected graph, let B, B′ be spanning trees,
and let e ∈ B \ B′. Then B − e has exactly two connected components. Since B′ is connected, it must have
some edge e′ with one endpoint in each of those components, and then B − e + e′ is a spanning tree. See
Figure 3.2.
    [Figure 3.2: An example of basis axiom (B2) in a graphic matroid, showing B, B − e, and B′.
    The green edges are the possibilities for e′ such that B − e + e′ is a spanning tree.]
As for (B2′), if e ∈ B \ B′, then B′ + e must contain a unique cycle C (formed by e together with the unique
path P in B′ between the endpoints of e). Deleting any edge e′ ∈ P will produce a spanning tree, and there
must be at least one such edge e′ ∉ B (otherwise B contains the cycle C). See Figure 3.3.
    [Figure 3.3: An example of basis axiom (B2′) in a graphic matroid, showing B, B′, and B′ + e.
    The path P is shown in green. The edges of P \ B, marked with stars, are valid choices for e′.]
If G is a graph with edge set E and M = M (G) is its graphic matroid, then
I = {A ⊆ E : A is acyclic},
B = {A ⊆ E : A is a spanning forest of G}.
Proposition 3.4.5. Let E be a finite set.
1. If I is an independence system on E, then the family of maximal elements of I is a basis system.
2. If B is a basis system, then I = ∪_{B ∈ B} 2^B is an independence system.
3. These constructions are mutual inverses.
The proof is left as an exercise. We already have seen that an independence system on E is equivalent to
a matroid rank function; Proposition 3.4.5 asserts that a basis system provides the same structure on E.
Bases turn out to be especially convenient for describing fundamental operations on matroids such as duality,
direct sum, and deletion/contraction (all of which are coming soon).
Instead of specifying the bases (maximal independent sets), a matroid can be defined by its minimal depen-
dent sets, which are called circuits. These too can be axiomatized:
Definition 3.4.6. A (matroid) circuit system on E is a family C ⊆ 2^E such that, for all C, C′ ∈ C,
(C1) ∅ ∉ C;
(C2) if C ≠ C′, then C ⊄ C′;
(C3) if C ≠ C′ and e ∈ C ∩ C′, then the set (C ∪ C′) − e contains an element of C.
In a linear matroid, the circuits are the minimal dependent sets of vectors. Indeed, if C, C 0 are such sets and
e ∈ C ∩ C 0 , then we can find two expressions for e as nontrivial linear combinations of vectors in C and in
C 0 , and equating these expressions and eliminating e shows that (C ∪ C 0 ) − e is dependent, hence contains
a circuit.
In a graph, if two cycles C, C 0 meet in a (non-loop) edge e = xy, then C − e and C 0 − e are paths between x
and y, so concatenating them forms a closed path. This path is not necessarily itself a cycle, but must
contain some cycle.
Proposition 3.4.7. Let E be a finite set.
1. If I is an independence system on E, then the family of minimal elements of 2^E \ I (the minimal
dependent sets) is a circuit system.
2. If C is a circuit system, then I = {A ⊆ E : A contains no member of C} is an independence system.
3. These constructions are mutual inverses.
In other words, the circuits are the minimal nonfaces of the independence complex (hence they correspond
to the generators of the Stanley-Reisner ideal; see Defn. 6.3.1). The proof is left as an exercise.
The final definition of a matroid is different from what has come before, and gives a taste of the importance
of matroids in combinatorial optimization.
Let E be a finite set and let ∆ be an abstract simplicial complex on E (see Definition 3.4.1). Let w :
E → R≥0 be a function, which we regard as assigning weights to the elements of E, and for A ⊆ E, define
w(A) = Σ_{e ∈ A} w(e). Consider the problem of maximizing w(A) over all subsets A ∈ ∆; the maximum will
certainly be achieved on a facet. A naive approach to find a maximal-weight A, which may or may not work
for a given ∆ and w, is the following “greedy” algorithm (known as Kruskal’s algorithm):
1. Let A = ∅.
2. If A is a facet of ∆, stop.
Otherwise, find e ∈ E \ A of maximal weight such that A + e ∈ ∆ (if there are several such e, pick one
at random), and replace A with A + e.
3. Repeat step 2 until A is a facet of ∆.
Proposition 3.4.8. ∆ is a matroid independence system if and only if Kruskal’s algorithm produces a facet
of maximal weight for every weight function w.
The proof is left as an exercise, as is the construction of a simplicial complex and a weight function for
which the greedy algorithm does not produce a facet of maximal weight. This interpretation can be useful
in algebraic combinatorics; see Example 9.19.2 below.
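In the graphic matroid, the greedy algorithm above is exactly the classical Kruskal algorithm for maximum-weight spanning forests. Here is an editorial Python sketch of that special case (not from the original notes; names are my own).

```python
def greedy_max_forest(vertices, weighted_edges):
    """Kruskal's algorithm on the graphic matroid: repeatedly add the heaviest
    remaining edge that keeps the current edge set acyclic (i.e., independent)."""
    parent = {v: v for v in vertices}

    def find(v):  # union-find with path compression
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    forest = []
    for w, x, y in sorted(weighted_edges, reverse=True):
        rx, ry = find(x), find(y)
        if rx != ry:              # adding xy keeps the set acyclic
            parent[rx] = ry
            forest.append((w, x, y))
    return forest

edges = [(4, 'a', 'b'), (1, 'a', 'c'), (3, 'b', 'c'), (2, 'c', 'd')]
print(greedy_max_forest('abcd', edges))
# [(4, 'a', 'b'), (3, 'b', 'c'), (2, 'c', 'd')] -- the edge ac would close a cycle
```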
The motivating example of a matroid is a finite collection of vectors in Rn . What if we work over a different
field? What if we turn this question on its head by specifying a matroid M purely combinatorially and then
asking which fields give rise to vector sets whose matroid is M ?
Definition 3.5.1. Let M be a matroid and V a vector space over a field k. A set of vectors S ⊆ V
represents or realizes M over k if the linear matroid M (S) associated with S is isomorphic to M .
For example:
• The matroid U2 (3) is representable over any field F. Set S = {(1, 0), (0, 1), (1, 1)}; any two of these
vectors form a basis of F2 .
• If k has at least three elements, then U_2(4) is representable, e.g., by S = {(1, 0), (0, 1), (1, 1), (1, a)},
where a ∈ k \ {0, 1}. Again, any two of these vectors form a basis of k².
• On the other hand, U2 (4) is not representable over F2 , because F22 doesn’t contain four nonzero elements.
More generally, suppose that M is a simple matroid with n elements (i.e., the ground set E has |E| = n) and
rank r (i.e., every basis of M has size r) that is representable over the finite field F_q of order q. Then each
element of E must be represented by some nonzero vector in F_q^r, and no two vectors can be scalar multiples
of each other. Therefore,

    n ≤ (q^r − 1)/(q − 1).
Example 3.5.2. The Fano plane. Consider the affine point configuration with 7 points and 7 lines (one of
which looks like a circle), as shown:
    [Figure: the Fano plane — seven points labeled 1–7 and seven lines, one of which is drawn as
    a circle.]
This point configuration cannot be represented over R. If you try to draw seven non-collinear points in R²
such that the six triples 123, 345, 156, 147, 257, 367 are each collinear, then 246 will not be collinear —
try it. The same thing will happen over any field of characteristic ≠ 2. On the other hand, over a field of
characteristic 2, if the first six triples are collinear then 246 must be collinear. The configuration can be
explicitly represented over F_2 by the columns of the matrix
explicitly represented over F2 by the columns of the matrix
1 1 0 0 0 1 1
0 1 1 1 0 0 1 ∈ (F2 )3×7
0 0 0 1 1 1 1
for which each of the seven triples of columns listed above is linearly dependent, while every other triple of
columns is a basis of F_2³. (Note that over R, the submatrix consisting of columns 2, 4, 6 has determinant 2.)
The resulting matroid is called the Fano plane or Fano matroid. Note that each line in the Fano matroid
corresponds to a 2-dimensional subspace of F_2³.
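An editorial sketch of the verification (not from the original notes): encode each column as a 3-bit integer; three pairwise-distinct nonzero vectors in F_2³ are linearly dependent precisely when their XOR vanishes.

```python
from itertools import combinations

# Columns of the matrix above, read as binary vectors (top row = high bit).
cols = {1: 0b100, 2: 0b110, 3: 0b010, 4: 0b011, 5: 0b001, 6: 0b101, 7: 0b111}

lines = [(1, 2, 3), (3, 4, 5), (1, 5, 6), (1, 4, 7), (2, 5, 7), (3, 6, 7), (2, 4, 6)]
for (i, j, k) in lines:
    assert cols[i] ^ cols[j] ^ cols[k] == 0  # each listed triple is dependent over F_2

# Every other triple is a basis: 35 - 7 = 28 of them.
bases = [t for t in combinations(range(1, 8), 3)
         if cols[t[0]] ^ cols[t[1]] ^ cols[t[2]] != 0]
print(len(bases))  # 28
```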
Viewed as a matroid, the Fano plane has rank 3. Its bases are the \binom{7}{3} − 7 = 28 noncollinear triples of points.
Its circuits are the seven collinear triples and their complements (known as ovals). For instance, 4567 is an
oval: it is too big to be independent, but on the other hand every three-element subset of it forms a basis
(in particular, is independent), so it is a circuit.
The Fano plane is self-dual in the sense of discrete geometry3 : the lines can be labeled 1, . . . , 7 so that point i
lies on line j if and only if point j lies on line i. Here’s how: recall that the points and lines of the Fano plane
correspond respectively to 1- and 2-dimensional subspaces of F32 , and assign the same label to orthogonally
complementary spaces under the standard inner product. J
Example 3.5.3 (Finite projective planes). Let q ≥ 2 be a positive integer. A projective plane of order
q consists of a collection P of points and a collection L of lines, each of which is a subset of P , such that:
3 But not self-dual as a matroid in the sense to be defined in §3.7.
• |P | = |L| = q 2 + q + 1;
• Each line contains q + 1 points, and each point lies in q + 1 lines;
• Any two points determine a unique line, and any two lines determine a unique point.
The Fano plane is thus a projective plane of order 2. More generally, if F_q is any finite field, then one can define
a projective plane P²_q whose points and lines are the 1- and 2-dimensional subspaces of F_q³, respectively. Note
that the number of points is the number of nonzero vectors up to scalar multiplication, namely (q³ − 1)/(q − 1) =
q² + q + 1 (and the number of lines is the same, by duality).
A notorious open question is whether any other finite projective planes exist. The best general result known
is the Bruck–Ryser–Chowla theorem (1949), which states that if q ≡ 1 or 2 (mod 4), then q must be the
sum of two squares. In particular, there exists no projective plane of order 6. Order 10 is also known to be
impossible thanks to computer calculation, but the problem is open for other non-prime-power orders. It is
also open whether there exists a projective plane of prime-power order that is not isomorphic to P2q . One
readily available survey of the subject is by Perrott [Per16]. J
Representability can be tricky. As we have seen, U2 (4) can be represented over any field other than F2 ,
while the Fano plane is representable only over fields of characteristic 2. The point configuration below is
an affine representation of a rank-3 matroid over R, but the matroid is not representable over Q [Grü03,
pp. 93–94]. Put simply, it is impossible to construct a set of points with rational coordinates and exactly
these collinearities.
A regular matroid is one that is representable over every field. (For instance, we will see that graphic
matroids are regular.) For some matroids, the choice of field matters. For example, every uniform matroid
is representable over every infinite field, but as we have seen before, Uk (n) can be represented over Fq only
if n ≤ (q k − 1)/(q − 1). (For example, U2 (4) is not representable over F2 .) However, this inequality does not
suffice for representability; as mentioned above, the Fano plane cannot be represented over, say, F101 .
Recall that a minor of a matrix M is the determinant of a square submatrix of M. A matrix is called
totally unimodular if every minor is either 0, 1, or −1.
Theorem 3.5.4. A matroid M is regular if and only if it can be represented by the columns of a totally
unimodular matrix.
One direction is easy: if M has a totally unimodular representation, then the entries can be interpreted as lying
in any field, and the linear dependence of a set of columns does not depend on the choice of field (because
−1 ≠ 0 and 1 ≠ 0 in every field). The reverse direction is harder (see [Oxl92, chapter 6]), and the proof is
omitted. In fact, something more is true: M is regular if and only if it is binary (representable over F_2) and
representable over at least one field of characteristic ≠ 2.
Theorem 3.5.5. Graphic matroids are regular.
Proof. Let G = (V, E) be a graph on vertex set V = [n], and let M = M(G) be the corresponding graphic matroid. We can represent M by the matrix X whose columns are the vectors e_i − e_j for ij ∈ E. (Or e_j − e_i; it doesn't matter, since scaling a vector does not change the matroid.) Here {e_1, . . . , e_n} is the standard basis for R^n.
Consider any square submatrix X_{WB} of X with rows W ⊆ V and columns B ⊆ E, where |W| = |B| = k > 0. If B contains a cycle with vertices v_1, . . . , v_m (in order), then the corresponding columns are linearly dependent: choosing signs so that the edges are oriented consistently around the cycle,
(e_{v_1} − e_{v_2}) + (e_{v_2} − e_{v_3}) + · · · + (e_{v_m} − e_{v_1}) = 0,
so det X_{WB} = 0. On the other hand, if B is acyclic, then I claim that det X_{WB} ∈ {0, ±1}, which we will prove by induction on k. The base case k = 1 follows because all entries of X are 0 or ±1. For k > 1, if there is some vertex of W with no incident edge in B, then the corresponding row of X_{WB} is zero and the determinant vanishes. Otherwise, by a handshaking argument, there must be some vertex w ∈ W incident to exactly one edge b ∈ B (if every vertex of W were incident to at least two edges of B, then B would contain a cycle). The corresponding row of X_{WB} has one entry ±1 and the rest zero. Expanding on that row gives det X_{WB} = ± det X_{W∖w, B∖b}, and we are done by induction. The same argument shows that any set of columns corresponding to an acyclic edge set will in fact be linearly independent.
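The total unimodularity asserted by this proof can be verified exhaustively for a small graph. Here is a short Python sketch (the function names are mine) that builds the matrix X for K4 minus an edge and checks every square minor:

    from itertools import combinations

    def signed_incidence(n, edges):
        # columns are e_i - e_j for each edge ij, as in the proof above
        return [[1 if v == i else -1 if v == j else 0 for (i, j) in edges]
                for v in range(n)]

    def det(m):
        # exact integer determinant by cofactor expansion (fine for tiny matrices)
        if len(m) == 1:
            return m[0][0]
        return sum((-1)**c * m[0][c] * det([row[:c] + row[c+1:] for row in m[1:]])
                   for c in range(len(m)))

    edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]   # K4 minus an edge
    X = signed_incidence(4, edges)
    for k in range(1, 5):
        for rows in combinations(range(4), k):
            for cols in combinations(range(5), k):
                assert det([[X[r][c] for c in cols] for r in rows]) in {-1, 0, 1}
    print("every minor lies in {0, 1, -1}")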
Pappus' Theorem from Euclidean geometry says that if a, b, c, A, B, C are distinct points in R^2 such that a, b, c and A, B, C are collinear, then x, y, z are collinear, where x, y, z are the three crossing intersection points x = aB ∩ bA, y = aC ∩ cA, z = bC ∩ cB (writing pq for the line through the points p and q).
[Figure: the Pappus configuration, with a, b, c on one line, A, B, C on another, and the intersection points x, y, z between them.]
Accordingly, there is a rank-3 simple matroid on ground set E = {a, b, c, A, B, C, x, y, z} whose rank-2 flats correspond to the collinear triples in the figure, including {a, b, c}, {A, B, C}, and {x, y, z}. It turns out that deleting {x, y, z} from this list produces the family of closed sets of another matroid, called the non-Pappus matroid NP. Since Pappus' theorem can be proven using analytic geometry, and the equations that say that x, y, z are collinear are valid over any field (i.e., involve only ±1 coefficients), it follows that NP is not representable over any field. J
3.6 Direct sum

There are several ways to construct new matroids from old ones. We'll begin with a boring but useful one (direct sum) and then move on to the more exciting constructions of duality and deletion/contraction.
Definition 3.6.1. Let M1 , M2 be matroids on disjoint sets E1 , E2 , with basis systems B1 , B2 . The direct
sum M1 ⊕ M2 is the matroid on E1 ∪ E2 with basis system
B = {B1 ∪ B2 : B1 ∈ B1 , B2 ∈ B2 }.
If M1 , M2 are linear matroids whose ground sets span vector spaces V1 , V2 respectively, then M1 ⊕ M2 is the
matroid you get by regarding the vectors as living in V1 ⊕ V2 : the linear relations have to come either from
V1 or from V2 .
[Figure: graphs G1 and G2 glued at a single vertex to form G; then M(G) = M(G1) ⊕ M(G2), since no cycle of G uses edges of both pieces.]
A useful corollary is that every graphic matroid arises from a connected graph. Actually, there may be many
different connected graphs that give rise to the same matroid, since in the previous construction it did not
matter which vertices of G1 and G2 were identified. This raises an interesting question: when does the
isomorphism type of a graphic matroid M (G) determine the graph G up to isomorphism?
Definition 3.6.2. A matroid that cannot be written nontrivially as a direct sum of two smaller matroids is
called connected or indecomposable.4
Proposition 3.6.3. Let G = (V, E) be a loopless graph. Then M (G) is indecomposable if and only if G is
2-connected — i.e., not only is it connected, but so is every subgraph obtained by deleting a single vertex.
⁴ The first term is more common among matroid theorists, but I prefer "indecomposable" to avoid potential confusion with graph connectivity.
The “only if” direction is immediate: the discussion above implies that
M(G) = ⊕_H M(H),
where H ranges over the blocks (maximal 2-connected subgraphs) of G, and this direct sum is nontrivial whenever G is not 2-connected.
Remark 3.6.4. If G ≅ H as graphs, then clearly M(G) ≅ M(H). The converse is not true: if T is any forest with n edges, then every set of edges is acyclic, so the independence complex is the Boolean algebra 2^{[n]} (and, for that matter, so is the lattice of flats), regardless of the shape of the forest.
In light of Proposition 3.6.3, it is natural to suspect that every 2-connected graph is determined up to
isomorphism by its graphic matroid, but even this is not true; the two 2-connected graphs below are not
isomorphic, but have isomorphic graphic matroids.
As you should expect from an operation called "direct sum," properties of M1 ⊕ M2 should be easily deducible from those of its summands; a small computational check appears below. In particular, direct sum is easy to describe in terms of the other matroid axiomatizations we have studied. It is additive on rank functions: if A1 ⊆ E1 and A2 ⊆ E2, then
r_{M1⊕M2}(A1 ∪ A2) = r_{M1}(A1) + r_{M2}(A2).
Similarly, the closure of A1 ∪ A2 is Ā1 ∪ Ā2. The circuit system of the direct sum is just the (necessarily disjoint) union of the circuit systems of the summands. Finally, the geometric lattice of a direct sum is just the poset product of the lattices of the summands, i.e.,
L(M1 ⊕ M2) ≅ L(M1) × L(M2),
with the order relations (F1, F2) ≤ (F1′, F2′) iff Fi ≤ Fi′ in L(Mi) for each i.
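These descriptions are simple enough to experiment with directly. The following Python sketch (a toy implementation, with names of my own choosing) forms the direct sum of two uniform matroids on the level of basis systems and confirms that basis counts multiply and ranks add:

    from itertools import combinations

    def uniform_bases(k, ground):
        # basis system of the uniform matroid of rank k on the given ground set
        return {frozenset(b) for b in combinations(ground, k)}

    def direct_sum(B1, B2):
        # Definition 3.6.1: unions of a basis of M1 with a basis of M2
        return {b1 | b2 for b1 in B1 for b2 in B2}

    B1 = uniform_bases(1, ['a', 'b'])        # U_1(2)
    B2 = uniform_bases(2, [1, 2, 3])         # U_2(3)
    B = direct_sum(B1, B2)
    assert len(B) == len(B1) * len(B2)       # the number of bases multiplies
    assert all(len(b) == 1 + 2 for b in B)   # and the ranks add
    print(f"{len(B)} bases, each of size 3")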
3.7 Duality
Definition 3.7.1. Let M be a matroid on ground set E with basis system B. The dual matroid of M
(also known as the orthogonal matroid) is the matroid M ∗ on E with basis system
B ∗ = {E \ B : B ∈ B}.
We often write e∗ for elements of the ground set when talking about their behavior in the dual matroid.
Clearly the elements of B∗ all have cardinality |E| − r(M) (where r is the rank), and complementation swaps the basis exchange conditions (B2) and (B2′), so if you believe that those conditions are logically equivalent (Exercise 3.2) then you also believe that B∗ is a matroid basis system.
It is immediate from the definition that (M∗)∗ = M. In addition, the independent sets of M are the complements of the spanning sets of M∗ (since A ⊆ B for some B ∈ B if and only if E ∖ A ⊇ E ∖ B), and vice versa. The rank function r∗ of the dual is given by
r∗(A) = r(E ∖ A) + |A| − r(E)
(see Exercise 3.7).
The dual of a vector matroid has an explicit description. Let E = {v_1, . . . , v_n} ⊆ k^r, and let M = M(E). We may as well assume that E spans k^r, so r ≤ n, and the representing matrix X = [v_1 | · · · | v_n] ∈ k^{r×n} has full row rank r.
Let Y be any (n − r) × n matrix with rowspace(Y ) = nullspace(X). That is, the rows of Y span the
orthogonal complement of rowspace(X) with respect to the standard inner product.
Theorem 3.7.2. With this setup, the columns of Y are a representation for M ∗ .
Before proving this theorem, we’ll do an example that will make things clearer.
Example 3.7.3. Let E = {v_1, . . . , v_5} be the set of column vectors of the following matrix (over R, say):

    X = [ 1 0 0 2 1 ]
        [ 0 1 0 2 1 ]
        [ 0 0 1 0 0 ]

Notice that X has full row rank (it's in row-echelon form, after all), so it represents a matroid of rank 3 on 5 elements. We could take Y to be the matrix

    Y = [ 0 0 0 1 −2 ]
        [ 1 1 0 0 −1 ]

Then Y has rank 2. Call its columns {v_1∗, . . . , v_5∗}; then the column bases of Y are
{v_1∗, v_4∗}, {v_1∗, v_5∗}, {v_2∗, v_4∗}, {v_2∗, v_5∗}, {v_4∗, v_5∗},
whose (unstarred) complements ({v_2, v_3, v_5}, etc.) are precisely the column bases for X. In particular, every basis of M contains v_3 (so v_3 is a coloop), which corresponds to the fact that no basis of M∗ contains v_3∗ (so v_3∗ is a loop). This makes sense linear-algebraically: v_3 does not lie in the span of the other columns, so every vector in nullspace(X) is orthogonal to the third row of X (namely e_3) and hence has third coordinate zero; that is, v_3∗ is the zero vector. J
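Theorem 3.7.2 makes this example easy to verify mechanically. The following Python sketch (using sympy; any matrix whose rows span nullspace(X) works equally well in place of the Y above, since only the row space matters) checks that the column bases of such a matrix are exactly the complements of the column bases of X:

    from itertools import combinations
    from sympy import Matrix

    X = Matrix([[1, 0, 0, 2, 1],
                [0, 1, 0, 2, 1],
                [0, 0, 1, 0, 0]])
    Y = Matrix.vstack(*[v.T for v in X.nullspace()])   # rows span nullspace(X)

    n, r = X.cols, X.rank()

    def col_bases(M, size):
        return {S for S in combinations(range(n), size)
                if M[:, list(S)].rank() == size}

    complements = {tuple(sorted(set(range(n)) - set(B))) for B in col_bases(X, r)}
    assert col_bases(Y, n - r) == complements
    print(sorted(col_bases(Y, n - r)))   # [(0, 3), (0, 4), (1, 3), (1, 4), (3, 4)]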
Proof of Theorem 3.7.2. First, note that invertible row operations on a matrix X ∈ kr×n (i.e., multiplication
on the left by an element of GLr (k)) do not change the matroid represented by its columns; they simply
change the basis of kr .
Let B be a basis of M , and reindex so that B = {v1 , . . . , vr }. We can then perform invertible row-operations
to put X into reduced row-echelon form:
X = [I_r | A],
where I_r is the r × r identity matrix and A is arbitrary. It is easy to check that nullspace(X) equals the row space of
X∗ = [−A^T | I_{n−r}]
(this is a standard recipe). But then the last n − r columns of X∗, i.e., those indexed by E ∖ B, clearly form a column basis.
By the same logic, the complement of every column basis of X is a column basis of X∗, and the converse holds because X can be obtained from X∗ in the same way that X∗ is obtained from X. Therefore the columns of X and X∗ represent dual matroids. Meanwhile, any matrix Y with the same row space as X∗ can be obtained from it by invertible row operations, hence represents the same matroid.
Duality and graphic matroids. Let G be a connected planar graph, i.e., one that can be drawn in the
plane with no crossing edges. The planar dual is the graph G∗ whose vertices are the regions into which G
divides the plane, with two vertices of G∗ joined by an edge e∗ if the corresponding faces of G are separated
by an edge e of G. (So e∗ is drawn across e in the construction.)
[Figure: a planar graph G (solid) and its planar dual G∗ (dashed); each dual edge e∗ crosses the corresponding edge e, and each dual vertex f∗ sits inside a face f of G.]
The key fact here is that for a connected planar graph G, M(G∗) ≅ M(G)∗: cycles of G correspond to bonds of G∗ and vice versa.
If G is not planar then in fact M (G)∗ is not a graphic matroid (although it is certainly regular).
Definition 3.7.4. Let M be a matroid on E. A loop is an element of E that does not belong to any basis of M. A coloop is an element of E that belongs to every basis of M. An element of E that is neither a loop nor a coloop is called ordinary (probably not standard terminology, but natural and useful).
In a linear matroid, a loop is a copy of the zero vector, while a coloop is a vector that is not in the span of
all the other vectors.
A cocircuit of M is by definition a circuit of the dual matroid M∗. A matroid can be described by its cocircuit system, which satisfies the same axioms as a circuit system (Definition 3.4.6). Set-theoretically, a cocircuit is a minimal set not contained in any basis of M∗, so it is a minimal set that intersects every basis of M nontrivially. For a connected graph G, the cocircuits of the graphic matroid M(G) are the bonds of G: the minimal edge sets K such that G − K is not connected. Every bond C∗ has the following form: there is a partition V(G) = X ∪· Y such that C∗ is the set of edges with one endpoint in each of X and Y, and both G|X and G|Y are connected.
3.8 Deletion and contraction

Definition 3.8.1. Let M be a matroid on ground set E, and let e ∈ E. The deletion M∖e is the matroid on E − e whose independent sets are the independent sets of M that avoid e. The contraction M/e is the matroid on E − e whose independent sets are the sets I with I + e independent in M (if e is not a loop), or simply the independent sets of M∖e (if e is a loop).

We can also describe deletion and contraction on the level of basis systems:

B(M∖e) = { {B ∈ B(M) : e ∉ B}          if e is not a coloop,
           {B − e : B ∈ B(M)}           if e is a coloop;

B(M/e) = { {B − e : B ∈ B(M), e ∈ B}   if e is not a loop,
           {B : B ∈ B(M)}               if e is a loop.
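These case descriptions translate directly into code. Here is a minimal Python sketch (function names mine), applied to the bases of M(K3), i.e., all pairs from a three-element edge set:

    from itertools import combinations

    # bases of the graphic matroid of the triangle: every pair of edges
    B = {frozenset(s) for s in combinations(range(3), 2)}

    def delete(B, e):
        if all(e in b for b in B):            # e is a coloop
            return {b - {e} for b in B}
        return {b for b in B if e not in b}

    def contract(B, e):
        if all(e not in b for b in B):        # e is a loop
            return set(B)
        return {b - {e} for b in B if e in b}

    print(sorted(map(sorted, delete(B, 2))))    # [[0, 1]]: the path has one spanning tree
    print(sorted(map(sorted, contract(B, 2))))  # [[0], [1]]: the digon has two bases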
Again, the terms come from graph theory. Deleting an edge e of a graph G means removing it from the
graph, while contracting an edge means to shrink it down so that its two endpoints merge into one. The
resulting graphs are called G\e and G/e, and these operations are consistent with the effect on graphic
matroids, i.e.,
M (G\e) = M (G)\e, M (G/e) = M (G)/e. (3.7)
[Figure: a graph G with an edge e; the deletion G − e; and the contraction G/e, in which the endpoints of e are merged into a single vertex.]
Notice that contracting can cause some edges to become parallel, and can cause other edges (namely, those
parallel to the edge being contracted) to become loops. In matroid language, deleting an element from a
simple matroid always yields a simple matroid, but the same is not true for contraction.
We can define deletion and contraction of sets as well as single elements. To delete (resp., contract) a set,
simply delete (resp., contract) each of its elements in some order.
Proposition 3.8.2. Let M be a matroid on E.
1. For each A ⊆ E, the deletion M \A and contraction M/A are well-defined (i.e., do not depend on the
order in which elements of A are deleted or contracted).
2. In particular,
I(M∖A) = {I ⊆ E ∖ A : I ∈ I(M)},
I(M/A) = {I ⊆ E ∖ A : I ∪ B ∈ I(M)},
where B is any basis of M|A (the resulting family does not depend on the choice of B).
3. Deletion and contraction commute in the following sense: for every e, f ∈ E we have (M/e)∖f ≅ (M∖f)/e.
4. Deletion and contraction are interchanged by duality:
(M∖e)∗ ≅ M∗/e∗ and (M/e)∗ ≅ M∗∖e∗.
Here is what deletion and contraction mean for vector matroids. Let V be a vector space over a field k, let
E ⊆ V be a set of vectors spanning V , let M = M (E), and let e ∈ E. Then:
1. M \e = M (E − e). (If we want to preserve the condition that the ground set spans the ambient space,
then e must not be a coloop.)
2. M/e is the matroid represented by the images of E − e in the quotient space V /ke. (Note that if e is
a loop then this quotient space is just V itself.)
Thus both operations preserve representability over k. For instance, to find an explicit representation of M/e,
apply a change of basis to V so that e is the ith standard basis vector, then simply erase the ith coordinate
of every vector in E − e.
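As a sanity check of this recipe, the following Python sketch contracts the coloop v_3 = e_3 in the matrix X of Example 3.7.3 (no change of basis is needed, since v_3 is already a standard basis vector) and confirms the rank-function description of contraction, r_{M/e}(A) = r(A + e) − r(e), for every subset A:

    from itertools import combinations
    from sympy import Matrix

    X = Matrix([[1, 0, 0, 2, 1],
                [0, 1, 0, 2, 1],
                [0, 0, 1, 0, 0]])
    e, rest = 2, [0, 1, 3, 4]          # contract column 2 (the vector e_3)
    Xc = X.extract([0, 1], rest)       # erase the 3rd coordinate of the other columns

    for k in range(len(rest) + 1):
        for A in combinations(range(len(rest)), k):
            rank_quotient = Xc[:, list(A)].rank() if A else 0
            cols = [rest[i] for i in A] + [e]
            assert rank_quotient == X[:, cols].rank() - X[:, [e]].rank()
    print("coordinate erasure computes the contraction M/e")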
Any matroid M′ obtained from M by some sequence of deletions and contractions is called a minor of M.
Proposition 3.8.3. Every minor of a graphic (resp., linear, uniform) matroid is graphic (resp., linear,
uniform).
Proof. The graphic case follows from (3.7), and the linear case from the previous discussion. For uniform matroids, the definitions imply that
U_k(n)∖e ≅ U_k(n − 1) and U_k(n)/e ≅ U_{k−1}(n − 1)
for every element e.
Many invariants of matroids can be expressed recursively in terms of deletion and contraction. The following
fact is immediate from Definition 3.8.1.
Proposition 3.8.4. Let M be a matroid on ground set E, and let b(M) denote the number of bases of M. Let e ∈ E; then
b(M) = { b(M∖e)             if e is a loop;
         b(M/e)             if e is a coloop;
         b(M∖e) + b(M/e)    otherwise.
Example 3.8.5. If M ≅ U_k(n), then b(M) = (n choose k), and the recurrence of Proposition 3.8.4 is just the Pascal recurrence (n choose k) = (n−1 choose k) + (n−1 choose k−1). J
Many other matroid invariants satisfy analogous recurrences involving deletion and contraction. In fact,
Proposition 3.8.4 is the tip of an iceberg that we will explore in Chapter 4.
3.9 Exercises
Exercise 3.1. Determine, with proof, all pairs of integers k ≤ n such that there exists a graph G with M(G) ≅ U_k(n). (Here U_k(n) denotes the uniform matroid of rank k on n elements; see Example 3.2.4.) Hint: Use Proposition 3.8.3.
Exercise 3.2. Prove the equivalence of the two forms of the basis exchange condition, (B2) and (B2′). (Hint: Examine |B ∖ B′|.)
Exercise 3.3. (Proposed by Kevin Adams.) Let B, B 0 be bases of a matroid M . Prove that there exists a
bijection φ : B \ B 0 → B 0 \ B such that B − e + φ(e) is a basis of M for every e ∈ B \ B 0 .
Exercise 3.4. Prove Proposition 3.4.5, which describes the cryptomorphism between matroid independence
systems and matroid basis systems.
Exercise 3.5. Prove Proposition 3.4.7, which describes the cryptomorphism between matroid independence
systems and matroid circuit systems. (Hint: The hardest part is showing that if C is a matroid circuit system
then the family I of sets containing no circuit satisfies (I3). Under the assumption that (I3) fails for some
pair I, J with |I| < |J|, use circuit exchange to build a sequence of collections of circuits in I ∪ J that avoid
more and more elements of I, eventually producing a circuit in J and thus producing a contradiction.)
Exercise 3.6. Let M be a matroid on ground set E. Suppose there is a partition of E into disjoint sets E_1, . . . , E_n such that r(E) = r(E_1) + · · · + r(E_n). Prove that M = ⊕_{i=1}^n M_i, where M_i = M|E_i. (Note: This fact provides an algorithm, albeit not necessarily an efficient one, for testing whether a matroid is connected.)
Exercise 3.7. Let M be a matroid on ground set E with rank function r : 2E → N. Prove that the rank
function r∗ of the dual matroid M ∗ is given by r∗ (A) = r(E \ A) + |A| − r(E) for all A ⊆ E.
Exercise 3.8. Let M be a matroid on E. A set S ⊆ E is called spanning if it contains a basis. Let S be
the set of all spanning sets.
(a) Express S in terms of (i) the rank function r of M; (ii) its closure operator A ↦ Ā; (iii) its lattice of flats L. (You don't have to prove anything; just give the construction.)
(b) Formulate axioms that could be used to define a matroid via its system of spanning sets. (Hint:
Describe spanning sets in terms of the dual matroid M ∗ .)
Exercise 3.9. Let E be a finite set and let ∆ be an abstract simplicial complex on E. Let w : E → R≥0 be any function; think of w(e) as the "weight" of e. For A ⊆ E, define w(A) = Σ_{e∈A} w(e). Consider the problem of maximizing w(A) over all facets A. A naive approach is the following greedy algorithm:
Step 1: Let A = ∅.
Step 2: If A is a facet of ∆, stop.
Otherwise, find e ∈ E \ A of maximal weight such that A + e ∈ ∆
(if there are several such e, pick one at random), and replace A with A + e.
Step 3: Repeat Step 2 until A is a facet of ∆.
This algorithm may or may not work for a given ∆ and w.
(a) Construct a simplicial complex and a weight function for which this algorithm does not produce a facet
of maximal weight. (Hint: The smallest example has |E| = 3.)
(b) Prove that the following two conditions are equivalent:
(i) The greedy algorithm produces a facet of maximal weight for every weight function w.
(ii) ∆ is a matroid independence system.
Note: This result does follow from Theorem 6.5.1. However, that is a substantial result, so don’t use it
unless you first do Exercise 6.8. It is possible to do this exercise by working directly with the definition of a
matroid independence system.
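For concreteness, here is a small Python sketch of the greedy algorithm above (the representation of ∆ as an explicit set of faces is my own choice), run on the independence complex of U_2(3), where by part (b) it is guaranteed to succeed:

    from itertools import combinations

    def greedy(faces, E, w):
        # Steps 1-3: repeatedly add a feasible element of maximal weight
        A = frozenset()
        while True:
            feasible = [e for e in E - A if A | {e} in faces]
            if not feasible:        # no face strictly contains A, so A is a facet
                return A
            A = A | {max(feasible, key=w)}

    E = {0, 1, 2}
    # independence complex of U_2(3): all subsets of E of size at most 2
    faces = {frozenset(s) for k in range(3) for s in combinations(E, k)}
    w = {0: 5, 1: 3, 2: 4}.get
    print(sorted(greedy(faces, E, w)))   # [0, 2], the facet of maximal weight 9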
Exercise 3.10. Prove Proposition 3.8.2.
Exercise 3.11. Let X and Y be disjoint sets of vertices, and let B be an X, Y -bipartite graph: that is,
every edge of B has one endpoint in each of X and Y. For V = {x_1, . . . , x_n} ⊆ X, a transversal of V is a set W = {y_1, . . . , y_n} ⊆ Y such that x_i y_i is an edge of B for every i. (The set of all edges x_i y_i is called a matching.)
Let I be the family of all subsets of X that have a transversal; in particular I is a simplicial complex.
Prove that I is in fact a matroid independence system by verifying that the donation condition (I3) holds.
(Suggestion: Write down an example or two of a pair of independent sets I, J with |I| < |J|, and use the
corresponding matchings to find a systematic way of choosing a vertex that J can donate to I.) These
matroids are called transversal matroids; along with linear and graphic matroids, they are the other
“classical” examples of matroids in combinatorics.
Exercise 3.12. (Requires a bit of abstract algebra.) Let n be a positive integer, and let ζ be a primitive nth
root of unity. The cyclotomic matroid Y_n is represented over Q by the numbers 1, ζ, ζ^2, . . . , ζ^{n−1}, regarded as elements of the cyclotomic field extension Q(ζ). Thus the rank of Y_n is the dimension of Q(ζ) as a Q-vector space, which is the Euler totient φ(n). Prove the following:
This problem is near and dear to my heart; the answer (more generally, a characterization of Yn for all n)
appears in [MR05].
Chapter 4

The Tutte Polynomial
Throughout this section, let M be a matroid on ground set E with rank function r, and let n = |E|.
For A ⊆ E, we define the corank and nullity of A by
corank(A) = r(E) − r(A),    nullity(A) = |A| − r(A).
Corank and nullity measure how far A is from being spanning and independent, respectively. That is, the
corank is the minimum number of elements needed to adjoin to A to produce a spanning set (i.e., to intersect
all cocircuits), while the nullity is the minimum number of elements needed to delete from A to produce an
independent set (i.e., to break all circuits).
Definition 4.1.1. The Tutte polynomial of M is
T_M = T_M(x, y) := Σ_{A⊆E} (x − 1)^{r(E)−r(A)} (y − 1)^{|A|−r(A)}. (4.1)
Example 4.1.2. If E = ∅ then T_M(x, y) = 1. Mildly less trivially, if every element is a coloop, then r(A) = |A| for all A, so
T_M = Σ_{A⊆E} (x − 1)^{n−|A|} = (x − 1 + 1)^n = x^n
by the binomial theorem. If every element is a loop, then the rank function is identically zero and we get
T_M = Σ_{A⊆E} (y − 1)^{|A|} = y^n.
J
Example 4.1.3. For uniform matroids, corank and nullity depend only on cardinality, making their Tutte
polynomials easy to compute. U1 (2) has one set with corank 1 and nullity 0 (the empty set), two singleton
sets with corank 0 and nullity 0, and one doubleton with corank 0 and nullity 1, so
T_{U_1(2)} = (x − 1) + 2 + (y − 1) = x + y.
Similarly, T_{U_1(3)} = x + y + y^2 and T_{U_2(3)} = x^2 + x + y.
J
Example 4.1.4. Let G be the graph below (known as the "diamond"): K4 with one edge removed.
[Figure: the diamond graph, with the four outer edges labeled a, b, c, d and the diagonal edge labeled e.]
(Its Tutte polynomial is computed by deletion/contraction in Example 4.1.8 below.) J
Many invariants of M can be obtained by specializing the variables x, y appropriately. Some easy ones:
1. T_M(2, 2) = Σ_{A⊆E} 1 = 2^{|E|}. (Or, if you like, |E| = log_2 T_M(2, 2).)
2. Consider T_M(1, 1). This kills off all summands whose corank is nonzero (i.e., all non-spanning sets) and all those whose nullity is nonzero (i.e., all non-independent sets). What's left are the bases, each of which contributes a summand of 1. So T_M(1, 1) = b(M), the number of bases. We previously observed that this quantity satisfies a deletion/contraction recurrence (Prop. 3.8.4); this will show up again soon.
3. Similarly, TM (1, 2) and TM (2, 1) count respectively the number of spanning sets and the number of
independent sets.
4. A little more generally, we can enumerate independent and spanning sets by their cardinality:
Σ_{A⊆E independent} q^{|A|} = q^{r(M)} T_M(1/q + 1, 1);
Σ_{A⊆E spanning} q^{|A|} = q^{r(M)} T_M(1, 1/q + 1).
5. T_M(0, 1) is (up to a sign) the reduced Euler characteristic (see (6.2)) of the independence complex of M:
T_M(0, 1) = Σ_{A⊆E} (−1)^{r(E)−r(A)} 0^{|A|−r(A)} = Σ_{A⊆E independent} (−1)^{r(E)−r(A)}
          = (−1)^{r(E)} Σ_{A∈I(M)} (−1)^{|A|}.
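Specializations like these are easy to test numerically. The following Python sketch (using sympy, with a small union-find routine computing graphic ranks) evaluates the corank-nullity sum (4.1) for the diamond graph of Example 4.1.4 and spot-checks two specializations:

    from itertools import combinations
    from sympy import symbols, expand

    x, y = symbols('x y')

    def rank(n, edge_list):
        # rank of an edge set in a graphic matroid = n - (number of components)
        parent = list(range(n))
        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v
        comps = n
        for (u, v) in edge_list:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru], comps = rv, comps - 1
        return n - comps

    def tutte(n, edges):
        r_E = rank(n, edges)
        return expand(sum((x - 1)**(r_E - rank(n, A)) * (y - 1)**(len(A) - rank(n, A))
                          for k in range(len(edges) + 1)
                          for A in combinations(edges, k)))

    T = tutte(4, [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)])   # the diamond
    print(T)                     # x**3 + 2*x**2 + 2*x*y + x + y**2 + y
    print(T.subs({x: 1, y: 1}))  # 8, the number of spanning trees
    print(T.subs({x: 2, y: 2}))  # 32 = 2**5, the number of edge subsets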
The fundamental theorem about the Tutte polynomial is that it satisfies a deletion/contraction recurrence.
In a sense it is the most general such invariant — we will give a “recipe theorem” that expresses any
deletion/contraction invariant as a Tutte polynomial specialization (more or less).
Theorem 4.1.5. The Tutte polynomial satisfies (and can be computed by) the following Tutte recurrence:
(T1) If E = ∅, then T_M = 1.
(T2a) If e ∈ E is a loop, then T_M = y T_{M∖e}.
(T2b) If e ∈ E is a coloop, then T_M = x T_{M/e}.
(T3) If e ∈ E is ordinary, then T_M = T_{M∖e} + T_{M/e}.
We can use this recurrence to compute the Tutte polynomial, by picking one element at a time to delete and
contract. The miracle is that it doesn’t matter what order we choose on the elements of E — all orders will
give the same final result! (In the case that M is a uniform matroid, then it is clear at this point that TM
is well-defined by the Tutte recurrence, because, up to isomorphism, M \e and M/e are independent of the
choices of e ∈ E.)
The Tutte recurrence says we can represent a calculation of T_M by a binary tree in which moving down corresponds to deleting or contracting:
[Figure: a deletion/contraction tree with root M and children M∖e and M/e, continuing recursively.]
Example 4.1.8. Consider the diamond of Example 4.1.4. One possibility is to recurse on edge a (or equivalently on b, c, or d). When we delete a, the edge d becomes a coloop, and contracting it produces a copy of K3. Therefore
T(G∖a) = x(x^2 + x + y)
by Example 4.1.7. Next, apply the Tutte recurrence to the edge b in G/a. The graph G/a∖b has a coloop c, contracting which produces a digon. Meanwhile, M(G/a/b) ≅ U_1(3). Therefore
T(G/a∖b) = x(x + y) and T(G/a/b) = x + y + y^2,
and putting the pieces together gives
T(G) = x(x^2 + x + y) + x(x + y) + x + y + y^2 = x^3 + 2x^2 + x + 2xy + y + y^2.
[Figure: the full deletion/contraction tree for this computation, with leaf contributions x^3, x^2 + x + y, x(x + y), and y(x + y).]
Proof of Theorem 4.1.5. Let M be a matroid on ground set E, let e ∈ E, and let r′ and r″ be the rank functions of M∖e and M/e respectively. To lighten the notation, write X = x − 1 and Y = y − 1, so that T_M = Σ_{A⊆E} X^{r(E)−r(A)} Y^{|A|−r(A)}. The definitions of rank function, deletion, and contraction imply that, for A ⊆ E − e,
r′(A) = r(A) and r″(A) = r(A + e) − r(e).

If e is a loop, then r(A + e) = r(A) = r′(A) for all A ⊆ E − e, and r(E) = r′(E − e). Pairing each A ⊆ E − e with A + e gives
T_M = Σ_{A⊆E−e} X^{r′(E−e)−r′(A)} (Y^{|A|−r′(A)} + Y^{|A|+1−r′(A)}) = (1 + Y) T_{M∖e} = y T_{M∖e},
which proves (T2a).

If e is a coloop, then r(A + e) = r(A) + 1 = r″(A) + 1 for all A ⊆ E − e, and r(E) = r″(E − e) + 1. Therefore
T_M = Σ_{A⊆E−e} ( X^{r(E)−r(A)} Y^{|A|−r(A)} + X^{r(E)−r(A+e)} Y^{|A+e|−r(A+e)} )
    = Σ_{A⊆E−e} X^{r″(E−e)+1−r″(A)} Y^{|A|−r″(A)} + Σ_{A⊆E−e} X^{r″(E−e)−r″(A)} Y^{|A|−r″(A)}
    = (X + 1) Σ_{A⊆E−e} X^{r″(E−e)−r″(A)} Y^{|A|−r″(A)} = x T_{M/e},
which proves (T2b).

For (T3), suppose that e is ordinary, so that r(E) = r′(E − e) = r″(E − e) + 1. Then
T_M = Σ_{A⊆E} X^{r(E)−r(A)} Y^{|A|−r(A)}
    = Σ_{A⊆E−e} ( X^{r(E)−r(A)} Y^{|A|−r(A)} + X^{r(E)−r(A+e)} Y^{|A+e|−r(A+e)} )
    = Σ_{A⊆E−e} X^{r′(E−e)−r′(A)} Y^{|A|−r′(A)} + Σ_{A⊆E−e} X^{(r″(E−e)+1)−(r″(A)+1)} Y^{|A|+1−(r″(A)+1)}
    = Σ_{A⊆E−e} X^{r′(E−e)−r′(A)} Y^{|A|−r′(A)} + Σ_{A⊆E−e} X^{r″(E−e)−r″(A)} Y^{|A|−r″(A)}
    = T_{M∖e} + T_{M/e}.
Some easy and useful observations (which illustrate, among other things, that both the corank-nullity and recursive forms are valuable tools):
1. The Tutte polynomial is multiplicative on direct sums, i.e., T_{M1⊕M2} = T_{M1} T_{M2}. This is probably easier to see from the corank-nullity generating function than from the recurrence.
2. Duality interchanges x and y, i.e.,
TM (x, y) = TM ∗ (y, x). (4.3)
This fact can be deduced either from the Tutte recurrence (since duality interchanges deletion and contraction; see Prop. 3.8.2) or from the corank-nullity generating function, by expressing r∗ in terms of r (see Exercise 3.7).
3. The Tutte recurrence implies that every coefficient of TM is a nonnegative integer, a property which is
not obvious from the closed formula (4.1).
4.2 Recipes
The Tutte polynomial is often referred to as “the universal deletion/contraction invariant for matroids”: every
invariant that satisfies a deletion/contraction-type recurrence can be recovered from the Tutte polynomial.
This can be made completely explicit: the results in this section describe how to “reverse-engineer” a general
deletion/contraction recurrence for a graph or matroid isomorphism invariant to express it in terms of the
Tutte polynomial.
Theorem 4.2.1 (Tutte Recipe Theorem for Matroids). Let u(M) be a matroid isomorphism invariant that satisfies a recurrence of the form
u(M) = { 1                          if E = ∅,
         X u(M/e)                   if e ∈ E is a coloop,
         Y u(M∖e)                   if e ∈ E is a loop,
         a u(M/e) + b u(M∖e)        if e ∈ E is ordinary,
where E denotes the ground set of M and X, Y, a, b are either indeterminates or numbers, with a, b ≠ 0. Then
u(M) = a^{r(M)} b^{n(M)} T_M(X/a, Y/b).
Proof. Denote by r(M ) and n(M ) the rank and nullity of M , respectively. Note that
r(M ) = r(M \e) = r(M/e) + 1 and n(M ) = n(M \e) + 1 = n(M/e)
whenever deletion and contraction are well-defined. Define a new matroid invariant
ũ(M) := a^{−r(M)} b^{−n(M)} u(M)
and rewrite the recurrence in terms of ũ, abbreviating r = r(M) and n = n(M), to obtain
a^r b^n ũ(M) = { 1                                     if E = ∅,
                 X a^{r−1} b^n ũ(M/e)                  if e ∈ E is a coloop,
                 Y a^r b^{n−1} ũ(M∖e)                  if e ∈ E is a loop,
                 a^r b^n ũ(M/e) + a^r b^n ũ(M∖e)       if e ∈ E is ordinary.
Setting X = xa and Y = yb, we see that ũ(M ) = TM (x, y) = TM (X/a, Y /b) by Theorem 4.1.5, and rewriting
in terms of u(M ) gives the desired formula.
The analogous recipe theorem for graph invariants (Theorem 4.2.2) allows a recurrence of the same general shape, where G = (V, E) and X, Y, a, b, c are either indeterminates or numbers (with b, c ≠ 0).
We omit the proof, which is similar to that of the previous result. A couple of minor complications are that
many deletion/contraction graph invariants involve the numbers of vertices or components, which cannot be
deduced from the matroid of a graph. Also, while deletion and contraction of a cut-edge of a graph produce
two isomorphic matroids, they do not produce two isomorphic graphs (so, no, that’s not a misprint in the
coloop case of Theorem 4.2.2). The invariant U is described by Bollobás as “the universal form of the Tutte
polynomial.”
4.3 Basis activities

We know that T_M(x, y) has nonnegative integer coefficients and that T_M(1, 1) is the number of bases of M. These observations suggest that we should be able to interpret the Tutte polynomial as a generating function for bases: that is, there should be combinatorially defined functions i, e : B(M) → N such that
T_M(x, y) = Σ_{B∈B(M)} x^{i(B)} y^{e(B)}.
In fact, this is the case. The tricky part is that i(B) and e(B) must be defined with respect to a total order
e1 < · · · < en on the ground set E, so they are not really invariants of B itself. However, another miracle
occurs: the Tutte polynomial itself is independent of the choice of total order.
Definition 4.3.1. Let M be a matroid on E with basis system B and let B ∈ B. For e ∈ B, the fundamental cocircuit of e with respect to B, denoted C∗(e, B), is the unique cocircuit contained in (E ∖ B) + e. That is,
C∗(e, B) = {e′ : B − e + e′ ∈ B}.
Dually, for e ∉ B, the fundamental circuit of e with respect to B, denoted C(e, B), is the unique circuit contained in B + e. That is,
C(e, B) = {e′ : B + e − e′ ∈ B}.
In other words, the fundamental cocircuit consists of e together with all elements outside B that can replace e in a basis exchange, while the fundamental circuit consists of e together with all elements of B that can be replaced by e.
Suppose that M = M(G), where G is a connected graph, and B is a spanning tree. For e ∈ B, the graph B − e has two components, say X and Y, and C∗(e, B) is the set of all edges of G with one endpoint in each of X and Y. Dually, if e ∉ B, then B + e has exactly one cycle, and that cycle is C(e, B).
If M is a vector matroid, then C∗(e, B) consists of all elements of E not lying in the codimension-1 subspace spanned by B − e, and C(e, B) is the unique minimal linearly dependent subset of B + e.
Definition 4.3.2. Let M be a matroid on a totally ordered ground set E = {e_1 < · · · < e_n}, and let B be a basis of M. An element e ∈ B is internally active with respect to B if e is the minimal element of C∗(e, B). An element e ∉ B is externally active with respect to B if e is the minimal element of C(e, B). We set
i(B) = #{internally active elements of B},    e(B) = #{externally active elements of E ∖ B}.
[Figure for Example 4.3.3: a graph with edges e_1 < · · · < e_5 and a spanning tree B, used to illustrate internal and external activity.]
Theorem 4.3.4 (Tutte). Let M be a matroid on E. Fix a total ordering of E and let e(B) and i(B) denote
respectively the number of externally active and internally active elements with respect to B. Then
X
TM (x, y) = xi(B) y e(B) . (4.4)
B∈B(M )
For instance, in Example 4.3.3, the spanning tree B contributes the monomial xy = x^1 y^1 to T(G; x, y).
Tutte's original paper [Tut54] actually defined the Tutte polynomial (which he called the "dichromate") as Σ_{B∈B(M)} x^{i(B)} y^{e(B)} (rather than as the corank-nullity generating function), then proved that it obeys the deletion/contraction recurrence. Like the proof of Theorem 4.1.5, this result requires careful bookkeeping but is not conceptually difficult. Note in particular that if e is a loop (resp. coloop), then e ∉ B (resp. e ∈ B) for every basis B, and C(e, B) = {e} (resp. C∗(e, B) = {e}), so e is externally (resp. internally) active with respect to B, so the generating function (4.4) is divisible by y (resp. x).
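Given the complete list of bases, the descriptions C∗(e, B) = {e′ : B − e + e′ ∈ B} and C(e, B) = {e′ : B + e − e′ ∈ B} make Theorem 4.3.4 easy to test by machine. The following Python sketch recomputes the Tutte polynomial of the diamond graph from activities, ordering the edges by their indices, and recovers the polynomial of Example 4.1.8:

    from itertools import combinations
    from sympy import symbols, expand

    x, y = symbols('x y')
    n, edges = 4, [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]   # the diamond
    E = list(range(len(edges)))

    def ncomp(idxs):
        parent = list(range(n))
        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v
        c = n
        for i in idxs:
            ru, rv = find(edges[i][0]), find(edges[i][1])
            if ru != rv:
                parent[ru], c = rv, c - 1
        return c

    r = n - ncomp(E)
    bases = {B for B in combinations(E, r) if ncomp(B) == ncomp(E)}
    is_basis = lambda S: tuple(sorted(S)) in bases

    T = 0
    for B in bases:
        Bs = set(B)
        # e is active iff it is minimal in its fundamental (co)circuit
        i = sum(e == min(f for f in E if is_basis((Bs - {e}) | {f})) for e in Bs)
        ex = sum(e == min(f for f in (Bs | {e}) if is_basis((Bs | {e}) - {f}))
                 for e in set(E) - Bs)
        T += x**i * y**ex
    print(expand(T))   # x**3 + 2*x**2 + 2*x*y + x + y**2 + y, as in Example 4.1.8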
4.4 The characteristic and chromatic polynomials

We first show that the characteristic polynomial of a geometric lattice is a specialization of the Tutte polynomial of the corresponding matroid.
Theorem 4.4.1. Let M be a simple matroid on E with rank function r and lattice of flats L. Then
χ(L; k) = (−1)^{r(M)} T_M(1 − k, 0).
We now claim that f(K) = µ_L(0̂, K). For each flat K ∈ L, let
g(K) = Σ_{J∈L: J⊆K} f(J).
Theorem 4.4.1 gives another proof that the Möbius function of a semimodular lattice L weakly alternates in sign, specifically that (−1)^{r(L)} µ(L) ≥ 0 (Theorem 2.4.7). First, if L is not geometric, or equivalently not atomic, then by Corollary 2.4.10 µ(L) = 0. Second, if L is geometric, then by (2.6) and Theorem 4.4.1,
(−1)^{r(L)} µ(L) = (−1)^{r(L)} χ(L; 0) = T_M(1, 0) ≥ 0,
since the Tutte polynomial has nonnegative coefficients.
The characteristic polynomial of a graphic matroid has a classical combinatorial interpretation in terms of
colorings. Let G = (V, E) be a connected graph. Recall that a k-coloring of G is a function f : V → [k], and
a coloring is proper if f (v) 6= f (w) whenever vertices v and w are adjacent. We showed in Example 2.3.5
that the function
pG (k) = number of proper k-colorings of G
is a polynomial in k, called the chromatic polynomial of G. In fact p_G(k) = k · χ_{K(G)}(k). We can also prove this fact via deletion/contraction.
• If G has a loop, then its endpoints automatically have the same color, so it’s impossible to color G
properly and pG (k) = 0.
• If G = Kn , then all vertices must have different colors. There are k choices for f (1), k − 1 choices for
f (2), etc., so pKn (k) = k(k − 1)(k − 2) · · · (k − n + 1).
• At the other extreme, the edgeless graph on n vertices has chromatic polynomial k^n, since every coloring is proper.
• If T is a tree with n vertices, then pick any vertex as the root; this imposes a partial order on the
vertices in which the root is 1̂ and each non-root vertex v is covered by exactly one other vertex p(v)
(its “parent”). There are k choices for the color of the root, and once we know f (p(v)) there are k − 1
choices for f(v). Therefore p_T(k) = k(k − 1)^{n−1}.
• If G has connected components G_1, . . . , G_s, then p_G(k) = ∏_{i=1}^s p_{G_i}(k). Equivalently, p_{G+H}(k) = p_G(k) p_H(k), where + denotes disjoint union of graphs.
Theorem 4.4.2. For every graph G,
p_G(k) = (−1)^{n−c} k^c · T_G(1 − k, 0),
where n is the number of vertices of G and c is the number of components. In particular, p_G(k) is a polynomial
function of k.
Proof. First, we show that the chromatic function satisfies the recurrence
p_G(k) = k^n                       if E = ∅;           (4.7)
p_G(k) = 0                         if G has a loop;     (4.8)
p_G(k) = (k − 1) p_{G/e}(k)        if e is a coloop;    (4.9)
p_G(k) = p_{G∖e}(k) − p_{G/e}(k)   otherwise.           (4.10)
We already know (4.7) and (4.8). Suppose e = xy is not a loop. Let f be a proper k-coloring of G ∖ e. If f(x) = f(y), then we can identify x and y to obtain a proper k-coloring of G/e. If f(x) ≠ f(y), then f is a proper k-coloring of G. This correspondence is a bijection, so (4.10) follows.
This argument applies even if e is a coloop. In that case, however, the component H of G containing e splits into two components H′ and H″ of G ∖ e, whose colorings can be chosen independently of each other. By symmetry, f(x) = f(y) in exactly a 1/k fraction of the proper colorings of G ∖ e, so p_{G/e}(k) = p_{G∖e}(k)/k, and (4.9) follows from (4.10).
The graph G ∖ e has n vertices and either c + 1 or c components, according as e is or is not a coloop. Meanwhile, G/e has n − 1 vertices and c components, so by induction (−1)^{n−c} k^c T_{G/e}(1 − k, 0) = −p_{G/e}(k). Therefore
(−1)^{n−c} k^c T_G(1 − k, 0)
  = { k^n                                                          if E = ∅,
      0                                                            if e is a loop,
      (1 − k)(−1)^{n−c} k^c T_{G/e}(1 − k, 0)                      if e is a coloop,
      (−1)^{n−c} k^c ( T_{G∖e}(1 − k, 0) + T_{G/e}(1 − k, 0) )     otherwise
  = { k^n                        if E = ∅,
      0                          if e is a loop,
      (k − 1) p_{G/e}(k)         if e is a coloop,
      p_{G∖e}(k) − p_{G/e}(k)    otherwise
  = p_G(k), completing the induction.
More generally, if G is a graph with n vertices and c components, then its graphic matroid M = M (G) has
rank n − c, whose associated geometric lattice is the connectivity lattice K(G). Combining Theorems 4.4.1
and 4.4.2 gives
p_G(k) = k^c χ(K(G); k).
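The recurrences (4.7)–(4.10) can be run directly as an algorithm. Here is a Python sketch (a small implementation of my own, valid for multigraphs) that computes p_G(k) by deletion/contraction and compares the result with brute-force enumeration of colorings:

    from itertools import product

    def chromatic(n, edges, k):
        # the recurrence (4.7)-(4.10); vertices are 0, ..., n-1
        if not edges:
            return k**n
        (u, v), rest = edges[0], edges[1:]
        if u == v:                    # a loop: no proper colorings at all
            return 0
        # deletion minus contraction; contracting merges v into u
        merged = [(u if a == v else a, u if b == v else b) for (a, b) in rest]
        relabeled = [(a - (a > v), b - (b > v)) for (a, b) in merged]
        return chromatic(n, rest, k) - chromatic(n - 1, relabeled, k)

    def chromatic_brute(n, edges, k):
        return sum(all(f[a] != f[b] for (a, b) in edges)
                   for f in product(range(k), repeat=n))

    E = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]   # the diamond once more
    assert all(chromatic(4, E, k) == chromatic_brute(4, E, k) for k in range(1, 5))
    print([chromatic(4, E, k) for k in range(1, 5)])   # [0, 0, 6, 48]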
4.5 Acyclic orientations

An orientation of a graph G is a choice of one direction for each edge. An orientation is acyclic if it has no directed cycles. Let A(G) be the set of acyclic orientations of G, and let a(G) = |A(G)|. For example, a graph with a loop has no acyclic orientations, while an acyclic orientation of K_n is the same thing as a total ordering of its vertices, so a(K_n) = n!.
Colorings and orientations are intimately connected. Given a proper coloring f : V(G) → [k], one can naturally define an acyclic orientation by directing each edge from the smaller to the larger color. (So the count a(K_n) = n! above is a special case of this.) The connection between them is the prototypical example of what is called combinatorial reciprocity.
A (compatible) k-pair for a graph G = (V, E) is a pair (O, f), where O is an acyclic orientation of G and f : V → [k] is a coloring such that f(x) ≤ f(y) for every directed edge x → y in O. Let C(G) = C(G, k) be the set of compatible k-pairs of G (we can safely drop k from the notation).
Theorem 4.5.1 (Stanley's Acyclic Orientation Theorem). For every graph G and positive integer k,
|C(G, k)| = (−1)^n p_G(−k) = k^c T_G(1 + k, 0). (4.11)
Proof. The second equality follows from Theorem 4.4.2, so we prove the first one. Let n = |G|.
If G has no edges then |C(G)| = k n = (−1)n (−k)n = (−1)n pG (−k), confirming (4.11).
If G has a loop then it has no acyclic orientations, hence no k-pairs for any k, so both sides of (4.11) are
zero.
Let e = xy be an edge of G that is not a loop. Denote the left-hand side of (4.11) by p̄_G(k). Then
p̄_G(k) = (−1)^n p_G(−k) = (−1)^n ( p_{G∖e}(−k) − p_{G/e}(−k) )
        = (−1)^n ( (−1)^n p̄_{G∖e}(k) − (−1)^{n−1} p̄_{G/e}(k) )
        = p̄_{G∖e}(k) + p̄_{G/e}(k),
so we need to show that |C(G)| satisfies the same recurrence.
Say that a pair (O, f ) ∈ C(G) is reversible (with respect to e) if reversing e produces a compatible pair
(O0 , f ); otherwise it is irreversible. (Reversibility is equivalent to saying that f (x) = f (y) and that G does
not contain a directed path from either endpoint of e to the other.) Let Crev (G) and Cirr (G) denote the sets
of reversible and irreversible compatible pairs, respectively.
If e is reversible, then contracting it to a vertex z and defining f(z) = f(x) = f(y) produces a compatible pair of
G/e. (The resulting orientation is acyclic because any directed cycle lifts to either a directed cycle in G, or an
oriented path between the endpoints of e, neither of which exists.) This defines a map ψ : Crev (G) → C(G/e),
which is 2-to-1 because ψ(O, f) = ψ(O′, f). Moreover, ψ is onto: any (O, f) ∈ C(G/e) can be lifted to (Õ, f̃) ∈ C(G) by defining f̃(x) = f̃(y) = f(z) and orienting e in either direction (the acyclicity of O means that there is no oriented path from either of x, y to the other in Õ). We conclude that
|C(G/e)| = |C_rev(G)| / 2. (4.12)
There is a map ω : C(G) → C(G − e) given by deleting e. I claim that ω is surjective, which is equivalent to saying that it is always possible to extend any element of C(G − e) to C(G) by choosing an appropriate orientation for e. (If f(x) < f(y), then O has no y, x-path by compatibility, so orienting e as x → y creates no directed cycle. If f(x) = f(y) and neither orientation of e were acyclic, then O would contain a directed path from each of x, y to the other, hence would not be acyclic.) The map ω is 1-to-1 on C_irr(G) but 2-to-1 on C_rev(G) (for the same reason as ψ). Therefore,
|C(G − e)| = |C_irr(G)| + |C_rev(G)| / 2. (4.13)
Adding (4.12) and (4.13) gives |C(G − e)| + |C(G/e)| = |C_irr(G)| + |C_rev(G)| = |C(G)|, establishing the recurrence and completing the proof.
In particular, if k = 1 then there is only one choice for f, and every acyclic orientation is compatible with it. This produces the following striking corollary (often referred to as “Stanley's theorem on acyclic orientations,” although Stanley himself prefers that name for the more general Theorem 4.5.1).
Theorem 4.5.2. The number of acyclic orientations of G is |pG (−1)| = TG (2, 0).
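Theorem 4.5.2 is easy to confirm by brute force on a small graph. The following Python sketch enumerates all 2^{|E|} orientations of the diamond graph and counts the acyclic ones; the answer, 18, agrees with T_G(2, 0) for the Tutte polynomial computed in Example 4.1.8, and with |p_G(−1)| = |(−1)(−2)(−3)^2|:

    from itertools import product

    E = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]   # the diamond

    def is_acyclic(n, arcs):
        # repeatedly strip sources; acyclic iff every vertex gets stripped
        arcs, verts = set(arcs), set(range(n))
        while verts:
            sources = {v for v in verts if all(h != v for (t, h) in arcs)}
            if not sources:
                return False
            verts -= sources
            arcs = {(t, h) for (t, h) in arcs if t not in sources}
        return True

    count = sum(is_acyclic(4, [(a, b) if d else (b, a) for (a, b), d in zip(E, ds)])
                for ds in product([0, 1], repeat=len(E)))
    print(count)   # 18 = T_G(2, 0)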
Combinatorial reciprocity can be viewed geometrically. For more detail, look ahead to Section 5.5 and/or
see a source such as Beck and Robins [BR07], but here is a brief taste.
Let G be a simple graph on n vertices. The graphic arrangement A_G is the set of all hyperplanes in R^n defined by the equations x_i = x_j where ij is an edge of G. The complement R^n ∖ A_G consists of
finitely many disjoint open polyhedra (the “regions” of the arrangement), each of which is defined by a set
of inequalities, including either xi < xj or xi > xj for each edge. Thus each region naturally gives rise
to an orientation of G, and it is not hard to see that the regions are in fact in bijection with the acyclic
orientations. Meanwhile, a k-coloring of G can be regarded as an integer point in the cube [1, k]^n ⊆ R^n, and
a proper coloring corresponds to a point that does not lie on any hyperplane in AG . In this setting, Stanley’s
theorem is an instance of something more general called Ehrhart reciprocity (which I will add notes on at
some point).
4.6 The Tutte polynomial and linear codes

Definition 4.6.1. A linear code C is a subspace of (F_q)^n, where q is a prime power and F_q is the field of order q. The number n is the length of C. The elements c = (c_1, . . . , c_n) ∈ C are called codewords. The support of a codeword is supp(c) = {i ∈ [n] : c_i ≠ 0}, and its weight is wt(c) = |supp(c)|. The weight enumerator of C is the polynomial
W_C(t) = Σ_{c∈C} t^{wt(c)}.
For example, let C be the subspace of F_2^3 generated by the rows of the matrix

    X = [ 1 0 1 ]
        [ 0 1 1 ]  ∈ (F_2)^{2×3}.

Then C = {000, 011, 101, 110}, so W_C(t) = 1 + 3t^2.
The dual code C⊥ is the orthogonal complement of C under the standard inner product. This inner product is nondegenerate, i.e., dim C⊥ = n − dim C. (Note, though, that a subspace and its orthogonal complement can intersect nontrivially; a space can even be its own orthogonal complement, such as {00, 11} ⊆ F_2^2. This does not happen over R, where the inner product is not only nondegenerate but positive-definite; "positive" does not make sense over a finite field.) In this case, C⊥ = {000, 111} and W_{C⊥}(t) = 1 + t^3.
Theorem 4.6.2 (Curtis Greene, 1976). Let C be a linear code of length n and dimension r over F_q, and let M be the matroid represented by the columns of a matrix X whose rows are a basis for C. Then
W_C(t) = t^{n−r} (1 − t)^r T_M( (1 + (q − 1)t)/(1 − t), 1/t ).
The proof is a deletion/contraction argument. As an example, if C = {000, 011, 101, 110} ⊆ F_2^3 as above, then the matroid M is U_2(3). Its Tutte polynomial is x^2 + x + y, and Greene's theorem gives
W_C(t) = t(1 − t)^2 T_M( (1 + t)/(1 − t), 1/t ) = 1 + 3t^2,
as noted above (calculation omitted).
If X⊥ is a matrix whose rows are a basis for the dual code, then the corresponding matroid M⊥ is precisely the dual matroid to M. We know that T_M(x, y) = T_{M⊥}(y, x) by (4.3), so setting s = (1 − t)/(1 + (q − 1)t) (so that t = (1 − s)/(1 + (q − 1)s); isn't that convenient?) gives
W_{C⊥}(t) = t^r (1 − t)^{n−r} T_{M⊥}( (1 + (q − 1)t)/(1 − t), 1/t )
          = t^r (1 − t)^{n−r} T_M( (1 + (q − 1)s)/(1 − s), 1/s )
          = t^r (1 − t)^{n−r} s^{r−n} (1 − s)^{−r} W_C(s),
or, rewriting in terms of t,
W_{C⊥}(t) = ( (1 + (q − 1)t)^n / q^r ) W_C( (1 − t)/(1 + (q − 1)t) ),
which is known as the MacWilliams identity and is important in coding theory.
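The identity can be verified symbolically for the running example. In the following Python sketch (using sympy), the dual code is found by brute force from the orthogonality condition:

    from itertools import product
    from sympy import symbols, simplify

    t = symbols('t')
    q, n, r = 2, 3, 2
    C = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]       # the code C above
    Cperp = [v for v in product(range(q), repeat=n)
             if all(sum(a * b for a, b in zip(v, c)) % q == 0 for c in C)]

    W = lambda code: sum(t**sum(x != 0 for x in c) for c in code)

    lhs = W(Cperp)                                         # 1 + t**3
    rhs = (1 + (q - 1)*t)**n / q**r * W(C).subs(t, (1 - t)/(1 + (q - 1)*t))
    assert simplify(lhs - rhs) == 0
    print(W(C), '|', W(Cperp))   # 3*t**2 + 1 | t**3 + 1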
4.7 Exercises
Exercise 4.1. An orientation of a graph is called totally cyclic if every edge belongs to a directed cycle.
Prove that the number of totally cyclic orientations of G is TG (0, 2).
Exercise 4.2. Let G be a finite graph with n vertices, r edges, and k components. Fix an orientation O
on E(G). Let I(v) (resp., O(v)) denote the set of edges entering (resp., leaving) each vertex v. Let q be a
positive integer and Zq = Z/qZ. A nowhere-zero q-flow (or q-NZF) on G (with respect to O) is a function
φ : E(G) → Zq \ {0} satisfying the conservation law
Σ_{e∈I(v)} φ(e) = Σ_{e∈O(v)} φ(e)
for every v ∈ V(G). Let F_G^O(q) denote the set of nowhere-zero q-flows on G, and let f_G^O(q) = |F_G^O(q)|.
(i) Prove that f_G^O(q) depends only on the graph G, not on the choice of orientation (so we are justified in writing f_G(q)).
(Interestingly, Z/qZ can be replaced with any abelian group of cardinality q without affecting the result.)
Exercise 4.3. Let G = (V, E) be a graph with n vertices and c components. For a vertex coloring f : V → P,
let i(f ) denote the number of “improper” edges, i.e., whose endpoints are assigned the same color. The
(Crapo) coboundary polynomial of G is
χ̄_G(q; t) = q^{−c} Σ_{f: V→[q]} t^{i(f)}.
This is evidently a stronger invariant than the chromatic polynomial of G, which can be obtained as p_G(q) = q^c χ̄_G(q; 0). In fact, the coboundary polynomial provides the same information as the Tutte polynomial. Prove that
χ̄_G(q; t) = (t − 1)^{n−c} T_G( (q + t − 1)/(t − 1), t ).
Exercise 4.4. Let M be a matroid on E and let 0 ≤ p ≤ 1. The reliability polynomial RM (p) is the
probability that the rank of M stays the same when each ground set element is independently retained with
probability p and deleted with probability 1 − p. (In other words, we have a family of i.i.d. random variables
{Xe : e ∈ E}, each of which is 1 with probability p and 0 with probability 1−p. Let A = {e ∈ E : Xe = 1}.
Then RM (p) is the probability that r(A) = r(E).) Give a formula for RM (p) in terms of the Tutte polynomial,
using
(a) the definition of the Tutte polynomial as the corank/nullity generating function;
(b) the Tutte Recipe Theorem.
Exercise 4.5. Prove Merino’s theorem on critical configurations of the chip-firing game. (This needs details!)
Exercise 4.6. Prove Theorem 4.3.4.
Much, much more about the Tutte polynomial can be found in [BO92], the MR review of which begins,
“The reviewer, having once worked on that polynomial himself, is awed by this exposition of its present
importance in combinatorial theory.” (The reviewer was one W.T. Tutte.)
Chapter 5
Hyperplane Arrangements
An excellent source for the combinatorial theory of hyperplane arrangements is Stanley’s book chapter
[Sta07], which is accessible to newcomers, and includes a self-contained treatment of topics such as the
Möbius function and characteristic polynomial. Another canonical (but harder) source is the monograph by
Orlik and Terao [OT92].
Definition 5.1.1. Let k be a field, typically either R or C, and let n ≥ 1. A linear hyperplane in k^n is a vector subspace of codimension 1. An affine hyperplane is a translate of a linear hyperplane. A hyperplane arrangement A is a finite set of (distinct) hyperplanes H_1, . . . , H_m ⊆ k^n. The number n is called the dimension of A, and the space k^n is its ambient space. The intersection poset L(A) is the poset of all intersections of subsets of A, ordered by reverse inclusion. If B ⊆ A is a subset of hyperplanes, we write ∩B for ⋂_{H∈B} H. The characteristic polynomial of A is
χ_A(t) = Σ_{x∈L(A)} µ(0̂, x) t^{dim x}. (5.1)
This is essentially the same as the characteristic polynomial of L(A), up to a correction factor that we will
explain soon.
Example 5.1.2. Two line arrangements in R2 are shown in Figure 5.1. The arrangement A1 consists of
the lines x = 0, y = 0, and x = y. The arrangement A2 consists of the four lines `1 , `2 , `3 , `4 given by the
equations y = 1, x = y, x = −y, y = −1 respectively. The intersection posets L(A1) and L(A2) are shown in Figure 5.2; the characteristic polynomials are t^2 − 3t + 2 and t^2 − 4t + 5 respectively. J
Example 5.1.3. The Boolean arrangement Booln (or coordinate arrangement) consists of the n
coordinate hyperplanes in n-space. Its intersection poset is the Boolean algebra Booln (I make no apologies
for abusing notation by referring to the arrangement and the poset with the same symbol). More generally,
any arrangement whose intersection poset is Boolean might be referred to as a Boolean arrangement. J
Example 5.1.4. The braid arrangement Br_n consists of the (n choose 2) hyperplanes x_i = x_j (for 1 ≤ i < j ≤ n) in n-space. Its intersection poset is naturally identified with the partition lattice Π_n. This is simply because any set of equalities among x_1, . . . , x_n defines an equivalence relation on [n], and certainly every equivalence relation can be obtained in this way. For instance, the intersection poset of Br_3 is as follows:
[Figure: the intersection poset of Br3, with R^3 at the bottom (partition 1|2|3) and the line x = y = z at the top (partition 123).]
Figure 5.1: The two line arrangements A1 and A2 of Example 5.1.2, with the lines of A2 labeled ℓ1, ℓ2, ℓ3, ℓ4.
Figure 5.2: The intersection posets L(A1) and L(A2).
Note that the poset Π_3 = L(Br_3) has characteristic polynomial t^2 − 3t + 2, but the arrangement Br_3 has characteristic polynomial t^3 − 3t^2 + 2t. J
Example 5.1.5. If G = (V, E) is a simple graph on vertex set V = [n], then the corresponding graphic
arrangement AG is the subarrangement of Brn consisting of those hyperplanes xi = xj for which ij ∈ E.
Thus Brn itself is the graphic arrangement of the complete graph Kn . Moreover, the intersection poset of
AG is precisely the connectivity lattice K(G) defined in Example 1.2.3. J
Figure 5.3 shows some hyperplane arrangements in R3 . Note that every hyperplane in Brn contains the line
x1 = x2 = · · · = xn ,
so projecting R4 along that line allows us to picture Br4 as an arrangement ess(Br4 ) in R3 . (The symbol
“ess” means essentialization, to be defined precisely soon.) The second two figures were produced using the
computer algebra system Sage [S+ 14].
The poset L(A) is the fundamental combinatorial invariant of A. Some easy observations:
Figure 5.3: Three hyperplane arrangements in R^3: Bool3, Br3, and ess(Br4).
1. The intersection poset is unchanged by translation: L(A) ≅ L({H + v : H ∈ A}) for any v ∈ k^n. In fact, the intersection poset is invariant under any affine transformation. (The group of affine transformations is generated by the invertible linear transformations together with translations.)
2. The poset L(A) is a meet-semilattice, with meet given by ∩B ∧ ∩C = ∩(B ∩ C) for all B, C ⊆ A. Its 0̂
element is ∩∅, which by convention is kn .
3. L(A) is ranked, with rank function r(X) = n − dim X. To see this, observe that each covering relation
X l Y comes from intersecting an affine linear subspace X with a hyperplane H that neither contains nor
is disjoint from X, so that dim(X ∩ H) = dim X − 1.
4. L(A) has a 1̂ element if and only if the center ∩A is nonempty. Such an arrangement is called central.
In this case L(A) is a lattice (and may be referred to as the intersection lattice of A). Since translation
does not affect whether an arrangement is central (or indeed any of its combinatorial structure), we will
typically assume that ∩A contains the zero vector, which is to say that every hyperplane in A is a linear
hyperplane in kn . (So an arrangement is central if and only if it is a translation of an arrangement of linear
hyperplanes.)
5. When A is central, the lattice L(A) is geometric. It is atomic by definition, and it is submodular because it is a sublattice of the chain-finite modular lattice L(k^n)∗ (the lattice of all subspaces of k^n ordered by reverse inclusion). The associated matroid M(A) = M(L(A)) is represented over k by any family of vectors {n_H : H ∈ A}, where n_H is normal to H (that is, H⊥ is the line spanned by n_H with respect to some fixed non-degenerate bilinear form on k^n). Any normals will do, since the matroid is unchanged by scaling the n_H independently.
Therefore, all of the tools we have developed for looking at posets, lattices and matroids can be applied to
study hyperplane arrangements.
The dimension of an arrangement is not a combinatorial invariant; that is, it cannot be extracted from
the intersection poset. If Br4 were a “genuine” 4-dimensional arrangement then we would not be able to
represent it in R3 . However, we can do so because the center of Br4 has positive dimension, so squashing the
center to a point reduces the ambient dimension without changing the intersection poset. This observation
motivates the following definition.
Definition 5.1.6. Let A ⊆ k^n be an arrangement, and let N(A) = ⟨n_H : H ∈ A⟩ be the span of the normal vectors. The essentialization of A is the arrangement
ess(A) = {H ∩ N(A) : H ∈ A} ⊆ N(A).
We say that A is essential if ess(A) = A, or equivalently if N(A) = k^n. Note that L(ess(A)) ≅ L(A) as posets. The rank of A is the dimension of its essentialization. The characteristic polynomial of A is related to that of its intersection poset by
χ_A(t) = t^{dim A − dim N(A)} χ_{L(A)}(t) = t^{dim A − rank A} χ_{L(A)}(t). (5.2)
The two polynomials coincide for essential arrangements. For example, rank Brn = dim ess(Brn ) = n − 1,
and rank AG = r(G) = |V (G)| − c, where c is the number of connected components of G.
If A is central (so that, per the convention above, every hyperplane is linear), then we could define the essentialization by setting V = N(A)⊥ = ∩A and ess(A) = {H/V : H ∈ A} ⊆ k^n/V. Thus a central arrangement A is essential if and only if ∩A = 0. Moreover, if A is central then rank(A) is the rank of its intersection lattice; so rank, unlike dimension, is a combinatorial invariant.
There are two natural operations that go back and forth between central and non-central arrangements,
called projectivization and coning.
Definition 5.1.7. Let k be a field and n ≥ 1. The set of lines through the origin in k^n is called (n − 1)-dimensional projective space over k and denoted by P^{n−1}k.
If k = R, we can regard Pn−1 R as the unit sphere Sn−1 with opposite points identified. (In particular, it is
an (n − 1)-dimensional manifold, although it is orientable only if n is even.)
Algebraically, write x ∼ y if x and y are nonzero scalar multiples of each other. Then ∼ is an equivalence relation on k^n ∖ {0}, and P^{n−1}k is the set of equivalence classes.
Linear hyperplanes in k^n correspond to hyperplanes in P^{n−1}k. Thus, given a central arrangement A ⊆ k^n, we can construct its projectivization proj(A) ⊆ P^{n−1}k.
Projectivization supplies a nice way to draw central 3-dimensional real arrangements. Let S be the unit sphere, so that H ∩ S is a great circle for every H ∈ A; then regard H_0 ∩ S as the equator (for some fixed H_0 ∈ A) and project the northern hemisphere onto your piece of paper. Several examples are shown below. Of course, a diagram of proj(A) only shows the upper half of A; we can recover A from proj(A) by "reflecting the interior of the disc to the exterior" (Stanley); e.g., for the Boolean arrangement A = Bool3, the picture is as shown in the fourth figure below. In general, r(proj(A)) = (1/2) r(A).
The operation of coning is a sort of inverse of projectivization. It lets us turn a non-central arrangement
into a central arrangement, at the price of increasing the dimension by 1.
Definition 5.1.8. Let A ⊆ kn be a hyperplane arrangement, not necessarily central. The cone cA is the
central arrangement in kn+1 defined as follows:
• Geometrically: Make a copy of A in kn+1 , choose a point p not in any hyperplane of A, and replace
each H ∈ A with the affine span H 0 of p and H (which will be a hyperplane in kn+1 ). Then, toss in
one more hyperplane containing p and in general position with respect to every H 0 .
• Algebraically: For H = {x : L(x) = ai } ∈ A (with L a homogeneous linear form on kn and ai ∈ k),
construct a hyperplane H 0 = {(x1 , . . . , xn , y) : L(x) = ai y} ⊆ kn+1 in cA. Then, toss in the
hyperplane y = 0.
For example, if A consists of the points x = 1, x = −5, and x = 3 in R^1 (shown in red), then cA consists of the lines x = y, x = −5y, x = 3y, and y = 0 in R^2 (shown in blue).
[Figure: the cone cA, with the copy of A placed at height y = 1 and the extra hyperplane y = 0.]
5.2 Counting regions: examples

Let A ⊆ R^n be a real hyperplane arrangement. The regions of A are the connected components of R^n ∖ A. Each component is the interior of a (bounded or unbounded) polyhedron; in particular, it is homeomorphic to R^n. We call a region relatively bounded if the corresponding region in ess(A) is bounded. (If A is not essential then every region is unbounded, because it contains a translate of N(A)⊥, where N(A) is the space defined in Definition 5.1.6. Therefore passing to the essentialization is necessary to make the problem of counting bounded regions nontrivial for all arrangements.) Let
r(A) = the number of regions of A,    b(A) = the number of relatively bounded regions of A.
Example 5.2.2. The Boolean arrangement Bool_n consists of the n coordinate hyperplanes in R^n. It is a central, essential arrangement whose intersection lattice is the Boolean lattice of rank n; accordingly, χ_{Bool_n}(t) = (t − 1)^n. The complement R^n ∖ Bool_n is {(x_1, . . . , x_n) : x_i ≠ 0 for all i}, and the connected components are the open orthants, specified by the signs of the n coordinates. Therefore, r(Bool_n) = 2^n and b(Bool_n) = 0. J
Example 5.2.3. Let A consist of m lines in R^2 in general position: that is, no two lines are parallel and no three pass through a common point. Draw the dual graph G, whose vertices are the regions of A, with an edge between every two regions that share a common border.
[Figure: an arrangement of four lines in general position, with the dual graph G drawn on its regions.]
Let r = r(A) and b = b(A), and let v, e, f denote the numbers of vertices, edges and faces of G, respectively. (In the example above, (v, e, f) = (11, 16, 7).) Each bounded face of G is a quadrilateral that contains exactly one point where two lines of A meet, and the unbounded face is a cycle of length r − b. Therefore,
v = r, (5.3a)
f = 1 + (m choose 2) = (m^2 − m + 2)/2, (5.3b)
4(f − 1) + (r − b) = 2e. (5.3c)
Moreover, the number r − b of unbounded regions of A is just 2m. (Take a walk around a very large circle. You will enter each unbounded region once, and will cross each line twice.) Therefore, from (5.3c) and (5.3b) we obtain
e = m + 2(f − 1) = m^2. (5.3d)
Euler's formula for planar graphs says that v − e + f = 2. Substituting in (5.3a), (5.3b) and (5.3d) and solving for r gives
r = (m^2 + m + 2)/2,
and therefore
b = r − 2m = (m^2 − 3m + 2)/2 = ((m − 1) choose 2).
J
Example 5.2.4. The braid arrangement Br_n consists of the (n choose 2) hyperplanes H_ij = {x : x_i = x_j} in R^n.
The complement Rn \ Brn consists of all vectors in Rn with no two coordinates equal, and the connected
components of this set are specified by the ordering of the coordinates as real numbers:
[Figure: the projectivized picture of Br3, whose six regions correspond to the orderings x < y < z, x < z < y, y < x < z, y < z < x, z < x < y, z < y < x.]
Therefore, r(Br_n) = n!. (Stanley: "Rarely is it so easy to compute the number of regions!") Furthermore, b(Br_n) = 0, since Br_n is central and every region of a central arrangement contains a ray. Note that the braid arrangement is central but not essential; its center is the line x_1 = x_2 = · · · = x_n, so its rank is n − 1. J
Example 5.2.5. Let G = (V, E) be a simple graph with V = [n], and let AG be its graphic arrangement
(see Example 5.1.5). The characteristic polynomial of L(AG ) is precisely the chromatic polynomial of G (see
Section 4.4). We will see another explanation for this fact later; see Example 5.4.4.
The regions of Rn \ AG are the open polyhedra whose defining inequalities include either xi < xj or xi > xj
for each edge ij ∈ E. Those inequalities give rise to an orientation of G, and it is not hard to check that this correspondence is a bijection between regions and acyclic orientations. Hence r(A_G) = a(G), the number of acyclic orientations of G.
Example 5.2.5 motivates the main result of this section, Theorem 5.3.6, which was historically the first major
theorem about hyperplane arrangements, due to Zaslavsky [Zas75]. Let A be a real hyperplane arrangement,
and let χ_A be its characteristic polynomial, as in (5.1). Zaslavsky's Theorem(s) say(s) that the numbers of regions and relatively bounded regions are given by
r(A) = (−1)^{dim A} χ_A(−1) and b(A) = (−1)^{rank A} χ_A(1).
The proof combines geometry and combinatorics. Here is an overview of the steps:
1. Show that r and b satisfy restriction/contraction recurrences in terms of associated hyperplane ar-
rangements A0 and A00 (Prop. 5.3.2).
2. Rewrite the characteristic polynomial χA (k) as a sum over central subarrangements of A (the “Whitney
formula”, Prop. 5.3.3).
3. Show that the Whitney formula obeys a restriction/contraction recurrence (Prop. 5.3.4) and compare
it with those for r and b.
Let x ∈ L(A), i.e., x is a nonempty affine space formed by intersecting some of the hyperplanes in A. Define
A_x = {H ∈ A : H ⊇ x},
A^x = {H ∩ x : H ∈ A ∖ A_x}. (5.4)
In other words, A_x is obtained by deleting the hyperplanes not containing x, while A^x is obtained by restricting A to x so as to get an arrangement whose ambient space is x itself. The notation is mnemonic: L(A_x) and L(A^x) are isomorphic respectively to the principal order ideal and the principal order filter generated by x in L(A). That is,
L(A_x) ≅ {y ∈ L(A) : y ≤ x},    L(A^x) ≅ {y ∈ L(A) : y ≥ x}.
Example 5.3.1. Let A be the 2-dimensional arrangement shown on the left, with the line H and point p as shown. Then A_p and A^H are shown on the right.
[Figure: the arrangement A, with the line H and the point p marked; the subarrangement A_p; and the restriction A^H.]
The lattice L(A) and its subposets (in this case, sublattices) L(A_p) and L(A^H) are shown below.
[Figure: the lattices L(A), L(A_p), and L(A^H).]
Let M(A) be the matroid represented by the normal vectors {n_H : H ∈ A}. Fix a hyperplane H ∈ A and let A′ = A ∖ {H} and A″ = A^H.
Proposition 5.3.2. The invariants r and b satisfy the following recurrences:
1. r(A) = r(A′) + r(A″).
2. b(A) = { 0                 if rank A = rank A′ + 1 (i.e., if n_H is a coloop in M(A)),
            b(A′) + b(A″)     if rank A = rank A′ (i.e., if it isn't).
Proof. (1) Consider what happens when we add H to A′ to obtain A. Some regions of A′ will remain the same, while others will be split into two regions.

[Figure: A′ and A = A′ ∪ {H}, with each region of A′ marked as split or unsplit by H.]
Let S and U be the numbers of split and unsplit regions of A′ (in the figure above, S = 2 and U = 4). The unsplit regions each contribute 1 to r(A). The split regions each contribute 2 to r(A), but they also correspond bijectively to the regions of A″. (See, e.g., Example 5.3.1.) So

r(A) = U + 2S,   r(A′) = U + S,   r(A″) = S,

and so r(A) = r(A′) + r(A″), proving the first assertion of Proposition 5.3.2. By the way, if (and only if) H is a coloop then it borders every region of A, so r(A) = 2r(A′) in this case.
(2) Now we count bounded regions. If rank A = rank A′ + 1, then N(A′) ⊊ R^n, i.e., A′ is not essential. In that case, every region of A′ must contain a line (or possibly a bigger space) orthogonal to N(A′), which gets squashed down to a point upon essentialization. Therefore, every region of A contains a ray, and b(A) = 0.
This takes care of the first case. In the second case, the bounded regions of A come in a few different flavors.
[Figure: A (including the hyperplane H) and A′ = A \ {H}, with bounded regions labeled W, X, Y, Z and unbounded regions U.]

                                                     Contributions to. . .
  Description                                        b(A)   b(A′)   b(A″)
  (W) bounded regions that don’t touch H              1      1       0
  (X, Y) pairs of bounded regions separated by H      2      1       1
  (Z) bounded, neighbor across H is unbounded         1      0       1
In all cases the contribution to b(A) equals the sum of those to b(A0 ) and b(A00 ), establishing the second
desired recurrence.
Proposition 5.3.2 looks a lot like a Tutte polynomial deletion/contraction recurrence. This suggests that we
should be able to extract r(A) and b(A) from the characteristic polynomial χA . The first step is to find a
more convenient form for the characteristic polynomial.
Proposition 5.3.3 (Whitney formula for χ_A). For any hyperplane arrangement A,

χ_A(t) = \sum_{central B ⊆ A} (−1)^{|B|} t^{dim A − rank B}.
Proof. The atoms in the interval [0̂, x] are the hyperplanes of A containing x, and they form a lower crosscut of [0̂, x]. Therefore, by the crosscut theorem,

χ_A(t) = \sum_{x ∈ L(A)} µ(0̂, x) t^{dim x}
       = \sum_{x ∈ L(A)} \sum_{B ⊆ A: x = ∩B} (−1)^{|B|} t^{dim x}
       = \sum_{central B ⊆ A} (−1)^{|B|} t^{dim A − rank B},

as desired; the last step uses the facts that the subarrangements B with ∩B ≠ ∅ are exactly the central ones, and that dim ∩B = dim A − rank B. Note that the empty subarrangement is considered central for the purpose of this formula, corresponding to the summand x = 0̂ and giving rise to the leading term t^{dim A} of χ_A(t).
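The Whitney formula is easy to check by machine. Here is a minimal computational sketch (mine, not from the text): it evaluates the formula for the braid arrangement Br_3, represented by its normal vectors, using sympy for exact ranks and polynomial arithmetic. For a central arrangement every subarrangement is central, so we may simply sum over all subsets.

```python
# A sketch of Whitney's formula for a central arrangement; the test case
# (Br_3 and all names) is illustrative, not from the notes.
from itertools import combinations
from sympy import Matrix, symbols, expand, factor

t = symbols('t')

# Normals of the hyperplanes x_i = x_j of Br_3 in R^3.
normals = [(1, -1, 0), (1, 0, -1), (0, 1, -1)]
n = 3  # ambient dimension

def rank(B):
    return Matrix(list(B)).rank() if B else 0

chi = sum((-1) ** len(B) * t ** (n - rank(B))
          for k in range(len(normals) + 1)
          for B in combinations(normals, k))
print(factor(expand(chi)))  # t*(t - 1)*(t - 2)
```

The output t(t − 1)(t − 2) agrees with χ_{Br_3}(t) = t^3 − 3t^2 + 2t, hence r(Br_3) = (−1)^3 χ(−1) = 6 = 3!, as computed above.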
Proposition 5.3.4. Let A be a hyperplane arrangement in k^n. Then χ_A(t) = χ_{A′}(t) − χ_{A″}(t).

Proof. Split the Whitney formula for χ_A(t) as Σ1 + Σ2, where Σ1 runs over the central subarrangements B ⊆ A with H ∉ B and Σ2 over those with H ∈ B. Then Σ1 = χ_{A′}(t) (it is just Whitney’s formula for A′), so it remains to show that Σ2 = −χ_{A″}(t). This is a little trickier, because different hyperplanes in A can have the same intersection with H, which means that multiple subarrangements of A can give rise to the same subarrangement of A″.
Label the hyperplanes of A″ (which, remember, are codimension-1 subspaces of H) as K_1, . . . , K_s. For each i ∈ [s] let A_i = {J ∈ A : J ∩ H = K_i}. Each arrangement B arising as a summand of Σ2 gives rise to a central subarrangement of A″, namely

π(B) = {J ∩ H : J ∈ B, J ≠ H},

that depends only on which A_i’s satisfy A_i ∩ B ≠ ∅. That is, for each central subarrangement B″ ⊆ A″, the summands B of Σ2 such that π(B) = B″ are precisely the arrangements of the form

{H} ∪ \bigcup_{i: K_i ∈ B″} B_i,

where each B_i is a nonempty subset of A_i.
Now we break up the sum Σ2 into subsums depending on π(B):

Σ2 = \sum_{central B″ ⊆ A″} \sum_{B ∈ π^{−1}(B″)} (−1)^{|B|} t^{n − rank B}
   = \sum_{B″} \sum_{B ∈ π^{−1}(B″)} (−1)^{|B|} t^{dim H − rank B″}     (since rank B = rank B″ + 1)
   = −\sum_{B″} \Big( \prod_{i: K_i ∈ B″} \sum_{∅ ≠ B_i ⊆ A_i} (−1)^{|B_i|} \Big) t^{dim H − rank B″}

(to see the last equality, expand the product and observe that it equals the inner sum in the previous line; the outer minus sign is contributed by H, which is an element of every B). But \sum_{∅ ≠ B_i ⊆ A_i} (−1)^{|B_i|} = −1, because it is the binomial expansion of (1 − 1)^{|A_i|} = 0 with the single +1 term (namely B_i = ∅) removed. (Note that A_i ≠ ∅.) Therefore, the whole thing boils down to

−\sum_{B″} (−1)^{|B″|} t^{dim H − rank B″},

which is just Whitney’s formula for −χ_{A″}(t).
Remark 5.3.5. This recurrence is strongly reminiscent of the chromatic recurrence (4.10). Indeed, if A = A_G is a graphic arrangement in R^n, e is an edge of G, and H_e is the corresponding hyperplane in A_G, then it is clear that A_{G\e} = A_G \ {H_e}. In addition, two hyperplanes H_f, H_{f′} have the same intersection with H_e if and only if f, f′ become parallel upon contracting e, so A_{G/e} can be identified with (A_G)^{H_e} (where the coordinates on H_e ≅ R^{n−1} are given by equating the coordinates for the two endpoints of e).
We can now finish the proof of the main result. We have already done the hard work, and just need to put
all the pieces together.
Theorem 5.3.6 (Zaslavsky’s Theorem). Let A be a real hyperplane arrangement, and let χ_A be the characteristic polynomial of its intersection poset. Then

r(A) = (−1)^{dim A} χ_A(−1)  (5.6)

and

b(A) = (−1)^{rank A} χ_A(1).  (5.7)

Proof. Let r̃(A) and b̃(A) denote the numbers on the right-hand sides of (5.6) and (5.7), respectively.

If |A| = 1, then L(A) is the lattice with two elements, namely R^n and a single hyperplane H, and its characteristic polynomial is t^n − t^{n−1}. Thus r̃(A) = (−1)^n((−1)^n − (−1)^{n−1}) = 2 and b̃(A) = −(1 − 1) = 0, which match r(A) and b(A).
For the general case, we just need to show that r̃ and b̃ satisfy the same recurrences as r and b (see Prop. 5.3.2). First,

r̃(A) = (−1)^{dim A} χ_A(−1)
     = (−1)^{dim A} (χ_{A′}(−1) − χ_{A″}(−1))                    (by Prop. 5.3.4)
     = (−1)^{dim A′} χ_{A′}(−1) + (−1)^{dim A″} χ_{A″}(−1)       (since dim A″ = dim A − 1)
     = r̃(A′) + r̃(A″).
As for b̃, if rank A = rank A0 + 1, then in fact A0 and A00 have the same essentialization, hence the same
rank, and their characteristic polynomials only differ by a factor of t. The deletion/restriction recurrence
(Prop. 5.3.4) therefore implies b̃(A) = 0.
On the other hand, if rank A = rank A′, then rank A″ = rank A − 1, and a calculation similar to that for r̃ (replacing dimension with rank) shows that b̃(A) = b̃(A′) + b̃(A″).
Corollary 5.3.7. Let A ⊆ R^n be a central hyperplane arrangement and let M = M(A) be the matroid represented by its normals. Then r(A) = T_M(2, 0) and b(A) = 0.

Proof. Combine Zaslavsky’s theorem with the formula χ_A(t) = (−1)^n T_M(1 − t, 0), and use the fact that T_M(0, 0) = 0 for any matroid M with nonempty ground set.
Remark 5.3.8. The formula for r(A) could be obtained from the Tutte Recipe Theorem (Thm. 4.2.1). But
this would not work for b(A), which is not an invariant of M (A). (The matroid M (A) is not as meaningful
when A is not central, which is precisely the case that b(A) is interesting.)
Example 5.3.9. Let s ≥ n, and let A be an arrangement of s linear hyperplanes in general position in R^n; that is, every k hyperplanes intersect in a space of dimension n − k (or 0 if k > n). Equivalently, the corresponding matroid M is U_n(s), whose rank function r : 2^{[s]} → N is given by r(A) = min(n, |A|). Therefore,
r(A) = T_M(2, 0) = \sum_{A ⊆ [s]} (2 − 1)^{n − r(A)} (0 − 1)^{|A| − r(A)}
     = \sum_{A ⊆ [s]} (−1)^{|A| − r(A)}
     = \sum_{k=0}^{s} \binom{s}{k} (−1)^{k − min(n,k)}
     = \sum_{k=0}^{n} \binom{s}{k} + \sum_{k=n+1}^{s} \binom{s}{k} (−1)^{k−n}
     = \sum_{k=0}^{n} \binom{s}{k} (1 − (−1)^{k−n}) + \sum_{k=0}^{s} \binom{s}{k} (−1)^{k−n}
     = 2 \left( \binom{s}{n−1} + \binom{s}{n−3} + \binom{s}{n−5} + \cdots \right),

where the last step uses the facts that \sum_{k=0}^{s} \binom{s}{k} (−1)^{k−n} = ±(1 − 1)^s = 0 and that 1 − (−1)^{k−n} equals 2 when k − n is odd and 0 when it is even.
Notice that this is not the same as the number of regions formed by s affine lines in general position in R2 .
The calculation of r(A) and b(A) for that arrangement is left to the reader (Exercise 5.1).
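As a sanity check on this computation, the corank-nullity sum and the closed form can be compared directly for small values of n and s; the following sketch (the test values are mine, not from the text) does so.

```python
# Corank-nullity evaluation of T_M(2,0) for M = U_n(s), versus the closed
# form 2*(C(s,n-1) + C(s,n-3) + ...) derived above.
from math import comb

def regions(n, s):
    return sum(comb(s, k) * (-1) ** (k - min(n, k)) for k in range(s + 1))

def closed_form(n, s):
    return 2 * sum(comb(s, j) for j in range(n - 1, -1, -2))

for n, s in [(2, 4), (3, 5), (4, 7)]:
    assert regions(n, s) == closed_form(n, s)
print("ok")  # e.g. 4 generic lines through the origin in R^2 give 8 regions
```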
Corollary 5.3.10. Let A be an arrangement in which no two hyperplanes are parallel. Then A has at least one relatively bounded region if and only if it is noncentral. [To do: prove this and find a place for it, assuming it is true. The non-parallel assumption is necessary, since the conclusion fails for the arrangement with hyperplanes x = 0, y = 0, y = 1.]
5.4 The finite field method
The following very important result is implicit in the work of Crapo and Rota [CR70] and was stated
explicitly by Athanasiadis [Ath96]:
Theorem 5.4.1. Let F_q be the finite field of order q, and let A ⊆ F_q^n be a hyperplane arrangement. Then |F_q^n \ A| = χ_A(q).
This result gives a combinatorial interpretation of the values of the characteristic polynomial. In practice,
it is often used to calculate the characteristic polynomial of a hyperplane arrangement by counting points
in its complement over Fq (which can be regarded as regions of the complement, if you endow Fnq with the
discrete topology).
Proof #2. Start with the definition of the characteristic polynomial, letting r be the rank function in L(A):

χ_A(q) = \sum_{x ∈ L(A)} µ(0̂, x) q^{n − r(x)}
       = \sum_{x ∈ L(A)} µ(0̂, x) q^{dim x}
       = \sum_{x ∈ L(A)} µ(0̂, x) |x|
       = \sum_{p ∈ F_q^n} \sum_{x ∈ L(A): p ∈ x} µ(0̂, x)
       = \sum_{p ∈ F_q^n} \sum_{x ∈ [0̂, y_p]} µ(0̂, x),

where y_p = ∩_{H ∋ p} H. By definition of the Möbius function, the inner sum is 1 if y_p = 0̂ and 0 otherwise. Therefore

χ_A(q) = #{p ∈ F_q^n : y_p = 0̂} = #{p ∈ F_q^n : p ∉ H ∀H ∈ A} = |F_q^n \ A|.
This fact has a much more general application, which was systematically mined by Athanasiadis, e.g., [Ath96].
Definition 5.4.2. Let A ⊆ R^n be an integral hyperplane arrangement (i.e., one whose hyperplanes are defined by equations with integer coefficients). For a prime p, let A_p = A ⊗ F_p be the arrangement in F_p^n defined by regarding the equations in A as equations over F_p. We say that A reduces correctly modulo p if L(A_p) ≅ L(A).
A sufficient condition for correct reduction is that no minor of the matrix M of normal vectors is a nonzero multiple of p (so that rank calculations are the same over F_p as over Z). In particular, it suffices to choose p larger than the absolute value of every minor of M, so that a set of columns of M is linearly independent over F_p iff it is independent over Q. There are infinitely many such primes, implying the following highly useful result:
Theorem 5.4.3 (The finite field method). Let A ⊆ R^n be an integral hyperplane arrangement and q a power of a large enough prime. Then χ_A(q) = |F_q^n \ A_q|; that is, χ_A(q) counts the points in the complement of A_q.
Example 5.4.4. Let G = ([n], E) be a simple graph and let A_G be the corresponding graphic arrangement in R^n. Note that A_G reduces correctly over every finite field F_q (because graphic matroids are regular). A point (x_1, . . . , x_n) ∈ F_q^n can be regarded as the q-coloring of G that assigns color x_i to vertex i. The proper q-colorings are precisely the points of F_q^n \ A_G. The number of such colorings is p_G(q) (the chromatic polynomial of G evaluated at q). On the other hand, by Theorem 5.4.1, it is also the value χ_{A_G}(q) of the characteristic polynomial. Since p_G(q) = χ_{A_G}(q) for infinitely many q (namely, all integer prime powers), the polynomials must be equal. J
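This is easy to verify by brute force for small graphs. The sketch below (my own test case, not from the text) counts the points of F_q^n off the hyperplanes x_i = x_j for the 4-cycle, whose chromatic polynomial k^4 − 4k^3 + 6k^2 − 3k appears later in these notes.

```python
# Counting points of F_q^n off the hyperplanes x_i = x_j (ij an edge) gives
# the number of proper q-colorings, per Theorem 5.4.1; test case G = C_4.
from itertools import product

def complement_count(n, edges, q):
    return sum(1 for x in product(range(q), repeat=n)
               if all(x[i] != x[j] for i, j in edges))

c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
for q in (2, 3, 4, 5):
    assert complement_count(4, c4, q) == q**4 - 4*q**3 + 6*q**2 - 3*q
print("ok")
```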
Example 5.4.5. The Shi arrangement is the arrangement of n(n − 1) hyperplanes in Rn defined by
Shi_n = { x_i = x_j , x_i = x_j + 1 : 1 ≤ i < j ≤ n }.
In other words, take the braid arrangement, clone it, and nudge each of the cloned hyperplanes a little bit in
the direction of the bigger coordinate. The Shi arrangement has rank n − 1 (every hyperplane in it contains a
line parallel to the all-ones vector), so we may project along that line to obtain the essentialization in Rn−1 .
Thus ess(Shi2 ) consists of two points on a line, while ess(Shi3 ) is shown below.
[Figure: ess(Shi_3), an arrangement of six lines labeled x = y, x = y + 1, x = z, x = z + 1, y = z, y = z + 1.]

We will show that

χ_{Shi_n}(q) = q(q − n)^{n−1},  (5.8)

and therefore, by Zaslavsky’s theorems,

r(Shi_n) = (n + 1)^{n−1}  and  b(Shi_n) = (n − 1)^{n−1}.  (5.9)
(The number (n + 1)^{n−1} may look familiar; by Cayley’s formula, it is the number of spanning trees of the complete graph K_{n+1}. It also counts many other things of combinatorial interest, including parking functions.)
The following proof is from [Sta07, §5.2]. By Theorem 5.4.3, it suffices to count the points in F_q^n \ Shi_n for a large enough prime q. Let x = (x_1, . . . , x_n) ∈ F_q^n \ Shi_n. Draw a necklace with q beads labeled by the elements 0, 1, . . . , q − 1 ∈ F_q, and for each k ∈ [n], put a big red k on the x_k-th bead. For example, let n = 6 and q = 11. Then the necklace for x = (2, 5, 6, 10, 3, 7) is as follows:
[Figure: a necklace with 11 beads labeled 0–10 clockwise; the red labels 1, 5, 2, 3, 6, 4 sit on beads 2, 3, 5, 6, 7, 10 respectively.]
The requirement that x avoids the hyperplanes xi = xj implies that the red numbers are all on different
beads. If we read the red numbers clockwise, starting at 1 and putting in a divider sign | for each bead
without a red number, we get
15 | 236 | | 4 |
which can be regarded as the ordered weak partition (or OWP)

Π(x) = (B_1, . . . , B_5) = ({1, 5}, {2, 3, 6}, ∅, {4}, ∅);

that is, a (q − n)-tuple B_1, . . . , B_{q−n}, where the B_i are pairwise disjoint sets (possibly empty; that’s what the “weak” means) whose union is [n], and 1 ∈ B_1. (We’ve omitted the divider corresponding to the bead just counterclockwise of 1; stay tuned.)
Note that each block of Π(x) corresponds to a contiguous set of values among the coordinates of x. For example, the block 236 occurs because the values 5, 6, 7 occur in coordinates x_2, x_3, x_6. In order to avoid the hyperplanes x_i = x_j + 1 for i < j, each contiguous block of beads must have its red numbers in strictly increasing order clockwise. (In particular the bead just counterclockwise of 1 must be unlabeled, which is why we could omit that divider.)
To get a necklace from an OWP, write out each block in increasing order, with bars between successive
blocks.
Meanwhile, an OWP is given by a function f : [n] → [q − n], where f(i) is the index of the block containing i (so f(1) = 1). There are (q − n)^{n−1} such functions. Since there are q choices for the bead containing the red 1, we obtain

|F_q^n \ Shi_n| = q(q − n)^{n−1} = χ_{Shi_n}(q).
This proves (5.8), and (5.9) follows from Zaslavsky’s theorems. J
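The count q(q − n)^{n−1} is easy to confirm by brute force for small n and q; here is a sketch (the test values are mine, not from the text), reading the defining equations of Shi_n modulo q.

```python
# Brute-force check of |F_q^n \ Shi_n| = q (q - n)^(n-1), avoiding the
# hyperplanes x_i = x_j and x_i = x_j + 1 (for i < j) over F_q.
from itertools import combinations, product

def shi_complement_count(n, q):
    return sum(1 for x in product(range(q), repeat=n)
               if all(x[i] != x[j] and x[i] != (x[j] + 1) % q
                      for i, j in combinations(range(n), 2)))

for n, q in [(2, 5), (2, 7), (3, 7), (3, 11)]:
    assert shi_complement_count(n, q) == q * (q - n) ** (n - 1)
print("ok")
```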
5.5 Supersolvable lattices and arrangements

We have seen that for a simple graph G = ([n], E), the chromatic polynomial p_G(k) is precisely the characteristic polynomial of the graphic arrangement A_G. For some graphs, the chromatic polynomial factors into linear terms over Z. For example, if G = K_n, then p_G(k) = k(k − 1)(k − 2) · · · (k − n + 1), and if G is a forest with n vertices and c components, then p_G(k) = k^c (k − 1)^{n−c}. This property does not hold for all graphs. For example, it is easy to work out that the chromatic polynomial of C_4 (the cycle with four vertices and four edges) is k^4 − 4k^3 + 6k^2 − 3k = k(k − 1)(k^2 − 3k + 3), which does not factor further over Z. Is there a structural condition on a graph or a central arrangement (or really, on a geometric lattice) that will guarantee that its characteristic polynomial factors completely? It turns out that supersolvable geometric lattices have this good property.
Definition 5.5.1. Let L be a ranked lattice. An element x ∈ L is a modular element if r(x) + r(y) =
r(x ∨ y) + r(x ∧ y) for every y ∈ L.
For example:
• By Theorem 1.5.6, a ranked lattice L is modular iff all elements are modular.
• The elements 0̂ and 1̂ are clearly modular in any lattice.
• If L is geometric, then every atom x is modular. Indeed, for y ∈ L, if y ≥ x, then y = x ∨ y and x = x ∧ y, while if y ≱ x then y ∧ x = 0̂ and y ∨ x ⋗ y; either way r(x) + r(y) = r(x ∨ y) + r(x ∧ y).
• The coatoms of a geometric lattice need not be modular. For example, let L = Πn , and recall that
Πn has rank function r(π) = n − |π|. Let x = 12|34, y = 13|24 ∈ Π4 . Then r(x) = r(y) = 2, but
r(x ∨ y) = r(1̂) = 3 and r(x ∧ y) = r(0̂) = 0. So x is not a modular element.
Proposition 5.5.2. The modular elements of Πn are exactly the partitions with at most one nonsingleton block.

Proof. First, suppose π ∈ Πn has at most one nonsingleton block B. Let σ ∈ Πn be arbitrary, and define

X = {C ∈ σ : C ∩ B ≠ ∅},   Y = {C ∈ σ : C ∩ B = ∅}.

Then

π ∧ σ = {C ∩ B : C ∈ X} ∪ { {i} : i ∉ B },   π ∨ σ = { B ∪ \bigcup_{C∈X} C } ∪ Y,

so

|π ∧ σ| + |π ∨ σ| = (|X| + n − |B|) + (1 + |Y|) = (n − |B| + 1) + (|X| + |Y|) = |π| + |σ|,

and since r(π) = n − |π|, it follows that r(π) + r(σ) = r(π ∧ σ) + r(π ∨ σ); that is, π is modular.

For the converse, suppose B, C are nonsingleton blocks of π, with i, j ∈ B and k, ℓ ∈ C. Let σ be the partition with exactly two nonsingleton blocks {i, k}, {j, ℓ}. Then r(σ) = 2 and r(π ∧ σ) = r(0̂) = 0, but r(π ∨ σ) = r(π) + 1, since joining with σ merely merges the blocks B and C. Hence r(π) + r(σ) = r(π) + 2 ≠ r(π) + 1 = r(π ∨ σ) + r(π ∧ σ), so π is not modular.
Modular elements are useful because they lead to factorizations of the characteristic polynomial of L.
Theorem 5.5.3. Let L be a geometric lattice of rank n, and let z ∈ L be a modular element. Then

χ_L(k) = χ_{[0̂,z]}(k) · \sum_{y: y∧z=0̂} µ_L(0̂, y) k^{n − r(z) − r(y)}.  (5.10)
Here is a sketch of the proof; for the full details, see [Sta07, pp. 440–441]. We work in the dual Möbius algebra A*(L) = A(L*); that is, the vector space of C-linear combinations of elements of L, with multiplication given by join (rather than meet as in §2.4). Thus the “algebraic” basis of A*(L) is

{ σ_y := \sum_{x: x ≥ y} µ(y, x) x  :  y ∈ L }.

First, one shows that

σ_{0̂} = \Big( \sum_{v: v ≤ z} µ(0̂, v) v \Big) \Big( \sum_{y: y∧z=0̂} µ(0̂, y) y \Big)  (5.11)

for any z ∈ L. Second, for z, y, v ∈ L such that z is modular, v ≤ z, and y ∧ z = 0̂, one shows first that z ∧ (v ∨ y) = v (by rank considerations) and then that rank(v ∨ y) = rank(v) + rank(y). Third, make the substitutions v ↦ k^{rank z − rank v} and y ↦ k^{n − rank y − rank z} in the two sums on the RHS of (5.11). Since vy = v ∨ y, the last observation implies that substituting x ↦ k^{n − rank x} on the LHS preserves the product, and the equation becomes (5.10).
For example, every atom a ∈ L is modular, and applying (5.10) with z = a gives

χ_L(k) = (k − 1) \sum_{y: y∧a=0̂} µ_L(0̂, y) k^{n−1−r(y)}.

This does not really tell us anything new, because we already knew that k − 1 had to be a factor of χ_L(k), since χ_L(1) = \sum_{x∈L} µ_L(0̂, x) = 0. Also, the sum in the expression is not itself the characteristic polynomial of a lattice.
On the other hand, if we have a modular coatom, then Theorem 5.5.3 is much more useful, since we can identify an interesting linear factor and describe what is left after factoring it out.

Corollary 5.5.4. Let L be a geometric lattice, and let z ∈ L be a coatom that is a modular element. Then

χ_L(k) = (k − e) χ_{[0̂,z]}(k),

where e is the number of atoms a ∈ L such that a ≰ z.
If we are extremely lucky, then L will have a saturated chain of modular elements

0̂ = x_0 ⋖ x_1 ⋖ · · · ⋖ x_{n−1} ⋖ x_n = 1̂.

In this case, we can apply Corollary 5.5.4 successively with z = x_{n−1}, z = x_{n−2}, . . . , z = x_1 to split the characteristic polynomial completely into linear factors:

χ_L(k) = (k − e_{n−1}) χ_{[0̂, x_{n−1}]}(k) = (k − e_{n−1})(k − e_{n−2}) χ_{[0̂, x_{n−2}]}(k) = · · · = (k − e_{n−1})(k − e_{n−2}) · · · (k − e_0),

where e_i is the number of atoms of [0̂, x_{i+1}] that do not lie below x_i.
Definition 5.5.5. A geometric lattice L is supersolvable if it has a modular chain, that is, a maximal chain 0̂ = x_0 ⋖ x_1 ⋖ · · · ⋖ x_n = 1̂ such that every x_i is a modular element. A central hyperplane arrangement A is called supersolvable if L(A) is supersolvable.
Example 5.5.6. Every modular lattice is supersolvable, because every maximal chain is modular. In
particular, the characteristic polynomial of every modular lattice splits into linear factors. J
Example 5.5.7. The partition lattice Π_n (and therefore the associated hyperplane arrangement Br_n) is supersolvable by induction. Let z be the coatom with blocks [n − 1] and {n}, which is a modular element by Proposition 5.5.2. There are n − 1 atoms a ≰ z, namely the partitions whose nonsingleton block is {i, n} for some i ∈ [n − 1], so we obtain

χ_{Π_n}(k) = (k − n + 1) χ_{Π_{n−1}}(k)

and by induction

χ_{Π_n}(k) = (k − 1)(k − 2) · · · (k − n + 1).

J
Example 5.5.8. Let G = C_4 (a cycle with four vertices and four edges), and let A = A_G. Then L(A) is the lattice of flats of the matroid U_3(4); i.e.,

L = {F ⊆ [4] : |F| ≠ 3}

with r(F) = min(|F|, 3). This lattice is not supersolvable, because no element of rank 2 is modular. For example, let x = 12 and y = 34; then r(x) = r(y) = 2 but r(x ∨ y) = 3 and r(x ∧ y) = 0. (We have already seen that the characteristic polynomial of L does not split.) J
Theorem 5.5.9. Let G = (V, E) be a simple graph. Then AG is supersolvable if and only if the vertices of
G can be ordered v1 , . . . , vn such that for every i > 1, the set
Ci := {vj : j ≤ i, vi vj ∈ E}
forms a clique in G.
Such an ordering is called a perfect elimination ordering. The proof of Theorem 5.5.9 is left as an exercise
(see Stanley, pp. 55–57). An equivalent condition is that G is a chordal graph: if C ⊆ G is a cycle of
length ≥ 4, then some pair of vertices that are not adjacent in C are in fact adjacent in G. This equivalence is
sometimes known as Dirac’s theorem. It is fairly easy to prove that supersolvable graphs are chordal, but the
converse is somewhat harder; see, e.g., [Wes96, pp. 224–226]. There are other graph-theoretic formulations of
this property; see, e.g., [Dir61]. See the recent paper [HS15] for much more about factoring the characteristic
polynomial of lattices in general.
If G satisfies the condition of Theorem 5.5.9, then we can see directly why its chromatic polynomial χ(G; k)
splits into linear factors. Consider what happens when we color the vertices in order. When we color vertex
vi , it has |Ci | neighbors that have already been colored, and they all have received different colors because
they form a clique. Therefore, there are k − |Ci | possible colors available for vi , and we see that
χ(G; k) = \prod_{i=1}^{n} (k − |C_i|).
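This factorization is easy to test computationally. The sketch below is mine, not from the text: the example graph (the “diamond” K_4 minus an edge) and its perfect elimination ordering are my own choices, compared against a brute-force count of proper colorings.

```python
# Perfect elimination ordering => chi(G;k) = prod_i (k - |C_i|);
# test case: the diamond K_4 minus an edge, with PEO 1, 2, 3, 4.
from itertools import product
from math import prod

edges = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]}
order = [1, 2, 3, 4]
c_sizes = [sum(1 for u in order[:i] if frozenset((u, v)) in edges)
           for i, v in enumerate(order)]        # |C_i| = 0, 1, 2, 2

def chromatic(k):  # brute-force count of proper k-colorings
    return sum(all(col[u - 1] != col[v - 1] for u, v in map(tuple, edges))
               for col in product(range(k), repeat=4))

for k in range(1, 7):
    assert chromatic(k) == prod(k - s for s in c_sizes)
print("ok")  # chi(G;k) = k(k-1)(k-2)^2
```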
5.6 A brief taste of arrangements over C

Let A be a hyperplane arrangement in C^n, and let X = C^n \ ⋃_{H∈A} H be its complement.

Theorem 5.6.1 (Brieskorn [Bri73]). The homology groups H_i(X, Z) are free abelian, and the Poincaré polynomial of X is the characteristic polynomial backwards:

\sum_{i=0}^{n} rank_Z H_i(X, Z) q^i = (−q)^n χ_{L(A)}(−1/q).
In a very famous paper, Orlik and Solomon [OS80] strengthened Brieskorn’s result by giving a presentation
of the cohomology ring H ∗ (X, Z) in terms of L(A), thereby proving that the cohomology is a combinatorial
invariant of A. (Brieskorn’s theorem says only that the additive structure of H ∗ (X, Z) is a combinatorial
invariant.) By the way, the homotopy type of X is not a combinatorial invariant; Rybnikov [Ryb11] con-
structed arrangements with isomorphic lattices of flats but different fundamental groups. There is much
more to say on this topic!
5.7 Faces and the big face lattice

Consider the two arrangements A1, A2 ⊂ R^2 shown in Figure 5.4. Their intersection posets are isomorphic
(the labelings in the figure give an isomorphism). Therefore, by Zaslavsky’s theorems they have the same
numbers of regions and bounded regions (this can of course be checked directly). However, there is good
reason not to consider the two arrangements isomorphic. For example, both bounded regions in A1 are
triangles, while A2 has a triangle and a trapezoid. Also, the point H1 ∩ H2 ∩ H4 lies between the lines H3
and H5 in A1 , while it lies below both of them in A2 . The intersection poset lacks the power to model
geometric data like “between,” “below,” “triangle” and “trapezoid.” Accordingly, we need to define a
stronger combinatorial invariant.
[Figure 5.4: the arrangements A1 and A2, each consisting of five lines H1, . . . , H5 with intersection points p, q, r, s, t; the labelings give an isomorphism of intersection posets.]
First we fix notation. Let A = {H_1, . . . , H_n} be an essential hyperplane arrangement in R^d, with normal vectors n_1, . . . , n_n. For each i, let λ_i be an affine linear functional on R^d such that H_i = {x ∈ R^d : λ_i(x) = 0}. (If ∩A = {0}, then we may define λ_i(x) = n_i · x.)
The intersections of hyperplanes in A, together with its regions, decompose Rd as a polyhedral cell
complex: a disjoint union of polyhedra, each homeomorphic to Re for some e ≤ d (that’s what “cell” means),
such that the boundary of any cell is a union of other cells. We can encode each cell by recording whether
the linear functionals λ_1, . . . , λ_n are positive, negative or zero on it. Specifically, for k = (k_1, . . . , k_n) ∈ {+, −, 0}^n, define a (possibly empty) subset of R^d by

F = F(k) = { x ∈ R^d :  λ_i(x) > 0 if k_i = +  (⇔ i ∈ k_+),
                        λ_i(x) < 0 if k_i = −  (⇔ i ∈ k_−),
                        λ_i(x) = 0 if k_i = 0  (⇔ i ∈ k_0) }.
This formula can be taken as the definition of k+ , k− , and k0 . A convenient shorthand (“digital notation”)
is to represent k by the list of digits i for which ki 6= 0, placing a bar over the digits for which ki < 0. For
instance, k = 0 + −00 − +0 would be abbreviated 23̄6̄7; here k+ = {2, 7} and k− = {3, 6}.
If F ≠ ∅ then it is called a face of A, and k = k(F) is the corresponding covector. The set of all faces is denoted F(A). The poset F̂(A) = F(A) ∪ {0̂, 1̂}, ordered by containment of closures (F ≤ F′ if F̄ ⊆ F̄′), is a lattice, called the (big) face lattice1 of A. If A is central, then F(A) already has a unique minimal element and we don’t add an extra one. For example, the big face lattice of Bool2 is shown in Figure 5.5.
Figure 5.5: The Boolean arrangement Bool2 and its big face lattice.
Combinatorially, the order relation in F (A) is given by k ≤ l if k+ ⊆ l+ and k− ⊆ l− . (This is very easy to
read off using digital notation.) The maximal covectors (or topes) are precisely those with no zeroes; they
correspond to the regions of A.
The big face lattice captures more of the geometry of A than the intersection poset; for instance, the two
arrangements A1 , A2 shown above have isomorphic intersection posets but non-isomorphic face lattices.
(This may be clear to you now; there are lots of possible explanations and we’ll see one soon.)
5.8 Faces of the braid arrangement

The faces of the braid arrangement Brn (see Example 5.1.4) have an explicit combinatorial description in
terms of set compositions. If F is a face, then F lies either below, above, or on each hyperplane Hij — i.e.,
either xi < xj , xi = xj , or xi > xj holds on F — and this data describes F exactly. In fact, we can record F
by a set composition of [n], i.e., an ordered list A of nonempty sets A_1 | . . . | A_k whose disjoint union is [n]. (We write A |= [n] for short.) For example, the set composition

569 | 3 | 14 | 28 | 7

corresponds to the face of Br_9 on which x_5 = x_6 = x_9 < x_3 < x_1 = x_4 < x_2 = x_8 < x_7.
1 That is, the big lattice of faces, not the lattice of big faces.
Figure 5.6: Br3 and its big face lattice (the lattice of set compositions).
In the extreme case that the face is a chamber, the set composition has only singleton parts, hence is equivalent to a permutation (this confirms what we already know, that Brn has n! regions).
The correspondence between faces of Brn and set compositions A |= [n] is a bijection. In fact, the big face
lattice of Brn is isomorphic to the lattice of set compositions ordered by refinement; see Figure 5.6.
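In particular, counting set compositions counts the faces of Br_n. A quick sketch (mine, not from the text) computes these ordered Bell (Fubini) numbers — e.g., 13 faces for Br_3: six chambers, six half-lines, and the center of the essentialization.

```python
# Faces of Br_n <-> set compositions of [n]: the ordered Bell numbers,
# computed by conditioning on the first block.
from math import comb

def fubini(n):
    return 1 if n == 0 else sum(comb(n, k) * fubini(n - k)
                                for k in range(1, n + 1))

print([fubini(n) for n in range(1, 5)])  # [1, 3, 13, 75]
```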
Consider a system of linear equalities and inequalities of the form xi = xj and xi < xj . If such a system
is consistent, it gives rise to a nonempty polyhedron that is a convex union of faces of Br9 . Such a system
can be described by a preposet, which is a relation ≤ on [n] that is reflexive and transitive, but not necessarily antisymmetric (compare Defn. 1.1.1). In other words, x ≤ y and y ≤ x together do not imply x = y. This relation
has a Hasse diagram, just like a poset, except that multiple elements of the ground set can be put in the
same “box” (whenever there is a failure of antisymmetry). For example, the system
x1 = x5 , x4 < x6 , x5 < x7 , x6 = x8 , x2 = x6 , x9 < x8 , x2 < x7 .
corresponds to the preposet whose Hasse diagram has boxes 15 and 268, with 4 and 9 below the box 268, with both 15 and 268 below 7, and with 3 unrelated to everything else,
and this gives rise to a 6-dimensional convex polyhedron P consisting of faces of Br9 . (Each box in the Hasse
diagram represents a coordinate that can vary (locally) freely, which is why the dimension is 6.) The maximal
faces in P correspond to the linear extensions of the preposet, expressed as set compositions: 15|3|4|9|268|7,
4|9|268|3|15|7, etc.
5.9 Oriented matroids

Oriented matroids are a vast topic; these notes just scratch the surface. The canonical resource is the
book [BLVS+ 99]; an excellent free source is Reiner’s lecture notes [Rei] and another good brief reference
is [RGZ97].
5.9.1 Oriented matroid covectors from hyperplane arrangements

Consider the linear forms λi that were used in representing each face by a covector. Recall that specifying λi
is equivalent to specifying a normal vector ni to the hyperplane Hi (with λi (x) = ni · x). As we know,
the vectors ni represent a matroid whose lattice of flats is precisely L(A). Scaling ni (equivalently, λi ) by
a nonzero constant c ∈ R has no effect on the matroid represented by the ni ’s, but what does it do to
the covectors? If c > 0, then nothing happens, but if c < 0, then we have to switch + and − signs in
the ith position of every covector. So, in order to figure out the covectors, we need not just the normal
vectors ni , but an orientation for each one — hence the term “oriented matroid”. Equivalently, for each
hyperplane Hi , we are designating one of the two corresponding halfspaces (i.e., connected components of
Rd \ Hi ) as positive and the other as negative.
See Figure 5.7 for examples. (The normal vectors all have positive z-coordinate, so “above” means “above.”)
For instance, the trapezoidal bounded region in A2 has covector + + + + − because it lies above hyperplanes
H1 , H2 , H3 , H4 but below H5 . Its top side has covector + + + + 0, its bottom + + 0 + −, etc.
[Figure 5.7: the arrangements A1 and A2 with every face labeled by its covector. For example, the unbounded regions of A1 along the top have covectors −++++, +++++, +−+++.]
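Computing a covector is nothing more than recording signs. Here is a tiny sketch (mine, with made-up functionals, not the lines of Figure 5.7):

```python
# The covector of a point x records the sign of each affine functional.
def covector(lambdas, x):
    sign = lambda t: '+' if t > 0 else '-' if t < 0 else '0'
    return ''.join(sign(f(x)) for f in lambdas)

bool2 = [lambda p: p[0], lambda p: p[1]]  # Bool_2: lambda_1 = x, lambda_2 = y
print(covector(bool2, (3, -2)))           # '+-'
print(covector(bool2, (3, 0)))            # '+0', i.e. "1" in digital notation
```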
Proposition 5.9.1. Suppose that no two hyperplanes in A are parallel. Then the maximal covectors whose negatives are also covectors are precisely those that correspond to unbounded regions. In particular, A is central if and only if the negative of every covector is a covector.
Proof. Suppose R is an unbounded region, with k the corresponding covector (so k_0 = ∅). Fix a point x ∈ R and choose a direction v in which R is unbounded. By perturbing v slightly, we can assume that v is not orthogonal to any of the normal vectors n_i. (This perturbation step is where we use the assumption that no two hyperplanes are parallel.) In other words, if we walk from x in the direction of v, then the value of each λ_i increases without bound or decreases without bound, according as i belongs to k_+ or k_−. If instead we walk in the direction of −v, then “increase” and “decrease” are reversed. Therefore, walking sufficiently far in that direction arrives in an (unbounded) region with covector −k.
Conversely, suppose that k and −k are covectors of regions R and S. Pick points x ∈ R and y ∈ S and
consider the line ` joining x and y. The functionals λi are identically zero on ` for i ∈ k0 = (−k)0 , but
otherwise increase or decrease (necessarily without bound). Therefore the ray pointing from x away from y
(resp., from y away from x) is contained in R (resp., S). It follows that both R and S are unbounded.
The second assertion now follows from Corollary 5.3.10. WHICH IS FALSE
(It would be nice to modify the statement to handle the case that A has parallel hyperplanes. Here the conclusion fails: for example, in A1 or A2 above, every ray in the region with covector + − − − + is horizontal, hence orthogonal to the normals to H3, H4, H5, so the functionals λ3, λ4, λ5 are constant on that region — hence never change sign upon walking in the other direction; the “opposite” unbounded region has covector − + − − + rather than the negative − + + + −. It is still true that any pair of opposite covectors corresponds to a pair of opposite unbounded regions, but I think this condition holds only for unbounded regions that contain more than one direction’s worth of rays.)
Just like circuits, bases, etc., of a matroid, oriented matroid covectors can be axiomatized purely combinatorially. First some preliminaries. For k, l ∈ {+, 0, −}^n, define the composition k ∘ l by

(k ∘ l)_i = k_i if k_i ≠ 0,  and  (k ∘ l)_i = l_i if k_i = 0.
The axioms are as follows [RGZ97, §7.2.1], where S(k, l) = {i ∈ [n] : k_i = −l_i ≠ 0} denotes the separation set of k and l: a collection K ⊆ {+, −, 0}^n is a covector system if for all k, l ∈ K:

(K1) \vec{0} = (0, 0, . . . , 0) ∈ K;
(K2) −k ∈ K;
(K3) k ∘ l ∈ K;
(K4) If i ∈ S(k, l) then there exists m ∈ K with (a) m_i = 0 and (b) m_j = (k ∘ l)_j for j ∈ [n] \ S(k, l).
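Composition is easy to implement. A minimal sketch (the encoding of signs as +1, −1, 0 is my own convention, not from the text):

```python
# Covector composition k o l: keep k_i where k is nonzero, fall back to l_i.
def compose(k, l):
    return tuple(ki if ki != 0 else li for ki, li in zip(k, l))

print(compose((1, 0, -1, 0), (0, -1, 1, 1)))  # (1, -1, -1, 1)
```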
Note that (K1) and (K2) are really properties of central hyperplane arrangements. However, any non-
central arrangement A can be turned into a central one by coning (see Definition 5.1.8), and if K (A) is the
set of covectors of A then
K (cA) = {(k, +) : k ∈ K (A)} ∪ {(−k, −) : k ∈ K (A)} ∪ {~0}
and by the way, K (A) = {k : (k, +) ∈ K (cA)}.
5.9.2 Oriented matroid circuits
The cones over the arrangements A1 and A2 (not including the new hyperplane introduced in coning) are central, essential arrangements in R^3, whose matroids of normals can be represented respectively by matrices X1 and X2. The matroids represented by X1 and X2 are isomorphic, with circuit system {124, 345, 1235}. However, they are not isomorphic as oriented matroids: the minimal linear dependencies realizing the circuits carry different sign patterns in the two cases.
An oriented circuit keeps track not just of a minimal linear dependence, but of how to orient the vectors in the circuit so that all the signs are positive. Thus 124̄ is an oriented circuit in both cases. However, in the first case 34̄5 is a circuit, while in the second it is 34̄5̄. Note that if c is a circuit then so is −c, where, e.g., −124̄ = 1̄2̄4. In summary, the oriented circuit systems C~1 for cA1 and C~2 for cA2 agree except for the signs attached to hyperplane 5.
Oriented circuits are minimal obstructions to covector-ness. For example, 124̄ is a circuit of A1 because the
linear functionals defining its hyperplanes satisfy λ1 + λ2 − 2λ4 = 0. But if a covector of A1 contains 124̄,
then any point in the corresponding face of A would have λ1 , λ2 , −λ4 all positive, which is impossible.
Definition 5.9.2. A collection C~ ⊆ {+, −, 0}^n of sign vectors is an oriented circuit system if the following axioms hold for all c, c′ ∈ C~:

(OC1) \vec{0} ∉ C~.
(OC2) −c ∈ C~.
(OC3) If c ≠ c′, then either c_+ ⊄ c′_+ or c_− ⊄ c′_−.
(OC4) If c_i = + and c′_i = −, then there exists d ∈ C~ such that (a) d_i = 0, and (b) d_+ ⊆ (c_+ ∪ c′_+) \ {i} and d_− ⊆ (c_− ∪ c′_−) \ {i}.
Again, the idea is to record not just the linearly dependent subsets of a set {λ_1, . . . , λ_n} of linear forms, but also the sign patterns of the corresponding linear dependences (“syzygies”). The first two axioms are elementary: (OC1) says that the empty set is linearly independent and (OC2) says that multiplying any syzygy by −1 gives a syzygy. Condition (OC3) must hold if we want circuits to record signed syzygies with minimal support, as for circuits in an unoriented matroid.
(OC4) is the oriented version of circuit exchange. Suppose that we have two syzygies

\sum_{j=1}^{n} γ_j λ_j = 0  and  \sum_{j=1}^{n} γ′_j λ_j = 0

with γ_i > 0 and γ′_i < 0 for some i. Multiplying by positive scalars if necessary (hence not changing the sign patterns), we may assume that γ_i = −γ′_i. Then adding the two syzygies gives

\sum_{j=1}^{n} δ_j λ_j = 0,

where δ_j = γ_j + γ′_j. In particular, δ_i = 0, and δ_j can be positive (resp., negative) only if at least one of γ_j, γ′_j is positive (resp., negative).
Remark 5.9.3. If C~ is an oriented circuit system, then C = {c+ ∪ c− : c ∈ C~} is a circuit system for an
ordinary matroid with ground set [n]. (I.e., just erase all the bars.) This is called the underlying matroid
of the oriented matroid with circuit system C~.
As in the unoriented setting, the circuits of an oriented matroid represent minimal obstructions to being a
covector. That is, every real hyperplane arrangement A gives rise to an oriented circuit system C~ such that
if k is a covector of A and c is a circuit, then it is not the case that k+ ⊇ c+ and k− ⊇ c− .
More generally, one can construct an oriented matroid from any real pseudosphere arrangement, i.e., a collection of homotopy (d − 1)-spheres embedded in R^n such that the intersection of the closures of the spheres in any subcollection is either connected or empty.

[Figure: an arrangement of pseudocircles in the plane.]
Again this arrangement gives rise to a cellular decomposition of Rn , and each cell corresponds to a covector
which describes whether the cell is inside, outside, or on each pseudocircle.
In fact, the Topological Representation Theorem of Folkman and Lawrence (1978) says that every combi-
natorial oriented matroid can be represented by such a pseudosphere arrangement. However, there exist
oriented matroids that cannot be represented as hyperplane arrangements. For example, recall the construc-
tion of the non-Pappus matroid (Example 3.5.7). If we bend the line xyz a little so that it meets x and y
but not z (and no other points), the result is a pseudoline arrangement whose oriented matroid M cannot
be represented by means of a line arrangement.
5.9.3 Oriented matroids from graphs
Recall (§3.3) that every graph G = (V, E) gives rise to a graphic matroid M (G) with ground set E. Corre-
spondingly, every directed graph G ~ gives rise to an oriented matroid, whose circuit system C~ is the family
of oriented cycles. This is best shown by an example.
[Figure: a directed graph G~ with edges labeled 1–5, together with the list of its oriented circuits.]
For example, 135̄ is a circuit because the clockwise orientation of the northwest triangle in G includes edges 1 and 3 forward, and edge 5 backward. In fact, this circuit system is identical to the circuit system C~1 seen previously. More generally, for every oriented graph G~, the signed set system C~ formed in this way satisfies the axioms of Definition 5.9.2. To understand axiom (OC4) of that definition, suppose e is an edge that occurs forward in c and backward in c′. Then c − e and c′ − e are paths between the two endpoints of e, with opposite starting and ending points, so when concatenated, they form a closed walk in G~, which must contain an oriented cycle.
Reversing the orientation of edge e corresponds to interchanging e and ē in the circuit system; this is called a reorientation. For example, reversing edge 5 produces the previously seen oriented circuit system C~2.
An oriented matroid is called acyclic if every circuit has at least one barred and at least one unbarred element; this is equivalent to G~ having no directed cycles (i.e., being an acyclic orientation of its underlying graph G). In fact, for any ordinary unoriented matroid M, one can define an orientation of M as an oriented matroid whose underlying matroid is M; the number of acyclic orientations is T_M(2, 0) [Rei, §3.1.6, p. 29], just as for graphs.
The covectors of the circuit system for a directed graph are in fact the faces of (the essentialization of) the graphic arrangement associated to G~, in which the orientation of each edge determines the orientation of the corresponding normal vector: if i → j is an edge in G~, then the hyperplane x_i = x_j is assigned the normal vector e_i − e_j. The maximal covectors are precisely the regions of the graphic arrangement.
5.10 Exercises
Exercise 5.1. Let m > n, and let A be the arrangement of m affine hyperplanes in general position in Rn .
Here “general position” means that every k of the hyperplanes intersect in an affine linear space of dimension
n − k; if k > n then the intersection is empty. (Compare Example 5.3.9, where the hyperplanes are linear.)
Calculate χA (k), r(A), and b(A).
Exercise 5.2. (Stanley, HA, 2.5) Let G be a graph on n vertices, let AG be its graphic arrangement in Rn ,
and let B_G = Bool_n ∪ A_G. (That is, B_G consists of the coordinate hyperplanes x_i = 0 in R^n together with the hyperplanes x_i = x_j for all edges ij of G.) Calculate χ_{B_G}(q) in terms of χ_{A_G}(q).
Exercise 5.3. (Stanley, EC2, 3.115) Determine the characteristic polynomial and the number of regions of the type B braid arrangement and the type D braid arrangement B_n, D_n ⊂ R^n, which are defined by

B_n = { x_i = x_j, x_i = −x_j : 1 ≤ i < j ≤ n } ∪ { x_i = 0 : 1 ≤ i ≤ n },
D_n = { x_i = x_j, x_i = −x_j : 1 ≤ i < j ≤ n }.
Exercise 5.5. For a set W of permutations of [n], let F(W) denote the union of the closures of the regions of Br_n indexed by the permutations in W. Prove that F(W) is a convex set if and only if W is the set of linear extensions of some poset P on [n]. (A linear extension of P is a total ordering ≺ consistent with the ordering of P, i.e., if x <_P y then x ≺ y.)
Exercise 5.6. The runners in a sprint are seeded 1, . . . , n (stronger runners are assigned higher numbers).
To even the playing field, the rules specify that you earn one point for each higher-ranked opponent you
beat, and one point for each lower-ranked opponent you beat by at least one second. (If a higher-ranked
runner beats a lower-ranked runner by less than 1 second, no one gets the point for that matchup.) Let s_i be the number of points scored by the ith runner and let s = (s_1, . . . , s_n) be the score vector.
(a) Show that the possible score vectors are in bijection with the regions of the Shi arrangement.
(b) Work out all possible score vectors in the cases of 2 and 3 players. Conjecture a necessary and sufficient
condition for (s1 , . . . , sn ) to be a possible score vector for n players. Prove it if you can.
Exercise 5.7. Prove Theorem 5.5.9.
Chapter 6
Simplicial Complexes
The canonical references for this material are [Sta96], [BH93, Ch. 5]. See also [MS05] (for the combinatorics
and algebra) and [Hat02] (for the topology).
An (abstract) simplicial complex on a finite vertex set V is a family ∆ of subsets of V, called faces, that is closed under inclusion: if σ ∈ ∆ and τ ⊆ σ, then τ ∈ ∆. The maximal faces are called facets, and ⟨F_1, . . . , F_m⟩ denotes the smallest simplicial complex containing the faces F_1, . . . , F_m.

The dimension of a face σ is dim σ = |σ| − 1. A face of dimension k is a k-face or k-simplex. The dimension of a non-void simplicial complex ∆ is dim ∆ = max{dim σ : σ ∈ ∆}. (Sometimes we write ∆^{d−1} to indicate that dim ∆ = d − 1; this is a common convention since then d is the maximum number of vertices in a face.) A complex is pure if all its facets have the same dimension.
The simplest simplicial complexes are the void complex ∆ = ∅ (which is often excluded from consideration)
and the irrelevant complex ∆ = {∅}. In some contexts, there is the additional requirement that every
singleton subset of V is a face (since if v ∈ V and {v} 6∈ ∆, then v 6∈ σ for all σ ∈ ∆, so you might as well
replace V with V \ {v}). A simplicial complex with a single facet is also called a simplex.
The set of facets of a complex is the unique minimal set of generators for it.
Simplicial complexes are combinatorial models for compact topological spaces. The vertices V = [n] can be regarded as the points e_1, . . . , e_n ∈ R^n, and a simplex σ = {v_1, . . . , v_r} is then the convex hull of the corresponding points. For example, faces of sizes 1, 2, and 3 correspond respectively to vertices, line segments, and triangles. (This explains why dim σ = |σ| − 1.) Taking {e_i} to be the standard basis of R^n gives the standard geometric realization |∆| of ∆:

|∆| = \bigcup_{σ ∈ ∆} conv{e_i : i ∈ σ}.
It is usually possible to realize ∆ geometrically in a space of much smaller dimension. For example, every
graph can be realized in R3 , and planar graphs can be realized in R2 . It is common to draw geometric
pictures of simplicial complexes, just as we draw pictures of graphs. We sometimes use the notation |∆| to
denote any old geometric realization (i.e., any topological space homeomorphic to the standard geometric
realization). Typically, it is easiest to ignore the distinction between ∆ and |∆|; if we want to be specific we
will use terminology like “geometric realization of ∆” or “face poset of ∆”. A triangulation of a topological
space X is a simplicial complex whose geometric realization is homeomorphic to X.
Here are geometric realizations of the simplicial complexes ∆1 = ⟨124, 23, 24, 34⟩ and ∆2 = ⟨12, 14, 23, 24, 34⟩.

[Figure: |∆1| and |∆2| on vertices 1, 2, 3, 4; in ∆1 the triangle 124 is filled in.]

The filled-in triangle indicates that 124 is a face of ∆1, but not of ∆2. Note that ∆2 is the subcomplex of ∆1 consisting of all faces of dimension ≤ 1 — that is, it is the 1-skeleton of ∆1.
For instance, if ∆1, ∆2 are the simplicial complexes pictured above, then their f-vectors (where f_k denotes the number of k-dimensional faces, with f_{−1} = 1 for the empty face) are f(∆1) = (1, 4, 5, 1) and f(∆2) = (1, 4, 5).
Example 6.1.3. Let P be a finite poset and let ∆(P ) be the set of chains in P . Every subset of a chain is a
chain, so ∆(P ) is a simplicial complex, called the order complex of P . The minimal nonfaces of ∆(P ) are
precisely the pairs of incomparable elements of P ; in particular every minimal nonface has size two, which
is to say that ∆(P ) is a flag complex. Note that ∆(P ) is pure if and only if P is ranked.
If P = P(∆) is the poset of nonempty faces of a simplicial complex ∆, then Sd(∆) := ∆(P(∆)) is the barycentric subdivision of ∆. Combinatorially, the vertices of Sd(∆) correspond to the faces of ∆; a collection of vertices of Sd(∆) forms a face if the corresponding faces of ∆ form a chain in the face poset. Topologically, Sd(∆) can be constructed by drawing a vertex in the middle of each face of ∆ and connecting them — this is best illustrated by a picture.
illustrated by a picture.
[Figure: a 2-dimensional complex ∆ and its barycentric subdivision Sd(∆).]
Each vertex (black, red, blue) of Sd(∆) corresponds to a (vertex, edge, triangle) face of ∆. Note that
barycentric subdivision does not change the topological space itself, only the triangulation of it. J
Simplicial complexes are models of topological spaces, and combinatorialists use tools from algebraic topology
to study them, in particular the machinery of simplicial homology. Here we give a “user’s guide” to the subject
that assumes as little topology background as possible. Readers familiar with the subject will know that I
am leaving many things out. For a full theoretical treatment, I recommend Chapter 2 of Hatcher [Hat02].
Let ∆ be a simplicial complex on vertex set [n]. The kth simplicial chain group of ∆ over a field1 , say R,
is the vector space Ck (∆) of formal linear combinations of k-simplices in ∆. Thus dim Ck (∆) = fk (∆).
The elements of Ck (∆) are called k-chains. The (simplicial) boundary map ∂k : Ck (∆) → Ck−1 (∆) is
defined as follows: if σ = {v0 , . . . , vk } is a k-face, with 1 ≤ v0 < · · · < vk ≤ n, then
∂_k[σ] = \sum_{i=0}^{k} (−1)^i [v_0, . . . , v̂_i, . . . , v_k] ∈ C_{k−1}(∆),
where the hat denotes removal. The map is then extended linearly to all of Ck (∆).
Thus ∂[σ] is the boundary of σ, expressed as a sum of (k − 1)-simplices with consistent orientations (as represented by the signs). Often it is convenient to abbreviate ∂_k by ∂, since either the subscript is clear from context or else we want to say something about all boundary maps at once.
The entire collection of data {C_k(∆), ∂_k} is called the simplicial chain complex of ∆. For example, if ∆ = ⟨123, 14, 24⟩, then the simplicial chain complex is

C_2 = R^1 --∂_2--> C_1 = R^5 --∂_1--> C_0 = R^4 --∂_0--> C_{−1} = R,

where, with respect to the ordered bases {123} of C_2; {12, 13, 14, 23, 24} of C_1; {1, 2, 3, 4} of C_0; and {∅} of C_{−1}, the boundary maps have matrices

∂_2 = [ 1  −1  0  1  0 ]^T   (rows indexed by 12, 13, 14, 23, 24),

∂_1 =
  [ −1  −1  −1   0   0 ]
  [  1   0   0  −1  −1 ]
  [  0   1   0   1   0 ]
  [  0   0   1   0   1 ]    (rows indexed by 1, 2, 3, 4),

∂_0 = [ 1  1  1  1 ].
The fundamental fact about boundary maps is that ∂_k ∘ ∂_{k+1} = 0 for all k, a fact that is frequently written without subscripts:

∂² = 0.

(This can be checked directly from the definition of ∂, and is a calculation that everyone should do for themselves once.) This is precisely what the term “chain complex” means in algebra.
An equivalent condition is that ker ∂_k ⊇ im ∂_{k+1} for all k, so the following quotients make sense: the reduced simplicial homology groups2 are

H̃_k(∆) = ker ∂_k / im ∂_{k+1}.
The H̃k (∆) are just R-vector spaces, so they can be described up to isomorphism by their dimensions3 , which
are called the Betti numbers βk (∆). They can be calculated using the rank-nullity formula: in general
βk (∆) = dim H̃k (∆) = dim ker ∂k − dim im ∂k+1 = fk − rank ∂k − rank ∂k+1 .
In the example above, this formula gives
β̃0 (∆) = 4 − 1 − 3 = 0, β̃1 (∆) = 5 − 3 − 1 = 1, β̃2 (∆) = 1 − 1 − 0 = 0
(note that ∂3 is the zero map).
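These ranks, and hence the Betti numbers, can be computed mechanically. The following sketch (mine, not from the text) rebuilds the boundary matrices of ∆ = ⟨123, 14, 24⟩ and recovers the numbers above.

```python
# Reduced Betti numbers of Delta = <123, 14, 24> from boundary-matrix ranks,
# via beta_k = f_k - rank d_k - rank d_{k+1}.
from itertools import combinations
from sympy import Matrix

faces = set()
for facet in [(1, 2, 3), (1, 4), (2, 4)]:
    for k in range(len(facet) + 1):
        faces.update(combinations(facet, k))
by_dim = {k: sorted(s for s in faces if len(s) == k + 1) for k in range(-1, 3)}

def boundary(k):  # matrix of d_k : C_k -> C_{k-1} in the face bases
    rows, cols = by_dim[k - 1], by_dim[k]
    M = Matrix.zeros(len(rows), len(cols))
    for j, s in enumerate(cols):
        for pos in range(len(s)):
            M[rows.index(s[:pos] + s[pos + 1:]), j] = (-1) ** pos
    return M

ranks = {k: boundary(k).rank() for k in range(0, 3)}
ranks[3] = 0  # d_3 is the zero map
for k in range(0, 3):
    print(k, len(by_dim[k]) - ranks[k] - ranks[k + 1])  # 0 0, 1 1, 2 0
```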
These numbers turn out to carry topological information about the space |∆|. In fact, they depend only
on the homotopy type of the space |∆|. This is a fundamental theorem in topology whose proof is far too
elaborate to give here,4 but provides a crucial tool for studying simplicial complexes: we can now ask how
the topology of ∆ affects its combinatorics. To begin with, the groups H̃k (∆) do not depend on the choice
of labeling of vertices and are invariant under retriangulation.
A complex all of whose reduced homology groups vanish is called acyclic. For example, if |∆| is contractible then ∆ is acyclic over every ring. If |∆| ≅ S^d (a d-dimensional sphere), then

H̃_k(∆) ≅ R if k = d,  and  H̃_k(∆) = 0 if k < d.  (6.1)
2 The unreduced homology groups H_k(∆) are defined by deleting C_{−1}(∆) from the simplicial chain complex. This results in an extra summand of R in H_0(∆) and has no effect elsewhere. Broadly speaking, reduced homology arises more naturally in combinatorics and unreduced homology is more natural in topology, but the information is equivalent.
3 This would not be true if we replaced R with a ring that is not a field. Actually, the most information is available over Z: in that case β_k(∆) can still be obtained as the rank of the free part of H̃_k(∆), but there also may be a torsion part.
4 Roughly, one defines a much more abstract set of invariants called singular homology groups, which are easily seen to
be topological invariants but are well-nigh impossible to work with directly; one then shows that repeatedly barycentrically
subdividing a space allows us to approximate singular homology by simplicial homology sufficiently accurately — but on the
other hand subdivision also preserves simplicial homology. See [Hat02, §2.1] for the full story. Or take my Math 821 class!
The (reduced) Euler characteristic of ∆ is

χ̃(∆) = \sum_{k ≥ 0} (−1)^k β_k(∆) = \sum_{i ≥ −1} (−1)^i f_i(∆).  (6.2)
The second equality here is called the Euler-Poincaré theorem; despite the fancy name, it is easy to prove
using little more than the rank-nullity theorem of linear algebra (Exercise 6.9). The Euler characteristic is
the single most important numerical invariant of ∆. Many combinatorial invariants can be computed by
identifying them as the Euler characteristic of a simplicial complex whose topology is known, often one that
is acyclic (χ̃ = 0), a sphere of dimension d (χ̃ = (−1)d ), or a wedge of spheres.
Observe that, for a vertex e of ∆,

χ̃(∆) = \sum_{σ∈∆: e∉σ} (−1)^{dim σ} + \sum_{σ∈∆: e∈σ} (−1)^{dim σ}
      = \sum_{σ∈del_∆(e)} (−1)^{dim σ} + \sum_{τ∈link_∆(e)} (−1)^{1+dim τ}
      = χ̃(del_∆(e)) − χ̃(link_∆(e)),

using in the middle step the bijection σ ↦ σ \ {e} between the faces containing e and the faces of link_∆(e).
Definition 6.3.1. Let ∆ be a simplicial complex on vertex set [n] and let R = k[x_1, . . . , x_n]. The Stanley-Reisner ideal of ∆ in R is

I_∆ = ⟨ x_σ : σ ∉ ∆ ⟩,  where x_σ = \prod_{i∈σ} x_i,

and the Stanley-Reisner ring (or face ring) is k[∆] := R/I_∆.

Example 6.3.2. Let ∆1 and ∆2 be the complexes on the previous page. Abbreviating w, x, y, z = x_1, x_2, x_3, x_4, the Stanley-Reisner ideal of ∆1 is

I_∆1 = ⟨wy, xyz⟩.

Note that the minimal generators of I_∆ are the minimal nonfaces of ∆. Similarly,

I_∆2 = ⟨wy, wxz, xyz⟩.

If ∆ is the simplex on [n] then it has no nonfaces, so I_∆ is the zero ideal and k[∆] = k[x_1, . . . , x_n]. In general, the more faces ∆ has, the bigger its Stanley-Reisner ring is. J
Since ∆ is a simplicial complex, the monomials in I∆ are exactly those whose support is not a face of ∆.
Therefore, the monomials supported on a face of ∆ are a natural vector space basis for the graded ring k[∆].
Its Hilbert series can be calculated by counting these monomials:

Hilb(k[∆], q) := \sum_{i≥0} q^i dim_k (k[∆])_i = \sum_{σ∈∆} \sum_{monomials µ: supp µ = σ} q^{deg µ}
 = \sum_{σ∈∆} \left( \frac{q}{1−q} \right)^{|σ|}
 = \sum_{i=0}^{d} f_{i−1} \frac{q^i}{(1−q)^i}
 = \frac{\sum_{i=0}^{d} f_{i−1} q^i (1−q)^{d−i}}{(1−q)^d}
 = \frac{\sum_{i=0}^{d} h_i q^i}{(1−q)^d}.
The numerator of this rational expression is a polynomial in q, called the h-polynomial of ∆ and written
h∆ (q), and its list of coefficients (h0 , h1 , . . . ) is called the h-vector of ∆. Clearing denominators and applying
the binomial theorem yields a formula for the h-numbers in terms of the f -numbers:
\sum_{i=0}^{d} h_i q^i = \sum_{i=0}^{d} f_{i−1} q^i (1−q)^{d−i} = \sum_{i=0}^{d} \sum_{j=0}^{d−i} \binom{d−i}{j} (−1)^j q^{i+j} f_{i−1},
and now extracting the q^k coefficient (i.e., the summand in the second sum with j = k − i) yields

h_k = \sum_{i=0}^{d} (−1)^{k−i} \binom{d−i}{k−i} f_{i−1}.  (6.4)
(Note that the upper limit of summation might as well be k instead of d, since the binomial coefficient in
the summand vanishes for i > k.) These equations can be solved to give the f ’s in terms of the h’s.
f_{i−1} = \sum_{k=0}^{i} \binom{d−k}{i−k} h_k.  (6.5)
k=0
So the f -vector and h-vector contain equivalent information about a complex. On the level of generating
functions, the conversions look like this [BH93, p. 213]:
X X
hi q i = fi−1 q i (1 − q)d−i , (6.6)
i i
X X
i
fi q = hi q i−1 (1 + q)d−i . (6.7)
i i
The equalities (6.4) and (6.5) can be obtained by applying the binomial theorem to the right-hand sides
of (6.6) and (6.7) and equating coefficients. Note that it is most convenient simply to sum over all i ∈ Z.
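The conversion is mechanical. Here is a sketch of (6.4) (my own code, not from the text, with f indexed so that f[i] = f_{i−1}), checked against the octahedron of Figure 6.2 and the bowtie of Example 6.3.3 below.

```python
# f-vector -> h-vector via (6.4); input f = (f_{-1}, f_0, ..., f_{d-1}).
from math import comb

def f_to_h(f):
    d = len(f) - 1
    return [sum((-1) ** (k - i) * comb(d - i, k - i) * f[i]
                for i in range(k + 1))
            for k in range(d + 1)]

print(f_to_h([1, 6, 12, 8]))  # octahedron boundary: [1, 3, 3, 1]
print(f_to_h([1, 5, 6, 2]))   # the bowtie of Example 6.3.3: [1, 2, -1, 0]
```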
Let’s go back to the formula for the Hilbert series in terms of the h-vector, namely

Hilb(k[∆], q) = \frac{h_∆(q)}{(1−q)^d} = \frac{\sum_{i=0}^{d} h_i q^i}{(1−q)^d}.
Note that 1/(1 − q)d is just the Hilbert series of the polynomial ring k[x1 , . . . , xn ]. More generally, if R is
any graded ring and x is an indeterminate of degree 1, then
Hilb(R[x], q) = \frac{Hilb(R, q)}{1 − q}.
These observations suggest that we should be able to regard the Stanley-Reisner ring k[∆] as a polynomial
ring in d variables over a base ring S whose Hilbert series is the polynomial h∆ (q). In particular S would
have to be a finite-dimensional vector space and hi the dimension of its ith graded piece. Also, we should
be able to recover S by quotienting out by d linear forms, each of which would remove a factor of 1/(1 − q)
from the Hilbert series. For this to work, all of the hi ’s would have to be nonnegative, which does not always
happen. The situation is hopeless if ∆ is not pure, and even purity is not enough, as the next example shows.
Example 6.3.3. The bowtie complex is the pure 2-dimensional complex ∆ = ⟨123, 145⟩ shown below, with f-vector (1, 5, 6, 2). Therefore, by (6.6), the h-polynomial is

\sum_i h_i q^i = q^0(1−q)^3 + 5q(1−q)^2 + 6q^2(1−q) + 2q^3 = 1 + 2q − q^2,

so the h-vector is (1, 2, −1, 0), which has a negative entry.

[Figure: the bowtie — triangles 123 and 145 sharing only the vertex 1.]  J
A simplicial complex where this game works is called a Cohen-Macaulay complex. This is an extremely good
condition from the algebraic point of view, but what can we say about the combinatorics of such complexes?
For certain complexes, the h-numbers themselves have a direct combinatorial interpretation. The last formula
suggests that they should enumerate facets of a pure complex in some way. Here is an important special
class of complexes where they do.
Definition 6.4.1. A pure simplicial complex ∆^{d−1} is shellable if its facets can be ordered F_1, . . . , F_n such that either of the following equivalent conditions is satisfied:

1. For every i ∈ [n], the set Ψ_i = ⟨F_i⟩ \ ⟨F_1, . . . , F_{i−1}⟩ has a unique minimal element R_i.
2. For every i > 1, the complex Φ_i = ⟨F_i⟩ ∩ ⟨F_1, . . . , F_{i−1}⟩ is pure of dimension d − 2.
The proof of equivalence is left as an exercise.
Example 6.4.2. The bipyramid is the pure 2-dimensional complex B with 6 facets 124, 134, 234, 125,
135, 235. Vertices 1,2,3 form the “equator”; vertices 4 and 5 are the “poles”. The complex B has many
shelling orders, one of which is 234, 124, 134, 235, 125, 135. The bipyramid and its shelling decomposition are shown in Figure 6.1. The new edges created upon adding
each triangle are indicated in bold. The corresponding decomposition of the face poset is
[∅, 234] ∪ [1, 124] ∪ [13, 134] ∪ [5, 235] ∪ [15, 125] ∪ [135, 135]
as shown in the figure (each face is color-coded according to the interval [Ri , Fi ] that contains it). J
[Figure 6.1: the bipyramid B and the decomposition of its face poset into the intervals [R_i, F_i], each face color-coded by its interval.]
Figure 6.2 shows another example that shows how a shelling builds up a simplicial complex (in this case the
boundary of an octahedron) one step at a time. Note that each time a new triangle is attached, there is a
unique minimal new face.
Proposition 6.4.3. Let ∆^{d−1} be shellable, with h-vector (h_0, . . . , h_d). Then

h_j = #{F_i : |R_i| = j} = #{F_i : ⟨F_i⟩ ∩ ⟨F_1, . . . , F_{i−1}⟩ has exactly j faces of dimension d − 2}.
The proof is left as an exercise. One consequence is that the h-vector of a shellable complex is strictly
nonnegative, since its coefficients count something. This statement is emphatically not true about the Hilbert
series of arbitrary graded rings, or even arbitrary Stanley-Reisner rings of pure complexes (see Example 6.3.3
above).
If a simplicial complex is shellable, then its Stanley-Reisner ring is Cohen-Macaulay (CM). This is an
important and subtle algebraic condition that can be expressed algebraically in terms of depth or local
cohomology (topics beyond the scope of these notes) or in terms of simplicial homology (coming shortly).
Shellability is the most common combinatorial technique for proving that a ring is CM. The constraints on
the h-vectors of CM complexes are the same as those on shellable complexes, although it is an open problem
to give a general combinatorial interpretation of the h-vector of a CM complex.
[Figure 6.2 shows the eight facets being attached one at a time; the table below records the minimal new face R_i created at each step.]

  i | F_i | R_i | |R_i|
  1 | abc | ∅   |  0
  2 | abd | d   |  1
  3 | bde | e   |  1
  4 | bce | ce  |  2
  5 | acf | f   |  1
  6 | cef | ef  |  2
  7 | adf | df  |  2
  8 | def | def |  3
Figure 6.2: A step-by-step shelling of the octahedron with vertices a,b,c,d,e,f. Facets are labeled 1 . . . 8 in
shelling order. Enumerating the sets Ri by cardinality gives the h-vector (1, 3, 3, 1).
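The restriction sets R_i are easy to recompute by machine: at each step, take the unique minimal face not previously seen. The following sketch (mine, not from the text) redoes the computation in the table above.

```python
# Restriction sets R_i and h-vector for the octahedron shelling of Fig. 6.2.
from itertools import combinations

def restriction_sets(facets):
    seen, Rs = set(), []
    for F in facets:
        subs = [s for k in range(len(F) + 1)
                for s in combinations(sorted(F), k)]
        new = [s for s in subs if s not in seen]
        Rs.append(min(new, key=len))  # unique minimal new face, if a shelling
        seen.update(subs)
    return Rs

facets = ['abc', 'abd', 'bde', 'bce', 'acf', 'cef', 'adf', 'def']
Rs = restriction_sets(facets)
print([''.join(R) for R in Rs])  # ['', 'd', 'e', 'ce', 'f', 'ef', 'df', 'def']
print([sum(len(R) == j for R in Rs) for j in range(4)])  # h = [1, 3, 3, 1]
```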
Proposition 6.4.4 (Reisner’s theorem). A simplicial complex ∆ is Cohen-Macaulay over R iff (a) ∆ is pure (so that dim link_∆(σ) = dim ∆ − dim σ − 1 for all σ) and (b) for every face σ ∈ ∆, one has

H̃_k(link_∆(σ); R) = 0  for all k < dim ∆ − dim σ − 1.
Reisner’s theorem can be used to prove that shellable complexes are Cohen-Macaulay. The other ingredient
of this proof is a Mayer-Vietoris sequence, which is a standard tool in topology that functions sort of like an
inclusion/exclusion principle for homology groups, relating the homology groups of X, Y , X ∪ Y and X ∩ Y .
Here we can take X to be the subcomplex generated by the first n − 1 facets in shelling order and Y the nth
facet; the shelling condition says that the intersections and their links are extremely well-behaved, so that
Reisner’s condition can be established by induction on n.
Reisner’s theorem often functions as a working definition of the Cohen-Macaulay condition for combinatori-
alists. The vanishing condition says that every link has the homology type of a wedge of spheres of the
appropriate dimension. (The wedge sum of a collection of spaces is obtained by identifying a point of each;
for example, the wedge of n circles looks like a flower with n petals. Reduced homology is additive on wedge
sums, so by (6.1) the wedge sum of n copies of Sd has reduced homology Rn in dimension d, and 0 in other
dimensions.)
A Cohen-Macaulay complex ∆ is Gorenstein (over R) if in addition H̃_{dim ∆ − dim σ − 1}(link_∆(σ); R) ≅ R for all σ. That is, every link has the homology type of a sphere. This is very close to being a manifold. (I don’t know offhand of a Gorenstein complex that is not a manifold, although I’m sure examples exist.)
A simplicial complex ∆ on vertex set E is a matroid complex if it is the family of independent sets of some
matroid M on E (see Defn. 3.4.1); in this case we write ∆ = I (M ). Many of the standard constructions of
matroid theory can be translated into simplicial complex language.
Say that a complex ∆ has property P hereditarily if every induced subcomplex ∆|X has property P;
for example, we have already seen that matroid complexes are hereditarily pure. (Note that the induced
subcomplex of ∆ on its entire vertex set is just itself, so if ∆ has P hereditarily then in particular it has P.)
Theorem 6.5.1. Let ∆ be an abstract simplicial complex on E. The following are equivalent:

1. ∆ is a matroid complex;
2. ∆ is hereditarily shellable;
3. ∆ is hereditarily Cohen-Macaulay;
4. ∆ is hereditarily pure.
Proof. The implications (2) =⇒ (3) =⇒ (4) are consequences of the material in Chapter 6 (the first is a homework problem and the second is easy).
(4) =⇒ (1): Suppose I, J are independent sets with |I| < |J|. Then the induced subcomplex ∆|I∪J is pure,
which means that I is not a maximal face of it. Therefore there is some x ∈ (I ∪ J) \ I = J \ I such that
I ∪ x ∈ ∆, establishing (I3).
(1) =⇒ (4): Let F ⊆ E. If I is a non-maximum face of ∆|F, then we can pick J to be a maximum face, and then (I3) says that there is some x ∈ J \ I such that I + x is a face of ∆, hence of ∆|F. Iterating, every face of ∆|F extends to a maximum face, so ∆|F is pure.
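Condition (4) is easy to test by brute force on small examples. Here is a minimal Python sketch (the function names are ours) that checks hereditary purity of a complex given by its facets:

    from itertools import combinations

    def faces(facets):
        out = set()
        for F in facets:
            for k in range(len(F) + 1):
                out.update(combinations(sorted(F), k))
        return out

    def hereditarily_pure(facets, vertices):
        """Condition (4): every induced subcomplex is pure."""
        all_faces = faces(facets)
        for k in range(len(vertices) + 1):
            for X in map(set, combinations(vertices, k)):
                induced = [f for f in all_faces if set(f) <= X]
                maximal = [f for f in induced
                           if not any(set(f) < set(g) for g in induced)]
                if len({len(f) for f in maximal}) > 1:
                    return False
        return True

    print(hereditarily_pure(["ab", "bc"], "abc"))    # True: a matroid complex
    print(hereditarily_pure(["ab", "cd"], "abcd"))   # False: basis exchange fails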
To be written
6.7 Exercises
Exercise 6.1. Let ∆ be a simplicial complex on vertex set V , and let v0 ∉ V . The cone over ∆ is the
simplicial complex C∆ generated by all faces σ + v0 for σ ∈ ∆.
Exercise 6.2. Let ∆ be a graph (that is, a 1-dimensional simplicial complex) with c components, v vertices,
and e edges. Determine the isomorphism types of the simplicial homology groups H̃0 (∆; R) and H̃1 (∆; R)
for any coefficient ring R.
Exercise 6.3. Construct two simplicial complexes with the same f -vector such that one is shellable and
one isn’t.
Exercise 6.4. Prove that the two conditions in the definition of shellability (Defn. 6.4.1) are equivalent.
Exercise 6.5. Prove Proposition 6.4.3.
Exercise 6.6. Prove that the link operation commutes with union and intersection of complexes. That is,
if X, Y are simplicial complexes that are subcomplexes of a larger complex X ∪ Y , and σ ∈ X ∪ Y , then
prove that
linkX∪Y (σ) = linkX (σ) ∪ linkY (σ) and linkX∩Y (σ) = linkX (σ) ∩ linkY (σ).
Exercise 6.7. (Requires some experience with homological algebra.) Prove that shellable simplicial com-
plexes are Cohen-Macaulay. (Hint: First do the previous problem. Then use a Mayer-Vietoris sequence.)
Exercise 6.8. Complete the proof of Theorem 6.5.1 by showing that hereditarily pure simplicial complexes
are shellable. (Hint: Pick a vertex v. Show that the two complexes
are both shellable. Then concatenate the shelling orders to produce a shelling order on ∆. You will probably
need Exercise 6.1.) As a consequence of the construction, derive a relationship among the h-polynomials of
∆, ∆1 , and ∆2 .
Exercise 6.9. Prove the Euler-Poincaré formula:
χ̃(∆) = Σ_{k≥−1} (−1)^k dim_k H̃_k(∆; k).
(Despite the appearance of homology, all you really need is the rank-nullity theorem from linear algebra.
The choice of ground field k is immaterial, but you can take it to be R if you want.)
Exercise 6.10. Express the h-vector of a matroid complex in terms of the Tutte polynomial of the underlying
matroid. (Hint: First figure out a deletion/contraction recurrence for the h-vector, using Exercise 6.8.)
Exercise 6.11. Let V = {x1, y1, x2, y2, . . . , xn, yn}. Consider the simplicial complex

∆n = {σ ⊆ V : {xi, yi} ⊄ σ ∀i ∈ [n]}.
(In fact, ∆n is the boundary sphere of the crosspolytope, the convex hull of the standard basis vectors and
their negatives in Rn .) Determine the f - and h-vectors of ∆n .
Exercise 6.12. More generally, let V = V1 ∪ · · · ∪ Vn be a disjoint union of vertex sets with |Vi| = ci, and consider the simplicial complex

∆(c1, . . . , cn) = {σ ⊆ V : |σ ∩ Vi| ≤ 1 ∀i ∈ [n]}.
(The previous problem is the case that ci = 2 for all i.) Show that ∆(c1 , . . . , cn ) is shellable. Determine its
f - and h-vectors.
Chapter 7

Polytopes
Polytopes are familiar objects such as cubes, pyramids, and Platonic solids. They are central in linear
programming and therefore in optimization, and exhibit a wealth of nice combinatorics. The classic book
on polytopes is Grünbaum [Grü03]; an equally valuable, more recent reference is Ziegler [Zie95]. A good
reference for the basics is chapter 2 of Schrijver’s notes [Sch13].
First some key terms. A subset S ⊆ Rn is convex if, for any two points in S, the line segment joining them
is also a subset of S. The smallest convex set containing a given set T is called its convex hull, denoted
conv(T ). Explicitly, one can show (exercise; not too hard) that
conv(x1, . . . , xr) = { c1x1 + · · · + crxr : 0 ≤ ci ≤ 1 for all i and Σ_{i=1}^{r} ci = 1 }.    (7.1)
These points are called convex linear combinations of the xi . A related definition is the affine hull of
a point set, which is the smallest affine linear space containing it:
aff(x1, . . . , xr) = { c1x1 + · · · + crxr : Σ_{i=1}^{r} ci = 1 }.    (7.2)
The interior of S as a subspace of its affine span is called the relative interior of S, denoted relint S.
This concept is necessary to talk about interiors of different-dimensional polyhedra in a sensible way. For
example, the closed line segment S = {(x, 0) : 0 ≤ x ≤ 1} in R2 has empty interior as a subset of R2 , but
its affine span is the x-axis, so relint S = {(x, 0) : 0 < x < 1}.
Clearly conv(T ) ⊆ aff(T ) (in fact, the inclusion is strict if 1 < |T | < ∞). For example, the convex hull of
three non-collinear points in Rn is a triangle, while their affine hull is the unique plane (i.e., affine 2-space)
containing that triangle.
Definition 7.1.1. A polyhedron P is a nonempty intersection of finitely many closed half-spaces in Rn .
Equivalently,
P = {x ∈ Rn : ai1 x1 + · · · + ain xn ≥ bi ∀i ∈ [m]}
where aij, bi ∈ R. These inequalities are often written as a single matrix inequality Ax ≥ b, where A ∈ R^{m×n} and b ∈ R^m.
Definition 7.1.2. A polytope in Rn is. . .
1. a bounded polyhedron; or
2. the convex hull of a finite set of points.
Theorem 7.1.3 (The Fundamental Theorem of Polytopes). The two definitions of “polytope” in Defini-
tion 7.1.2 are equivalent.
Definition 7.1.5. A face of P is a subset of the form F = {x ∈ P : ℓ(x) = max_{y∈P} ℓ(y)} for some linear functional ℓ on R^n; a face is proper if it is not all of P. The only improper face is P itself. Note that the union of all proper faces is the topological boundary ∂P (proof left as an exercise).
As a concrete example, suppose P is a polytope in R3 . What point or set of points is highest? In other
words, what points maximize the linear functional (x, y, z) 7→ z? The answer to this question might be a
single vertex, or an edge, or a polygonal face. Of course, there is nothing special about the z-direction. For
any direction given by a nonzero vector v, the extreme points of P in that direction are by definition the
maxima of the linear functional x 7→ x · v, and the set of those points is a face.
If you pick a linear functional “at random”, then with probability 1, the face it determines will be a vertex
of P . Higher-dimensional faces correspond to more special directions.
Proposition 7.1.6. Let P ⊆ Rn be a polyhedron. Then:

1. Every face of P is itself a polyhedron, and a face of a face of P is a face of P.
2. The intersection of two faces of P, if nonempty, is again a face.
3. Every point x ∈ P lies in a unique minimal face Fx.
4. x ∈ relint Fx for every x ∈ P.
5. The vertices (0-dimensional faces) of P are exactly the points of P that are not convex combinations of other points of P.
6. The faces of P, ordered by inclusion, form a ranked lattice.
Proof. (1) Each face is defined by adding a single linear inequality to the list of inequalities defining P .
(2) Let F′, F′′ be two faces maximized by linear functionals ℓ′, ℓ′′ respectively, and suppose that F′ ∩ F′′ contains a point x. Let F be the face maximized by the functional ℓ = ℓ′ + ℓ′′ (in fact any positive linear combination of ℓ′, ℓ′′ will do). Then x is a global maximum of ℓ on P, and since x also maximizes both ℓ′ and ℓ′′, the face F consists exactly of those points of P maximizing both ℓ′ and ℓ′′. In other words, F = F′ ∩ F′′, as desired.
(3) By (2), the desired face Fx is the intersection of all faces containing x.
(4) If x ∈ ∂Fx then Fx has a face G containing x, but G is also a face of P by (1), which contradicts the
definition of Fx .
for all i — but then yi = x for all i by assumption. Therefore x is not in the convex hull of P \ {x}, hence
is a vertex.
On the other hand, if x is a point of P that is not a 0-dimensional face, then by (4) x ∈ relint Fx with dim Fx ≥ 1, hence x is a convex combination of two other points of P (such as x ± εv for a suitable vector v and sufficiently small ε > 0).
(6) The faces form a bounded poset under inclusion, and by (2) it is a meet-semilattice, hence a lattice by Prop. 1.2.9. The proof that it is ranked is omitted.
Sketch of proof of Theorem 7.1.3. Some details are left as an exercise. A full proof appears in [Sch13, §2.2].

First, let P be an intersection of finitely many half-spaces, i.e., P = {x ∈ R^n : Ax ≤ b}, where A ∈ R^{m×n} and b ∈ R^{m×1}. By projecting onto the orthogonal complement of the rowspace of A, we can assume WLOG that rank A = n. For each point z ∈ P, let A_z be the submatrix of A consisting of rows a_i for which a_i · z = b_i. One must show that

z is a vertex of P if and only if rank A_z = n.    (7.3)

It follows that the vertices are all of the form A_R^{−1} b_R, where R is a row basis of A and A_R, b_R denote restrictions. Not every point of this form necessarily lies in P, but this argument does show that there are only finitely many vertices v_1, . . . , v_k (specifically, k ≤ (m choose n)). So far, this argument applies to any polyhedron. In the next step, one must show that

P = conv{v_1, . . . , v_k}    (7.4)

using in addition the assumption that P is bounded.
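The first half of this argument is easy to carry out by machine. The following Python sketch (names and the example are ours) enumerates the candidate points A_R^{−1} b_R over all row pairs R for n = 2 and keeps those lying in P, recovering the vertices of the unit square:

    from itertools import combinations
    from fractions import Fraction

    # P = {x : Ax <= b}: the unit square in R^2
    A = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    b = [1, 0, 1, 0]

    def solve2(rows):
        """Solve the 2x2 system given by the chosen pair of rows (Cramer's rule)."""
        (a11, a12), (a21, a22) = (A[i] for i in rows)
        det = a11 * a22 - a12 * a21
        if det == 0:
            return None
        b1, b2 = (b[i] for i in rows)
        return (Fraction(b1 * a22 - b2 * a12, det),
                Fraction(a11 * b2 - a21 * b1, det))

    candidates = {solve2(R) for R in combinations(range(4), 2)}
    verts = {v for v in candidates
             if v is not None and all(a[0] * v[0] + a[1] * v[1] <= bb
                                      for a, bb in zip(A, b))}
    print(sorted(verts))   # the four corners (0,0), (0,1), (1,0), (1,1)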
Second, let P = conv(z1 , . . . , zr ) ⊆ Rn . Assume without loss of generality that aff(P ) = Rn (otherwise,
replace Rn with the affine hull) and that the origin is in the interior of P (translating if necessary). Define
P ∗ := {y ∈ Rn : x · y ≤ 1 ∀x ∈ P }. (7.5)
This is called the (polar) dual of P . One must show that in fact
P ∗ = {y ∈ Rn : zi · y ≤ 1 ∀i ∈ [r]} (7.6)
which means that P ∗ is an intersection of finitely many half-spaces. So, by the first part of the theorem,
P ∗ = conv(y1 , . . . , ys ) for some y1 , . . . , ys . Meanwhile, the double dual P ∗∗ = (P ∗ )∗ is defined by
P ∗∗ = {x ∈ Rn : x · y ≤ 1 ∀y ∈ P ∗ } = {x ∈ Rn : x · yj ≤ 1 ∀j ∈ [s]} (7.7)
where the second equality comes from applying the previous observation. Now one must show that
P = P ∗∗ (7.8)
If we number the rows R0 , . . . , R4 , every pair of rows other than {R0 , R2 } and {R1 , R3 } is of full rank. The
points corresponding to the other eight pairs of rows are:
Thus the vertices of P correspond to the bounding hyperplanes (i.e., lines) of P ∗ , and vice versa. J
[Figure: the polygon P and its polar dual P∗, drawn in the square [−1, 1]².]
• A facet of P is a face of codimension 1 (that is, dimension n − 1, if P is full-dimensional). In this case there is a unique linear functional (up to scaling) maximized exactly on that facet, given by the outward normal vector of the corresponding bounding hyperplane. Faces of codimension 2 are called ridges and faces of codimension 3 are sometimes called peaks.
• A supporting hyperplane of P is a hyperplane that meets P in a nonempty face.
• P is simplicial if every proper face is a simplex. For example, every 2-dimensional polytope is simplicial, but of the Platonic solids in R³, only the tetrahedron, octahedron and icosahedron are simplicial — the cube and dodecahedron are not. The boundary of a simplicial polytope is thus a simplicial (n − 1)-sphere.
• P is simple if every vertex belongs to exactly n facets. (In fact no vertex can belong to fewer than n facets.)
Proposition 7.1.9. A polytope P is simple if and only if its dual P ∗ is simplicial. In this case the face
poset of P is the dual of the face poset of P ∗ .
One of the big questions about polytopes is to classify their possible f -vectors and, more generally, the
structure of their face posets. Here is a result of paramount importance.
Theorem 7.2.1. Let ∆ be the boundary sphere of a convex simplicial polytope P ⊆ Rd . Then ∆ is shellable,
and its h-vector is a palindrome, i.e., hi = hd−i for all i.
These equations are the Dehn-Sommerville relations. They were first proved early in the 20th century,
but the following proof, due to Bruggesser and Mani [BM71], is undoubtedly the one in the Book.
Sketch of proof. Let H be the collection of hyperplanes spanned by facets of P . Let ` be a line that passes
through the interior of P and meets each hyperplane in H in a distinct point. (Note that almost any line
will do.) Imagine walking along this line, starting just outside P so that only one facet is visible. Call that
facet F1 . As you continue to walk, more and more facets become visible. Label the facets F2 , . . . , Fm in
the order in which they appear (equivalently, order them in the order in which the line ` meets their affine
spans). When you get to infinity, come back the other way (so that all of a sudden “invisible” and “visible”
switch meanings) and continue to label the facets Fm+1 , . . . , Fn in the order in which they disappear.
[Figure: two stages of the walk along the line ℓ, showing the facets F1, F2, . . . , F5 becoming visible in order.]
The resulting order F1, . . . , Fn is a shelling order. (This assertion does need to be checked.) Moreover, each facet Fi contributes to hk(P), where k is the number of previously labeled facets with which Fi shares a ridge.
On the other hand, the reversal of < is another instance of this construction, hence is also a shelling order.
Since each facet shares a ridge with exactly n other facets (because P is simplicial!), the previous formula
says that if facet F contributes to hi with respect to the first shelling order then it contributes to hn−i in
its reversal. Since the h-vector is an invariant of P , it follows that hi = hn−i for all i.
The Dehn-Sommerville relations are a basic tool in classifying h-vectors, and therefore f -vectors, of simplicial
polytopes. Since h0 = 1 for shellable complexes, it follows immediately that the only possible h-vectors for
simplicial polytopes in R2 and R3 are (1, k, 1) and (1, k, k, 1), respectively (where k is a positive integer),
and in particular the number of facets determines the h-vector (which is not the case in higher dimensions).
Recall from Definition 7.1.5 that a face of a polyhedron P ⊂ Rn is defined as the subset of P that maximizes
a linear functional. We can get a lot of mileage out of classifying linear functionals by which face of P they
maximize. The resulting structure N (P ) is called the normal fan of P . (Technical note: officially N (P )
is a structure on the dual space (Rn )∗ , but we typically identify (Rn )∗ with Rn by declaring the standard
basis to be orthonormal — equivalently, letting each vector in Rn act by the standard dot product.)
Given a face F ⊂ P , let σF be the collection of linear functionals maximized on F . As we will see, the sets
σF are in fact the interiors of cones (convex unions of rays from the origin).
Example 7.3.1. Let P = conv{(1, 1), (1, −1), (−1, 1)} ⊂ R2 . The polytope and its normal fan are shown
below.
[Figure: the triangle P, with vertices x, y, z and edges Q, R, S, shown beside its normal fan N(P): three two-dimensional cones σx, σy, σz separated by the rays σQ, σR, σS.]
The word “fan” means “collection of cones”. Multiplying a linear functional by a positive scalar does not change the face on which it is maximized, and if ℓ and ℓ′ are linear functionals maximized on the same face, then so is every functional aℓ + bℓ′, where a, b are positive scalars. Therefore, each σF is a cone.
vertices x, y, z correspond to the 2-dimensional cones, the edges Q, R, S to 1-dimensional cones (a.k.a. rays)
and the polytope P itself to the trivial cone consisting of the origin alone. In general, if F is a face of a
polytope P ⊆ Rn , then
dim σF = n − dim F. (7.9)
J
Example 7.3.2 (The normal fan of an unbounded polyhedron). Let P be the unbounded polyhedron defined
by the inequalities x ≤ 1, y ≤ 1, x + y ≤ 1 (so its vertices are x = (0, 1) and y = (1, 0)). The polytope and
its normal fan are shown below.
[Figure: the unbounded polyhedron P, with vertices x, y and edges Q, R, S, shown beside its (incomplete) normal fan N(P), whose cones σx, σy, σQ, σR, σS cover only the first quadrant.]
This normal fan is incomplete: it does not cover every linear functional in (R2 )∗ , only the ones that have a
well-defined maximum on P (in this case, those in the first quadrant). It is not hard to see that the normal
fan of a polyhedron is complete if and only if the polyhedron is bounded, i.e., a polytope. The dimension
formula for normal cones (7.9) is still valid in the unbounded case. J
In general the normal fan of a polytope can be quite complicated, and there exist fans in Rn that are not the
normal fans of any polytope, even for n = 3; see, e.g., [Zie95, Example 7.5]. However, for some polytopes,
we can describe the normal fan using other combinatorics, such as the following important class.
Definition 7.3.3. A polytope P ⊆ Rn is a generalized permutahedron if its normal fan is a coarsening
of the braid fan (i.e., the fan of faces of the braid arrangement). Equivalently, for every linear functional
`(x) = a1 x1 + · · · + an xn the face of P maximized by ` is determined solely by the equalities and inequalities
among the coefficients ai .
The theory of generalized permutahedra is usually considered to have started with Postnikov's paper [Pos09]; other important sources include [PRW08] and [AA17]. Edmonds considered equivalent objects earlier under the name “polymatroids”.
Theorem 7.3.4. A polytope P ⊆ Rn is a generalized permutahedron if and only if every edge of P is parallel
to ei − ej for some i, j, where {e1 , . . . , en } is the standard basis.
Generalized permutahedra can also be described as certain degenerations of the standard permutahedron,
which is the convex hull of the vectors (w1 , . . . , wn ), where w ranges over all permutations of [n]. The normal
fan of the standard permutahedron is precisely the braid fan.
One important family of generalized permutahedra consists of matroid base polytopes. Given a matroid M on
ground set [n], let P be the convex hull of all characteristic vectors of bases of M . It turns out that P is
a generalized permutahedron; in fact, the matroid base polytopes are exactly the generalized permutahedra
whose vertices are 0/1 vectors [GGMS87, Thm. 4.1]. Describing the faces of matroid polytopes in terms of
the combinatorics of the matroid is an interesting and difficult problem; see [FS05].
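As a quick computational illustration of Theorem 7.3.4 for matroid base polytopes (using the standard fact, cf. [GGMS87], that edges of the base polytope join bases differing by a single exchange), here is a Python sketch for the uniform matroid U2,4; all names are illustrative:

    from itertools import combinations

    ground = range(4)
    bases = [set(B) for B in combinations(ground, 2)]   # bases of U_{2,4}

    def chi(B):
        """Characteristic (0/1) vector of a basis."""
        return tuple(1 if i in B else 0 for i in ground)

    # Edges of the base polytope join bases differing by a single exchange,
    # i.e., pairs of bases whose symmetric difference has size 2.
    for B1, B2 in combinations(bases, 2):
        if len(B1 ^ B2) == 2:
            direction = tuple(u - v for u, v in zip(chi(B1), chi(B2)))
            assert sorted(direction) == [-1, 0, 0, 1]   # parallel to e_i - e_j

    print("all edges parallel to some e_i - e_j")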
The central problem considered in this section is the following: How many integer or rational points
are in a convex polytope?
Definition 7.4.1. A polytope P ⊆ RN is integral (resp. rational) if and only if all vertices of P have
integer (resp. rational) coordinates.
For a set P ⊆ RN and a positive integer n, let nP = {nx : x ∈ P }. (nP is called a dilation of P .)
The (relative) boundary of P, written ∂P, is the union of the proper faces of P; equivalently, it is the set of points x ∈ P such that for every ε > 0, the ball of radius ε around x, intersected with aff(P), contains both points of P and points of aff(P) not in P. The (relative) interior of P, int P, is P \ ∂P.
Define

i(P, n) = |nP ∩ Z^N|    and    i∗(P, n) = |n(int P) ∩ Z^N|.
Thus i(P, n) is the number of integer points in nP or, equivalently, the number of rational points in P of the form (a1/n, a2/n, . . . , aN/n) with a1, . . . , aN ∈ Z. Our goal is to understand the functions i(P, n) and i∗(P, n).
We start with P a simplex, and with an easy example. Let
P = conv{(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)} ⊆ R³.
Then
nP = conv{(0, 0, 0), (n, n, 0), (n, 0, n), (0, n, n)}.
Each point in nP can be written as β1(n, n, 0) + β2(n, 0, n) + β3(0, n, n) + β4(0, 0, 0), with 0 ≤ βi ≤ 1 and Σβi = 1; or, alternatively, as α1(1, 1, 0) + α2(1, 0, 1) + α3(0, 1, 1), with 0 ≤ αi ≤ n and Σαi ≤ n.
Case 1. If the αi are all integers, the resulting points are integer points and the sum of the coordinates is
even. How many such points are there? The answer is the number of monomials in four variables of degree n, that is, (n+3 choose 3). However, there are other integer points in nP.
Case 2. We can allow the fractional part of αi to be 1/2. If any one of the αi has fractional part 1/2, the
others must be also. Writing γi = αi − 1/2, we get points of the form
(γ1 + 1/2)(1, 1, 0) + (γ2 + 1/2)(1, 0, 1) + (γ3 + 1/2)(0, 1, 1)
= γ1 (1, 1, 0) + γ2 (1, 0, 1) + γ3 (0, 1, 1) + (1, 1, 1).
Note here that Σγi = (Σαi) − 3/2 ≤ n − 3/2. Since the γi are integers, Σγi ≤ n − 2. So the number of these points equals the number of monomials in four variables of degree n − 2, that is, (n+1 choose 3). Thus i(P, n) = (n+3 choose 3) + (n+1 choose 3), a polynomial in n.
Note that all the points in Case 2 are interior points, because each αi = γi + 1/2 > 0 and their sum Σαi is at most n − 2 + 3/2 < n. A point in Case 1 is an interior point if and only if all the αi > 0 and Σαi < n. The four-tuples (α1 − 1, α2 − 1, α3 − 1, n − 1 − Σαi) correspond to monomials in four variables of degree n − 4; there are (n−1 choose 3) of them. Thus we get

i∗(P, n) = (n−1 choose 3) + (n+1 choose 3) = (1/3)n³ − n² + (5/3)n − 1,
another polynomial. (Anything else you notice? Is it a coincidence?)
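The count above is easy to confirm by brute force. The following Python sketch (our own naming) counts lattice points of nP directly, using the inequality description P = {x + y + z ≤ 2, x ≤ y + z, y ≤ x + z, z ≤ x + y}, which one can check cuts out exactly this tetrahedron:

    from math import comb

    def i_P(n):
        """Number of lattice points in nP for the tetrahedron above."""
        return sum(1
                   for x in range(n + 1) for y in range(n + 1) for z in range(n + 1)
                   if x + y + z <= 2 * n
                   and x <= y + z and y <= x + z and z <= x + y)

    # matches binom(n+3, 3) + binom(n+1, 3) for small n
    print(all(i_P(n) == comb(n + 3, 3) + comb(n + 1, 3) for n in range(8)))  # True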
Let P̃ = {(x, 1) : x ∈ P} ⊆ R^{N+1}, and let C = C(P̃) be the cone over P̃, i.e., the set of all nonnegative scalar multiples of points of P̃. We can represent all points in the cone in terms of the vertices of P.
Proposition 7.4.2. Let P be a rational N-simplex in R^N, with vertices v0, v1, . . . , vN, and let C = C(P̃). A point z ∈ R^{N+1} is a rational point in C if and only if z = Σ_{i=0}^{N} ci(vi, 1) for some nonnegative rational numbers ci. Furthermore, this representation of z is unique.
Now suppose P is integral, and let Q = {Σ_{i=0}^{N} ri(vi, 1) : 0 ≤ ri < 1 for each i}, a half-open parallelepiped; every integer point of C can be written uniquely as y + Σ_{i=0}^{N} ci(vi, 1) with y ∈ Q ∩ Z^{N+1} and the ci nonnegative integers. So to count integer points in C (and hence to determine i(P, n)), we only need to know how many integer points are in Q with each fixed (integer) last coordinate. We call the last coordinate of z ∈ Q the degree of z. Note that for z ∈ Q, deg z = Σ_{i=0}^{N} ri for some ri with 0 ≤ ri < 1, so if deg z is an integer, 0 ≤ deg z ≤ N.
Theorem 7.4.4. Let P be an integral N-simplex in R^N, with vertices v0, v1, . . . , vN, let C = C(P̃), and let Q = {Σ_{i=0}^{N} ri(vi, 1) : for each i, 0 ≤ ri < 1}. Let δj be the number of points of degree j in Q ∩ Z^{N+1}. Then

Σ_{n=0}^{∞} i(P, n)λ^n = (δ0 + δ1λ + · · · + δNλ^N) / (1 − λ)^{N+1}.
Proof. Integer points of degree n in C correspond to decompositions y + Σ ci(vi, 1), where y ∈ Q ∩ Z^{N+1} has degree j and the ci are nonnegative integers with Σci = n − j; the number of choices of (c0, . . . , cN) is (n−j+N choose N). Therefore

Σ_{n=0}^{∞} i(P, n)λ^n = (δ0 + δ1λ + · · · + δNλ^N)(1 + λ + λ² + · · ·)^{N+1}
                       = (δ0 + δ1λ + · · · + δNλ^N) Σ_{k=0}^{∞} (k+N choose N) λ^k,

and the coefficient of λ^n on the right-hand side is Σ_{j=0}^{N} δj (n−j+N choose N), which is exactly i(P, n).
For the interior of P (and of C) we use an analogous construction, but with the opposite half-open parallelepiped: let

Q∗ = {Σ_{i=0}^{N} ri(vi, 1) : 0 < ri ≤ 1 for each i}.
Proposition 7.4.6. Let P be an integral N-simplex in R^N, with vertices v0, v1, . . . , vN, and let C = C(P̃). A point z ∈ Z^{N+1} is an integer point in int C if and only if z = y + Σ_{i=0}^{N} ci(vi, 1) for some y ∈ Q∗ ∩ Z^{N+1} and some nonnegative integers ci. Furthermore, this representation of z is unique.
So to count integer points in int C (and hence to determine i∗ (P, n)), we only need to know how many integer
points are in Q∗ with each fixed (integer) last coordinate. Note that for z ∈ Q∗ , 0 < deg z ≤ N + 1.
Theorem 7.4.7. Let P be an integral N-simplex in R^N, with vertices v0, v1, . . . , vN, let C = C(P̃), and let Q∗ = {Σ_{i=0}^{N} ri(vi, 1) : for each i, 0 < ri ≤ 1}. Let δj∗ be the number of points of degree j in Q∗ ∩ Z^{N+1}. Then

Σ_{n=0}^{∞} i∗(P, n)λ^n = (δ1∗λ + δ2∗λ² + · · · + δ∗_{N+1}λ^{N+1}) / (1 − λ)^{N+1}.
Now the punchline is that there is an easy relationship between the δi and the δi∗. Note that

Q∗ = {Σ_{i=0}^{N} ri(vi, 1) : for each i, 0 < ri ≤ 1}
   = {Σ_{i=0}^{N} (1 − ti)(vi, 1) : for each i, 0 ≤ ti < 1}
   = {Σ_{i=0}^{N} (vi, 1) − Σ_{i=0}^{N} ti(vi, 1) : for each i, 0 ≤ ti < 1}
   = Σ_{i=0}^{N} (vi, 1) − Q = (Σ_{i=0}^{N} vi, N + 1) − Q.

Hence w ↦ (Σvi, N + 1) − w is a bijection from Q∗ ∩ Z^{N+1} to Q ∩ Z^{N+1} sending degree j to degree N + 1 − j, so δj∗ = δ_{N+1−j}. Therefore

F∗(P, λ) := Σ_{n=0}^{∞} i∗(P, n)λ^n = (δNλ + δ_{N−1}λ² + · · · + δ0λ^{N+1}) / (1 − λ)^{N+1}.
Thus, writing F(P, λ) := Σ_{n=0}^{∞} i(P, n)λ^n, we obtain

F∗(P, λ) = (−1)^{N+1} F(P, 1/λ).
So far I have considered only integral simplices. To extend the result to integral polytopes requires triangu-
lation of the polytope, that is, subdivision of the polytope into simplices. The extension is nontrivial. We
cannot just add up the functions i and i∗ for the simplices in the triangulation, since interior points of the
polytope can be contained in the boundary of a simplex of the triangulation, and in fact in the boundary of
more than one simplex of the triangulation. But it works in the end.
Theorem 7.4.10. Let P ⊆ R^N be an integral polytope of dimension N. Then

(1 − λ)^{N+1} Σ_{n=0}^{∞} i(P, n)λ^n

is a polynomial in λ of degree at most N.
As before, write this polynomial as Σ_{j=0}^{N} δjλ^j. What can we say about the coefficients δj?
δ0 = i(P, 0) = 1, since this is the number of integer points in the polytope 0P = {0}.
Consequently i(P, n) = Σ_{j=0}^{N} δj (n−j+N choose N) is a polynomial in n of degree N; let C be its leading coefficient. I claim C is the volume of P. To see this, note that vol(nP) = n^N vol(P) (if P is of full dimension N). Now the volume of nP can be estimated by the number of lattice points in nP, that is, by i(P, n). In fact,

0 = lim_{n→∞} (i(P, n) − vol(nP)) / n^N = lim_{n→∞} i(P, n)/n^N − vol(P).

So C = lim_{n→∞} i(P, n)/n^N = vol(P).
One last comment. The Ehrhart theory can be generalized to rational polytopes. In the more general
case, the functions i(P, n) and i∗ (P, n) need not be polynomials, but are quasipolynomials—restricted to
a congruence class in some modulus (depending on the denominators occurring in the coordinates of the
vertices) they are polynomials. An equivalent description is that the function i(P, n) is a polynomial in n
and expressions of the form gcd(n, k), e.g.,
i(P, n) = (n + 1)²  if n is even,
          n²        if n is odd,

which can be written uniformly as i(P, n) = (n + gcd(n, 2) − 1)².
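This quasipolynomial is realized, for instance, by the rational square P = [−1/2, 1/2]² (this choice of P is ours; the text does not specify a polytope). A quick Python check:

    from math import gcd

    def i_sq(n):
        """Lattice points in n * [-1/2, 1/2]^2."""
        m = sum(1 for x in range(-n, n + 1) if 2 * abs(x) <= n)
        return m * m

    print(all(i_sq(n) == (n + gcd(n, 2) - 1) ** 2 for n in range(13)))  # True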
7.5 Exercises
Exercise 7.1. Prove that the topological boundary of a polyhedron is the union of its proper faces.
Exercise 7.2. Prove that the convex hull of a finite point set X = {x1 , . . . , xn } is the set of convex linear
combinations of it.
Exercise 7.3. Fill in the details in the proof of Theorem 7.1.3 by proving all the assertions of the form
“One must show”, i.e., (7.3), (7.4), (7.6), and (7.8). (You may use Minkowski’s Hyperplane Separation
Theorem, which states that if S ⊆ R^n is a convex set and y ∉ S, then there exists a hyperplane separating S from y — or equivalently a linear functional ℓ : R^n → R such that ℓ(y) > 0 and ℓ(x) < 0 for all x ∈ S.)
Exercise 7.4. Prove Theorem 7.3.4.
Exercise 7.5. Let M be a matroid and let P be its base polytope. Prove that P is a generalized permuta-
hedron in two different ways:
1. Show that the normal fan N(P) coarsens the braid fan, using the properties of greedy algorithms.
2. Show that every edge of P is parallel to some ei − ej , using the properties of basis exchange.
Exercise 7.6. Consider the simplex ∆n defined as the convex hull of the origin together with the n standard basis vectors in R^n. That is,

∆n = {x = (x1, . . . , xn) ∈ R^n : xi ≥ 0 for all i, x1 + · · · + xn ≤ 1}.
Calculate the Ehrhart polynomials i(∆n , k) and i∗ (∆n , k).
Exercise 7.7. The crosspolytope is defined as the convex hull of the standard basis vectors and their negatives in R^n (see Exercise 6.11). Calculate its Ehrhart polynomial.
Chapter 8
Group Representations
Definition 8.1.1. Let G be a group and let k be a field. A representation of G over k is a group homomorphism ρ : G → GL(V), where V = k^n is a finite-dimensional vector space and GL(V) is the group of invertible linear transformations of V; that is, ρ(g)ρ(h) = ρ(gh) for all g, h ∈ G. (That’s matrix multiplication on the left side of the equation, and group multiplication in G on the right.) The number n is called the dimension (sometimes degree) of the representation, written dim ρ.
Some remarks:
• ρ specifies an action of G on V that respects its vector space structure. So we have all the accoutrements
of group actions, such as orbits and stabilizers. If there is only one representation under consideration,
it is often convenient to use group-action notation and write gv instead of the bulkier ρ(g)v.
• It is common to say that ρ is a representation, or that V is a representation, or that the pair (ρ, V ) is
a representation.
• ρ is faithful if it is injective as a group homomorphism.
Example 8.1.2. Let G be any group. The trivial representation is the map ρtriv : G → GL1(k) = k^× sending every g ∈ G to 1. J
Example 8.1.3. Let kG be the vector space of formal k-linear combinations of elements of G: that is, kG = {Σ_{h∈G} ahh : ah ∈ k}. The regular representation of G is the map ρreg : G → GL(kG) defined by

g · (Σ_{h∈G} ahh) = Σ_{h∈G} ah(gh).
That is, g permutes the standard basis vectors of kG according to the group multiplication law. Thus
dim ρreg = |G|. J
The vector space kG is a ring, with multiplication given by multiplication in G and extended k-linearly. In
this context it is called the group algebra of G over k.
Remark 8.1.4. A representation of G is equivalent to a (left) module over the group algebra kG. Tech-
nically “representation” refers to the way G acts and “module” refers to the space on which it acts, but the
two terms really carry the same information.
Example 8.1.5. Let G = Sn , the symmetric group on n elements. The defining representation ρdef
of G on kn maps each permutation σ ∈ G to the n × n permutation matrix with 1’s in the positions (i, σ(i))
for every i ∈ [n], and 0’s elsewhere. J
Example 8.1.6. More generally, let G act on a finite set X. Then there is an associated permutation
representation on kX , the vector space with basis X, given by
g · (Σ_{x∈X} axx) = Σ_{x∈X} ax(gx).
For short, we might specify the action of G on X and say that it “extends linearly” to kX . For instance, the
action of G on itself by left multiplication gives rise in this way to the regular representation, and the usual
action of Sn on an n-element set gives rise to the defining representation. J
Example 8.1.7. Let G = Z/nZ be the cyclic group of order n, and let ζ ∈ k be an nth root of unity (not necessarily primitive). Then G has a 1-dimensional representation given by ρ(x) = ζ^x. J
Example 8.1.8. Consider the dihedral group Dn of order 2n, i.e., the group of symmetries of a regular n-gon, given in terms of generators and relations by

Dn = ⟨r, s : r^n = s² = 1, srs = r^{−1}⟩.

Here are several of its representations.

1. The geometric representation ρgeo sends r to rotation by 2π/n and s to a reflection (the matrices appear in Example 8.4.3 below), extended by group multiplication (e.g., ρgeo(sr²) = ρgeo(s)ρgeo(r)², etc.).
2. Regarding Dn as the symmetries of an n-gon gives permutation representations ρV and ρE on vertices
and edges respectively. These are both faithful n-dimensional representations. (Are they isomorphic
to each other? What does “isomorphic” mean in this context?)
3. The n-gon has n diameters (lines of reflection symmetry). The dihedral group acts on diameters and
thus gives rise to another n-dimensional permutation representation. This representation is faithful if
and only if n is odd. If n is even, then rn/2 acts by rotation by 180◦ and fixes all diameters. J
Example 8.1.9. The symmetric group Sn has another 1-dimensional representation, the sign represen-
tation, given by (
1 if σ is even,
ρsign (σ) =
−1 if σ is odd.
This representation is nontrivial provided char k ≠ 2. Note that ρsign(g) = det ρdef(g) (see Example 8.1.5). (More generally, if ρ is any representation, then det ρ is a 1-dimensional representation.) J
Example 8.1.10. Let (ρ, V ) and (ρ0 , V 0 ) be representations of G. The direct sum ρ⊕ρ0 : G → GL(V ⊕V 0 )
is defined by
(ρ ⊕ ρ0 )(g)(v + v 0 ) = ρ(g)(v) + ρ0 (g)(v 0 )
139
for v ∈ V, v′ ∈ V′. In terms of matrices, (ρ ⊕ ρ′)(g) is the block-diagonal matrix

[ ρ(g)    0   ]
[   0   ρ′(g) ].
This construction looks superficially similar to Example 8.1.10 but really is different, hence the different
notation. For the most part, we will be focusing on representations of a single group at a time. J
When are two representations the same? More generally, what is a map between representations? A homomorphism of representations (ρ, V) and (ρ′, W) of G, also called a G-equivariant map, is a linear map φ : V → W such that φ(ρ(g)v) = ρ′(g)φ(v) for all g ∈ G and v ∈ V; an isomorphism is an invertible homomorphism (Definition 8.2.1).
Example 8.2.2. Trivial(-ish) examples: The identity map V → V is an automorphism for any representation
V , and the zero map V → W is a homomorphism for any V, W . J
Example 8.2.3. Let G = Sn act on kn by the defining representation, and on k by the trivial representation.
The map kn → k given by
φ(Σ_{i=1}^{n} aiei) = Σ_{i=1}^{n} ai
is G-equivariant because permuting the coordinates of a vector does not change their sum. J
Example 8.2.4. Let n be odd, and consider the dihedral group Dn acting on a regular n-gon (see Exam-
ple 8.1.8). Label the vertices 1, . . . , n in cyclic order. Label each edge the same as its opposite vertex, as in
the figure on the left. Then the permutation action ρV on vertices is identical to the action ρE on edges.
In other words, the diagram on the right commutes for all g ∈ Dn , where “opp” is the map that sends the
basis vector for a vertex to the basis vector for its opposite edge.
[Figure: a regular pentagon with vertices 1, . . . , 5, each edge labeled the same as its opposite vertex, together with the commutative diagram]

k^n --opp--> k^n
ρV(g) ↓          ↓ ρE(g)
k^n --opp--> k^n
The case that n is even is trickier, because then each reflection either fixes two vertices or two edges, but
not both. J
Example 8.2.5. Let v1, . . . , vn be the points of a regular n-gon in R² centered at the origin, e.g., vj = (cos(2πj/n), sin(2πj/n)). Then the map R^n → R² sending the jth standard basis vector to vj is Dn-equivariant (where Dn acts on R^n by the vertex permutation representation ρV and on R² by ρgeo). J

Example 8.2.6. With ρV and ρE the permutation representations of Dn on vertices and edges as above, let φ : E → V be the map sending each edge to the sum of its endpoints,

φ(eij) = vi + vj,

so that ρV(g) ∘ φ = φ ∘ ρE(g) for all g, i.e., the following diagram commutes:

E --φ--> V
ρE(g) ↓      ↓ ρV(g)
E --φ--> V

J
Kernels and images of G-equivariant maps are well-behaved. (Those familiar with modules will not be surprised: every kernel or image of an R-module homomorphism is also an R-module.)

Proposition 8.2.7. Any G-equivariant map φ : (ρ, V) → (ρ′, V′) has G-invariant kernel and G-invariant image.
On the other hand, the representation σ = ρtriv ⊕ ρsign on V (see Examples 8.1.2 and 8.1.9) is given by
σ(id) = [[1, 0], [0, 1]],   σ(flip) = [[1, 0], [0, −1]].
These two representations ρ and σ are in fact isomorphic. Indeed, ρ acts trivially on khe1 + e2 i and acts
by the sign representation on khe1 − e2 i. These two vectors form a basis of V (here is where we use the
assumption char k 6= 2), and one can check that the change-of-basis map
φ = [[1, 1], [1, −1]]^{−1} = [[1/2, 1/2], [1/2, −1/2]]

is an isomorphism ρ → σ. J
Example 8.2.9. Let G = S2 and let each of ρ, ρ′ be either ρtriv or ρsign, so that V = V′ = k and every linear transformation φ : k → k has the form φ(v) = cv for some c ∈ k. If ρ = ρ′, then any such φ will do. Thus the set of G-equivariant homomorphisms is actually isomorphic to k.
Assume char k ≠ 2 (otherwise ρtriv = ρsign and we are done at this point). If φ : ρtriv → ρsign is G-equivariant, then we have diagrams (for g = id and g = (1 2) respectively)

V --φ--> V′                          V --φ--> V′
ρtriv(id) ↓      ↓ ρsign(id)         ρtriv((1 2)) ↓      ↓ ρsign((1 2))
V --φ--> V′                          V --φ--> V′

The first diagram always commutes because ρtriv(id) = ρsign(id) is the identity map, but the second diagram says that for every v ∈ k, φ(v) = −φ(v), i.e., c = −c,
and since char k ≠ 2 we are forced to conclude that c = 0. Therefore, there is no nontrivial G-homomorphism
ρtriv → ρsign . J
Example 8.2.9 is the tip of an iceberg: we can use the vector space HomG (ρ, ρ0 ) of G-homomorphisms
φ : ρ → ρ0 to measure how similar ρ and ρ0 are.
A subspace W ⊆ V is G-invariant if ρ(g)W ⊆ W for all g ∈ G.
• V is decomposable if it can be written as a direct sum of two nonzero G-invariant subspaces, and indecomposable otherwise.
• V is irreducible (or simple, or colloquially an irrep) if it has no G-invariant subspace other than 0 and V.
• A representation that can be decomposed into a direct sum of irreps is called semisimple or com-
pletely reducible. A semisimple representation is determined up to isomorphism by the multiplicity
with which each isomorphism type of irrep appears.
Clearly, every representation can be written as the direct sum of indecomposable representations, and every
irreducible representation is indecomposable. On the other hand, there exist indecomposable representations
that are not irreducible.
Example 8.3.2. As in Example 8.2.8, let V = k² with standard basis {e1, e2}. Recall that the defining representation of S2 = {id, flip} is given by

ρdef(id) = [[1, 0], [0, 1]],   ρdef(flip) = [[0, 1], [1, 0]].

If char k = 2, then the only nonzero proper G-invariant subspace of V is the span of e1 + e2, which therefore has no G-invariant complement: in characteristic 2 the representation ρdef is indecomposable but not irreducible. J
Fortunately, we can rule out this kind of pathology most of the time.
Theorem 8.3.3 (Maschke’s Theorem). Let G be a finite group, let k be a field whose characteristic does not
divide |G|, and let (ρ, V ) be a representation of G over k. Then every G-invariant subspace has a G-invariant
complement. In particular, (ρ, V ) is semisimple.
Proof. If ρ is irreducible, then there is nothing to prove. Otherwise, let W be a G-invariant subspace, and
let π : V → W be a projection, i.e., a linear surjection that fixes the elements of W pointwise. (Such a
map π can be constructed as follows: choose a basis for W , extend it to a basis for V , and let π fix all the
basis elements in W and kill all the ones in V \ W .)
The map π is k-linear, but not necessarily G-equivariant. However, we can turn π into a G-equivariant
projection by “averaging over G”. (This trick will come up again and again.) Define π̃ : V → W by
π̃(v) = (1/|G|) Σ_{g∈G} g π(g^{−1}v).    (8.2)

Claim 1: π̃ is a projection of V onto W. Indeed, each summand gπ(g^{−1}v) lies in gW = W (since W is G-invariant), so π̃(v) ∈ W; and if w ∈ W, then π(g^{−1}w) = g^{−1}w for all g, so π̃(w) = w.
Claim 2: π̃ is G-equivariant. Indeed, for any h ∈ G,

π̃(hv) = (1/|G|) Σ_{g∈G} g π(g^{−1}hv)
       = (1/|G|) Σ_{k∈G} (hk) π((hk)^{−1}hv)    (substituting g = hk)
       = h · (1/|G|) Σ_{k∈G} k π(k^{−1}v) = h π̃(v).

Thus π̃ is a G-equivariant projection of V onto W, so by Proposition 8.2.7 its kernel is a G-invariant complement of W. The last assertion (semisimplicity) follows by induction on dim V.
Maschke’s Theorem implies that, when the conditions hold, a representation ρ is determined up to iso-
morphism by the multiplicity of each irreducible representation in ρ (i.e., the number of isomorphic copies
appearing as direct summands of ρ). Accordingly, to understand representations of G, we should first study
irreps.
Example 8.3.4. Let k have characteristic 0 (for simplicity), and G = Sn. The defining representation of G on k^n is not simple, because it has an invariant subspace, namely the span L of the all-1's vector: a 1-dimensional subspace that is fixed pointwise by every σ ∈ Sn and therefore carries the trivial representation.¹ By Maschke's theorem, L has a G-invariant complement. In fact, we can take it to be L^⊥, the orthogonal complement of L under the standard inner product on k^n, namely the space of all vectors whose coordinates sum to 0. This is called (a little confusingly) the standard representation of Sn, denoted ρstd. That is, ρstd acts on

L^⊥ = {(x1, . . . , xn) ∈ k^n : x1 + · · · + xn = 0}.

Thus dim ρstd = n − 1. We will soon be able to prove that ρstd is irreducible (Exercise 8.2). J
8.4 Characters
The first miracle of representation theory is that we can detect the isomorphism type of a representation ρ
without knowing every coordinate of every matrix ρ(g): it turns out that all we need to know is the traces
of the ρ(g).
Definition 8.4.1. Let (ρ, V ) be a representation of G over k. Its character is the function χρ : G → k
given by
χρ (g) = tr ρ(g).
Note that characters are in general not group homomorphisms.
Example 8.4.2. Some simple facts and some characters we’ve seen before:
• A one-dimensional representation is its own character. (In fact these are exactly the characters that
are homomorphisms.)
• For any representation ρ, we have χρ (IdG ) = dim ρ, because ρ(IdG ) is the n × n identity matrix.
1 For the same reason, every permutation representation of every group has a trivial summand.
• The defining representation ρdef of Sn has character χdef(σ) = #{i ∈ [n] : σ(i) = i}, the number of fixed points of σ.

J
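A quick check of the last fact over S3 (a sketch; the values 3, 1, 1, 1, 0, 0 match the classes C111, C21, C3):

    from itertools import permutations

    def chi_def(sigma):
        """Trace of the permutation matrix = number of fixed points."""
        return sum(1 for i, j in enumerate(sigma) if i == j)

    print(sorted(chi_def(s) for s in permutations(range(3))))  # [0, 0, 1, 1, 1, 3]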
Example 8.4.3. Consider the geometric representation ρgeo of the dihedral group Dn = ⟨r, s : r^n = s² = 1, srs = r^{−1}⟩ by rotations and reflections:

ρgeo(s) = [[1, 0], [0, −1]],   ρgeo(r) = [[cos θ, sin θ], [−sin θ, cos θ]],

where θ = 2π/n. Every reflection matrix has trace 0, so χρgeo(r^j) = 2 cos(jθ) and χρgeo(sr^j) = 0.
On the other hand, if ρ′ is the n-dimensional permutation representation on the vertices, then

χρ′(g) = n  if g = 1,
         0  if g is a nontrivial rotation,
         1  if n is odd and g is a reflection,
         0  if n is even and g is a reflection through two edges,
         2  if n is even and g is a reflection through two vertices.
J
Proposition 8.4.4. Characters are class functions; that is, they are constant on conjugacy classes of G.
Moreover, if ρ ≅ ρ′, then χρ = χρ′.

Proof. The first assertion holds because χρ(hgh^{−1}) = tr(ρ(h)ρ(g)ρ(h)^{−1}) = tr ρ(g), since trace is invariant under conjugation of matrices.
Now, let φ : ρ → ρ′ be an isomorphism represented by an invertible matrix Φ; then Φρ(g) = ρ′(g)Φ for all g ∈ G and so Φρ(g)Φ^{−1} = ρ′(g). Taking traces gives χρ = χρ′.
Surprisingly, it turns out that the converse of the second assertion is also true: a representation is determined up to isomorphism by its character! In fact, much, much more is true.
8.5 New representations and characters from old
The basic vector space functors of direct sum, duality, tensor product and Hom carry over naturally to
representations, and behave well on their characters. Throughout this section, let (ρ, V ) and (ρ0 , W ) be
finite-dimensional representations of G over C, with V ∩ W = 0.
1. Direct sum. To construct a basis for V ⊕ W, we can take the union of a basis for V and a basis for W. Equivalently, we can write the vectors in V ⊕ W as column block vectors:

V ⊕ W = { [v; v′] : v ∈ V, v′ ∈ W }.

The matrices are block-diagonal, so χρ⊕ρ′ = χρ + χρ′.

2. Duality. The dual space V∗ = HomC(V, C) carries a representation ρ∗, the dual representation, in which h ∈ G acts on φ ∈ V∗ by

(hφ)(v) = φ(h^{−1}v)

(the inverse makes this a left action). Its character is the complex conjugate of the original: χρ∗(h) = \overline{χρ(h)}.
Proof. Let J be the Jordan canonical form of ρ(h) (which exists since we are working over C), so that χρ(h) = tr J. The diagonal entries Jii are its eigenvalues, which must be roots of unity since h has finite order, so their inverses are their complex conjugates. Meanwhile, ρ∗(h) is represented by the transpose of ρ(h)^{−1}, whose trace is tr J^{−1}; and J^{−1} is an upper-triangular matrix with (J^{−1})ii = (Jii)^{−1} = \overline{Jii}, so χρ∗(h) = tr J^{−1} = \overline{χρ(h)}.
3. Tensor product. Fix bases {e1 , . . . , en } and {f1 , . . . , fm } for V and W respectively. As a vector space,
we define²

V ⊗ W = k⟨ei ⊗ fj : 1 ≤ i ≤ n, 1 ≤ j ≤ m⟩,

equipped with a multilinear action of k (that is, c(x ⊗ y) = cx ⊗ y = x ⊗ cy for c ∈ k). In particular,
dim(V ⊗ W) = (dim V)(dim W). We can accordingly define a representation (ρ ⊗ ρ′, V ⊗ W) by

(ρ ⊗ ρ′)(h)(v ⊗ v′) = ρ(h)(v) ⊗ ρ′(h)(v′),

or more concisely h · (v ⊗ v′) = hv ⊗ hv′, extended bilinearly to all of V ⊗ W.
² The “official” definition of the tensor product is much more functorial and can be made basis-free, but this concrete definition will suffice for our purposes.
In terms of matrices, (ρ ⊗ ρ′)(h) is represented by the nm × nm matrix ρ(h) ⊗ ρ′(h), defined by

(ρ(h) ⊗ ρ′(h))_{(i,j),(k,ℓ)} = (ρ(h))_{i,k} (ρ′(h))_{j,ℓ}.
To explain this alphabet soup, the left-hand side is the entry in the row corresponding to ei ⊗ fj and column
corresponding to ek ⊗ f` . In particular,
χρ⊗ρ′(h) = Σ_{(i,j)∈[n]×[m]} (ρ(h))i,i (ρ′(h))j,j = (Σ_{i=1}^{n} (ρ(h))i,i)(Σ_{j=1}^{m} (ρ′(h))j,j) = χρ(h)χρ′(h).    (8.4)
4. Hom. First, the vector space HomC(V, W) admits a representation of G, in which h ∈ G acts on a linear transformation φ : V → W by sending it to the map h · φ defined by

(h · φ)(v) = h(φ(h^{−1}v)) = ρ′(h)(φ(ρ(h^{−1})(v)))    (8.5)
for h ∈ G, φ ∈ HomC (V, W ), v ∈ V . It is straightforward to verify that this is a genuine group action, i.e.,
that (hh0 ) · φ = h · (h0 · φ).
Moreover, HomC(V, W) ≅ V∗ ⊗ W as vector spaces and G-modules. To see this, suppose that dim V = n and dim W = m; then the elements of V∗ and W can be regarded as 1 × n and m × 1 matrices respectively (the former acting on V, which consists of n × 1 matrices, by matrix multiplication). Then the previous description of tensor product implies that V∗ ⊗ W consists of m × n matrices, which correspond to elements of HomC(V, W). This isomorphism is G-equivariant by (8.5), so

χHom(ρ,ρ′)(h) = χρ∗⊗ρ′(h) = \overline{χρ(h)} χρ′(h).    (8.6)
What about HomG(V, W)? Evidently HomG(V, W) ⊆ HomC(V, W), but equality need not hold. For example, if V and W are the trivial and sign representations of Sn (for n ≥ 2), then HomC(V, W) ≅ C but HomG(V, W) = 0. (See Example 8.2.9.)
The two Homs are related as follows. In general, when a group G acts on a vector space V , the subspace
of G-invariants is defined as
V G = {v ∈ V : hv = v ∀h ∈ G}.
This is the largest subspace of V that carries the trivial action.
Observe that a linear map φ : V → W is G-equivariant if and only if hφ = φ for all h ∈ G, where G acts on
HomC (V, W ) as above. (The proof of this fact is left to the reader; it is nearly immediate from the definition
of that action.) That is,
HomG (V, W ) = HomC (V, W )G . (8.7)
Moreover, G acts by the identity on HomG (ρ, ρ0 ), so its character is a constant function whose value is
dimC HomG (ρ, ρ0 ). We want to understand this quantity.
8.6 The Fundamental Theorem of Character Theory
From now on, we assume that k = C (though everything would be true over an algebraically closed field of
characteristic 0), unless otherwise specified.
Recall that a class function is a function χ : G → C that is constant on conjugacy classes of G. Define an
inner product on the vector space C`(G) of C-valued class functions by
⟨χ, ψ⟩G = (1/|G|) Σ_{h∈G} \overline{χ(h)} ψ(h) = (1/|G|) Σ_C |C| \overline{χ(C)} ψ(C)    (8.8)
where C runs over all conjugacy classes. Observe that h·, ·iG is a sesquilinear form (i.e., C-linear in the
second term and conjugate linear in the first). It is also non-degenerate, because the indicator functions of
conjugacy classes form an orthogonal basis for C`(G). Analysts might want to regard the inner product as
a convolution (with summation over G as a discrete analogue of integration).
Proposition 8.6.1. Let (ρ, V ) be a representation of G. Then
dimC V^G = (1/|G|) Σ_{h∈G} χρ(h) = ⟨χtriv, χρ⟩G.
Proof. The second equality follows from the definition of the inner product. For the first equality, define a
linear map π : V → V by
π = (1/|G|) Σ_{h∈G} ρ(h).
Note that π(v) ∈ V^G for all v ∈ V, because

gπ(v) = (1/|G|) Σ_{h∈G} ghv = (1/|G|) Σ_{h′∈G} h′v = π(v)

(reindexing h′ = gh); moreover, π(v) = v for v ∈ V^G. That is, π is a projection from V onto V^G. Choose a basis for V consisting of a basis for V^G, extended to a basis for all of V. With respect to that basis, π can be represented by the block matrix

[ I  ∗ ]
[ 0  0 ]
so that

dimC V^G = tr(π) = (1/|G|) Σ_{h∈G} χρ(h).
By the way, we know by Maschke’s Theorem that V is semisimple, so we can decompose it as a direct sum
of irreps. Then V G is precisely the direct sum of the irreducible summands on which G acts trivially.
Example 8.6.2. Let G act on a set X, and let ρ be the corresponding permutation representation on the space CX. For each orbit O ⊆ X, the vector vO = Σ_{x∈O} x is fixed by G. On the other hand, any vector Σ_{x∈X} axx fixed by G must have ax constant on each orbit. Therefore the vectors vO form a basis for V^G, and dim V^G is the number of orbits. So Proposition 8.6.1 becomes

#orbits = (1/|G|) Σ_{h∈G} #fixed points of h,

which is precisely Burnside's lemma. J
Combining Proposition 8.6.1 with (8.6) and (8.7) yields:

Proposition 8.6.3. dimC HomG(ρ, ρ′) = ⟨χρ, χρ′⟩G.

One intriguing observation is that this expression is nearly symmetric in ρ and ρ′: ⟨α, β⟩G = \overline{⟨β, α⟩G} in general, while dimC HomG(ρ, ρ′) is real. (It is not algebraically obvious that HomG(ρ, ρ′) and HomG(ρ′, ρ) should have equal dimension.)
Proposition 8.6.4 (Schur’s Lemma). Let G be a group, and let (ρ, V) and (ρ′, V′) be finite-dimensional irreps of G over a field k (not necessarily of characteristic 0).

1. Every G-equivariant map φ : V → V′ is either zero or an isomorphism.
2. If in addition k is algebraically closed, then every G-equivariant map φ : V → V is multiplication by some scalar λ ∈ k; consequently, HomG(ρ, ρ′) ≅ k if ρ ≅ ρ′, and HomG(ρ, ρ′) = 0 otherwise.
Proof. For (1), recall from Proposition 8.2.7 that ker φ and im φ are G-invariant subspaces. But since ρ, ρ′ are simple, there are not many possibilities. Either ker φ = 0 and im φ = V′, in which case φ is an isomorphism; otherwise, ker φ = V or im φ = 0, either of which implies that φ = 0.
For (2), the “otherwise” case follows from (1), so suppose V ≅ V′; for simplicity, assume V = V′.
Since k is algebraically closed, every G-equivariant map φ : V → V has an eigenvalue λ. Then φ − λI is
G-equivariant and singular, hence zero by (1). So φ = λI is multiplication by λ.
We can now prove the following omnibus theorem, which essentially reduces the study of representations of
finite groups to the study of characters.
Theorem 8.6.5 (Fundamental Theorem of Character Theory for Finite Groups). Let (ρ, V ) and
(ρ0 , V 0 ) be finite-dimensional representations of G over C.
1. If ρ and ρ′ are irreducible, then

⟨χρ, χρ′⟩G = 1 if ρ ≅ ρ′, and 0 otherwise.    (8.9)

2. If ρ1, . . . , ρn are distinct irreducible representations and

ρ = (ρ1 ⊕ · · · ⊕ ρ1) ⊕ · · · ⊕ (ρn ⊕ · · · ⊕ ρn) = ⊕_{i=1}^{n} ρi^{⊕mi}    (mi copies of each ρi),

then

⟨χρ, χρi⟩G = mi    and    ⟨χρ, χρ⟩G = Σ_{i=1}^{n} mi².
3. Every finite-dimensional representation of G is determined up to isomorphism by its character.

4. Each irrep ρi appears in the regular representation with multiplicity dim ρi, so that ρreg ≅ ⊕_{i=1}^{n} ρi^{⊕ dim ρi}, and consequently

Σ_{i=1}^{n} (dim ρi)² = |G|.    (8.10)
5. The irreducible characters (i.e., characters of irreps) form an orthonormal basis for C`(G). In partic-
ular, the number of irreducible characters equals the number of conjugacy classes of G.
Proof. For assertion (1), the equation (8.9) follows from part (2) of Schur’s Lemma together with Proposi-
tion 8.6.3. It follows that the characters of isomorphism classes of irreps are an orthonormal basis for some
subspace of the finite-dimensional space C`(G), so there can be only finitely many of them. (This result
continues to amaze me every time I think about it.)
Assertion (2) follows because the inner product is additive on direct sums. That is, every class function ψ
satisfies
hχρ⊕ρ0 , ψiG = hχρ + χρ0 , ψiG = hχρ , ψiG + hχρ0 , ψiG .
For (3), Maschke’s Theorem says that every complex representation ρ can be written as a direct sum of
irreducibles. Their multiplicities determine ρ up to isomorphism, and can be recovered from χρ by (2).
(Again, amazing.)
For (4), recall that χreg(IdG) = |G| and χreg(g) = 0 for g ≠ IdG. Therefore

⟨χreg, χρi⟩G = (1/|G|) Σ_{g∈G} \overline{χreg(g)} χρi(g) = (1/|G|) · |G| · χρi(IdG) = dim ρi,

so by (2) each ρi appears in ρreg with multiplicity dim ρi; evaluating characters at IdG then yields (8.10).
For (5), the irreducible characters are orthonormal (hence linearly independent in C`(G)), by Schur’s Lemma
together with assertion (3). The trickier part is to show that they in fact span C`(G). Let Y be the subspace
of C`(G) spanned by the irreducible characters, and let Z = Y^⊥ be its orthogonal complement in C`(G); the claim is that Z = 0.
Let φ ∈ Z. For any representation (ρ, V), define a map Tρ = Tρ,φ : V → V by

Tρ = (1/|G|) Σ_{g∈G} \overline{φ(g)} ρ(g),

or equivalently

Tρ(v) = (1/|G|) Σ_{g∈G} \overline{φ(g)} gv
(to parse this, note that φ(g) is a number). Our plan is to show that Tρ is the zero map (in disguise), then
deduce that φ = 0.
Claim 1: Tρ is G-equivariant. Indeed,

Tρ(hv) = (1/|G|) Σ_{g∈G} \overline{φ(g)} ghv
       = (1/|G|) Σ_{g∈G} h \overline{φ(g)} h^{−1}ghv
       = h · (1/|G|) Σ_{k∈G} \overline{φ(hkh^{−1})} kv    (setting k = h^{−1}gh, so g = hkh^{−1})
       = h · (1/|G|) Σ_{k∈G} \overline{φ(k)} kv    (because φ ∈ C`(G))
       = hTρ(v).
Claim 2: Tρ = 0 if ρ is irreducible. Now that we know that Tρ is G-equivariant, Schur’s Lemma implies that
it is multiplication by a scalar. On the other hand
tr(Tρ) = (1/|G|) Σ_{g∈G} \overline{φ(g)} χρ(g) = ⟨φ, χρ⟩G = 0
because φ ∈ Z = Y ⊥ . Since Tρ has trace zero and is multiplication by a scalar, it is the zero map.3
Claim 3: Tρ = 0 in all cases. By Maschke’s Theorem, every complex representation ρ is semisimple, and the
definition of Tρ implies that it is additive on direct sums (that is, Tρ⊕ρ0 = Tρ + Tρ0 ), proving the claim.
In particular, taking ρ to be the regular representation gives

0 = Tρreg = (1/|G|) Σ_{g∈G} \overline{φ(g)} ρreg(g).

This is an equation in the vector space of |G| × |G| matrices. Observe that the permutation matrices ρreg(g) have disjoint supports (the only group element that maps h to k is kh^{−1}), hence are linearly independent. Therefore φ(g) = 0 for all g, so φ is the zero map.
We have now shown that Y has trivial orthogonal complement as a subspace of C`(G), so Y = C`(G),
completing the proof.
Corollary 8.6.6. Let ρ1, . . . , ρn be the irreps of G, and let α = ⊕_{i=1}^{n} ρi^{⊕ai} and β = ⊕_{i=1}^{n} ρi^{⊕bi} be two representations. Then dim HomG(α, β) = Σ_{i=1}^{n} ai bi, and in particular HomG(α, β) is a direct sum of constant maps between irreducible summands of α and β.
3 This inference would not be valid in positive characteristic!
The irreducible characters of G can thus be arranged in a square matrix X, with rows corresponding to irreducible characters and columns corresponding to conjugacy classes. (It is helpful to include the size of each conjugacy class in the table, for ease of computing scalar products.) By orthonormality of characters, the matrix X is close to being unitary: XDX∗ = I, where the star denotes conjugate transpose and D is the diagonal matrix with entries |C|/|G|. It follows that X∗X = D^{−1}, which is equivalent to the following statement:
Proposition 8.6.7. Let χ1 , . . . , χn be the irreducible characters of G and let C, C 0 be any two conjugacy
classes. Then
n
χi (C)χi (C 0 ) = |G| δC,C 0 .
X
i=1
|C|
This is a convenient tool because it says that different columns of the character table are orthogonal under
the usual scalar product (without having to correct for the size of conjugacy classes).
Theorem 8.6.5 provides the basic tools to calculate character tables. In general, the character table of a
finite group G with k conjugacy classes is a k × k table in which rows correspond to irreducible characters
χ1 . . . . , χk and columns to conjugacy classes. Part (1) of the Theorem says that the rows form an orthonormal
basis under the inner product on class functions, so computing a character table resembles a Gram-Schmidt
process. The hard part is coming up with enough representations whose characters span C`(G). Here are
some ways of generating them:
• Every group carries the trivial and regular characters, which are easy to write down. The regular
character contains at least one copy of every irreducible character.
• The symmetric group also has the sign and defining characters.
• Many groups come with natural permutation actions whose characters can be added to the mix.
• The operations of duality and tensor product can be used to come up with new characters. Duality
preserves irreducibility, but tensor product typically does not.
In the following examples, we will notate a character χ by a bracketed list of its values on conjugacy classes, in
the same order that they are listed in the table. Numerical subscripts will always be reserved for irreducible
characters.
Example 8.7.1. The group G = S3 has three conjugacy classes, determined by cycle shapes: the identity (class C111, of size 1), the transpositions (C21, size 3), and the 3-cycles (C3, size 2).
We already know two irreducible 1-dimensional characters of S3 , namely the trivial character and sign
characters. Also, we always have the regular character χreg = [6, 0, 0]. So we begin with the following
character table:
Size 1 3 2
Conj. class C111 C21 C3
χ1 = χtriv 1 1 1
χ2 = χsign 1 −1 1
χreg 6 0 0
Equation (8.10) says that S3 has three irreps, the squares of whose dimensions add up to |S3| = 6. So we are looking for one more irreducible character χother, of dimension 2. By part (4) of Theorem 8.6.5, we have

χreg = χtriv + χsign + 2χother,

from which it is easy to obtain

χother = [2, 0, −1].
One can check that χother is irreducible by confirming that its scalar product with itself is 1. By the way,
the defining representation of S3 is χdef = χtriv ⊕ χother . J
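Scalar products like the one just mentioned are quick to compute from the table. A minimal Python helper (ours), weighting by the class sizes as in (8.8):

    def inner(chi, psi, sizes, order):
        """<chi, psi>_G per (8.8), with conjugation on the first argument."""
        return sum(s * complex(c).conjugate() * p
                   for s, c, p in zip(sizes, chi, psi)) / order

    sizes = [1, 3, 2]                 # |C111|, |C21|, |C3| in S3
    chi_other = [2, 0, -1]
    print(inner(chi_other, chi_other, sizes, 6).real)   # 1.0, so irreducible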
Example 8.7.2. We calculate all the irreducible characters of S4 . There are five conjugacy classes, cor-
responding to the cycle-shapes 1111, 211, 22, 31, and 4. The squares of their dimensions must add up to
|S4 | = 24; the only list of five positive integers with that property is 1, 1, 2, 3, 3.
Size 1 6 3 8 6
Conj. class C1111 C211 C22 C31 C4
χ1 = χtriv 1 1 1 1 1
χ2 = χsign 1 −1 1 1 −1
χdef 4 2 0 1 0
χreg 24 0 0 0 0
Of course χtriv and χsign are irreducible, since they are 1-dimensional. On the other hand, χdef can't be irreducible because S4 doesn't have a 4-dimensional irrep. Indeed,

⟨χdef, χdef⟩G = 2,

which means that ρdef must be a direct sum of two distinct irreps. (If it were the direct sum of two copies of the unique 2-dimensional irrep, then ⟨χdef, χdef⟩G would be 4, not 2, by part (2) of Theorem 8.6.5.) We calculate

⟨χdef, χtriv⟩G = 1,    ⟨χdef, χsign⟩G = 0.
Therefore χ3 = χdef − χtriv is an irreducible character. Tensoring with a one-dimensional character preserves irreducibility, so χ4 = χ3 ⊗ χsign = [3, −1, −1, 0, 1] is irreducible as well.
The other irreducible character χ5 has dimension 2. We can calculate it from the regular character and the
other four irreducibles, because
χreg = (χ1 + χ2 ) + 3(χ3 + χ4 ) + 2χ5
and so

χ5 = (χreg − χ1 − χ2 − 3χ3 − 3χ4) / 2,
so the complete character table of S4 is as follows.
Size 1 6 3 8 6
Conj. class C1111 C211 C22 C31 C4
χ1 1 1 1 1 1
χ2 1 −1 1 1 −1
χ3 3 1 −1 0 −1
χ4 3 −1 −1 0 1
χ5 2 0 2 −1 0
J
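As a sanity check on the table (a Python sketch; the characters here are real, so no conjugation is needed), the rows are orthonormal under (8.8):

    sizes = [1, 6, 3, 8, 6]           # class sizes in S4; |G| = 24
    X = [[1,  1,  1,  1,  1],
         [1, -1,  1,  1, -1],
         [3,  1, -1,  0, -1],
         [3, -1, -1,  0,  1],
         [2,  0,  2, -1,  0]]

    def inner(chi, psi):
        return sum(s * c * p for s, c, p in zip(sizes, chi, psi)) / 24

    print(all(inner(X[i], X[j]) == (i == j)
              for i in range(5) for j in range(5)))   # True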
8.8 One-dimensional characters
A one-dimensional character of G is identical with the representation it comes from: a group homomorphism
G → C×. Since tensor product is multiplicative on dimension, it follows that the tensor product of two one-dimensional characters is also one-dimensional. In fact (χ ⊗ χ′)(g) = χ(g)χ′(g) (this is immediate from the definition of tensor product) and χ ⊗ χ∗ = χtriv. So the set Ch(G) of one-dimensional characters, i.e., Ch(G) = Hom(G, C×),
is an abelian group under tensor product (equivalently, pointwise multiplication), with identity χtriv .
Definition 8.8.1. The commutator of two elements a, b ∈ G is the element [a, b] = aba−1 b−1 . The (normal)
subgroup of G generated by all commutators is called the commutator subgroup, denoted [G, G]. The
quotient Gab = G/[G, G] is the abelianization of G.
The abelianization can be regarded as the group obtained by forcing all elements G to commute, in addition
to whatever relations already exist in G; in other words, it is the largest abelian quotient of G. It is
routine to check that [G, G] is indeed normal in G, and also that χ([a, b]) = 1 for all χ ∈ Ch(G) and a, b ∈ G. (In fact, this condition characterizes the elements of the commutator subgroup, as will be shown soon.) Therefore, when studying one-dimensional characters of G, we may as well assume G is abelian (i.e., Ch(G) ≅ Ch(Gab)).
Accordingly, let G be an abelian group of finite order n. The conjugacy classes of G are all singleton sets (since ghg^{−1} = h for all g, h ∈ G), so there are n distinct irreducible representations of G. By (8.10) of Theorem 8.6.5, the squares of their dimensions sum to n, so in fact every irreducible character is 1-dimensional (and every representation of G is a direct sum of 1-dimensional representations). We have now reduced the problem to describing the group homomorphisms G → C×.
The simplest case is that G = Z/nZ is cyclic. Write G multiplicatively, and let g be a generator. Then each
χ ∈ Ch(G) is determined by its value on g, which must be some nth root of unity. There are n possibilities
for χ(g), so all the irreducible characters of G arise in this way, and in fact form a group isomorphic to Z/nZ,
generated by any character that maps g to a primitive nth root of unity. So Hom(G, C×) ≅ G (although this isomorphism is not canonical).
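Numerically, these characters and their orthonormality look as follows (a sketch; floating-point, so we compare up to a small tolerance):

    import cmath

    n = 5
    def chi(m):
        """Character of Z/nZ sending the generator to exp(2*pi*i*m/n)."""
        return [cmath.exp(2j * cmath.pi * m * k / n) for k in range(n)]

    def inner(c1, c2):
        return sum(a.conjugate() * b for a, b in zip(c1, c2)) / n

    print(abs(inner(chi(1), chi(1)) - 1) < 1e-9)   # True: <chi_1, chi_1> = 1
    print(abs(inner(chi(1), chi(2))) < 1e-9)       # True: <chi_1, chi_2> = 0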
Now we consider the general case. Every abelian group G can be written as
G ≅ ∏_{i=1}^{r} Z/niZ.
Let gi be a generator of the ith factor, and let ζi be a primitive (ni)th root of unity. Then each character χ is determined by the numbers j1, . . . , jr, where ji ∈ Z/niZ and χ(gi) = ζi^{ji} for all i. Thus Hom(G, C×) ≅ G, an isomorphism known as Pontryagin duality. More generally, for any finite group G (not necessarily abelian), there is an isomorphism

Hom(G, C×) ≅ Gab.    (8.11)
This is quite useful when computing the character table of a group: if you can figure out the commutator
subgroup and/or the abelianization, then you can immediately write down the one-dimensional characters.
Sometimes the size of the abelianization can be determined from the size of the group and the number of
conjugacy classes. (The commutator subgroup is normal, so it itself is a union of conjugacy classes.)
The description of characters of abelian groups implies that if G is abelian and g ≠ IdG, then χ(g) ≠ 1 for at least one character χ. Therefore, for every group G, we have

⋂_{χ∈Ch(G)} ker χ = [G, G].
Example 8.8.2. Suppose that G is a group of order 24 with 8 conjugacy classes.⁴ There is only one possibility for the dimensions of the irreps (i.e., only one solution to the equation Σ_{i=1}^{8} di² = 24 in positive integers), namely 1,1,1,1,1,1,3,3. In particular the abelianization must have size 6 and the commutator subgroup must have size 24/6 = 4. There is only one abelian group of order 6, so we know the 1-dimensional characters of Gab, and it should not be hard to pull them back to 1-dimensional characters of G, since the quotient map G → Gab is constant on conjugacy classes.
If instead the group were known to have 6 conjugacy classes, then the equation has two solutions, namely
1,1,1,1,2,4 and 2,2,2,2,2,2, but the latter is impossible since every group has at least one 1-dimensional irrep,
namely the trivial representation. J
Example 8.8.3. Consider the case G = Sn. Certainly [Sn, Sn] ⊆ An, and in fact equality holds. This is trivial for n ≤ 2. If n ≥ 3, then the equation (a b)(b c)(a b)(b c) = (a b c) in Sn (multiplying left to right) shows that [Sn, Sn] contains every 3-cycle, and it is not hard to show that the 3-cycles generate the full alternating group. Therefore (8.11) gives
Hom(Sn, C×) ≅ Sn/An ≅ Z/2Z.
It follows that χtriv and χsign are the only one-dimensional characters of Sn . A more elementary way of
seeing this is that a one-dimensional character must map the conjugacy class of 2-cycles to either 1 or −1,
and the 2-cycles generate all of Sn , hence determine the character completely.
For instance, suppose we want to compute the character table of S5 (Exercise 8.6), which has seven conjugacy
classes. There are 21 lists of seven positive integers whose squares add up to |S5| = 5! = 120, but only four
of them contain exactly two 1's:

    (1, 1, 2, 2, 2, 5, 9),   (1, 1, 2, 2, 5, 6, 7),   (1, 1, 2, 3, 4, 5, 8),   (1, 1, 4, 4, 5, 5, 6).
By examining the defining representation and using the tensor product, you should be able to figure out
which one of these is the actual list of dimensions of irreps. J
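This enumeration is easy to automate; here is one way to do it in Sage (the upper bound of 10 on the entries
is safe because 11² > 120 − 6):

## Input
from itertools import combinations_with_replacement
sols = [d for d in combinations_with_replacement([1..10], 7)
        if sum(x^2 for x in d) == 120]
len(sols), [d for d in sols if d.count(1) == 2]
## Output
(21, [(1, 1, 2, 2, 2, 5, 9), (1, 1, 2, 2, 5, 6, 7), (1, 1, 2, 3, 4, 5, 8), (1, 1, 4, 4, 5, 5, 6)])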
Example 8.8.4. The dicyclic group G = Dic3 can be presented as

    G = ⟨a, x | a⁶ = 1, x² = a³, x⁻¹ax = a⁻¹⟩.

It has order 12 and six conjugacy classes. In particular it must have six irreps, four of dimension 1 and two
of dimension 2 (the only solution to Σ di² = 12 in six positive integers). That's a lot of 1's, so
it is worth computing the commutator subgroup to get at the one-dimensional characters. It turns out
that [G, G] = {1, a², a⁴}, and the quotient Gab is cyclic of order 4, generated by x. So the one-dimensional
characters are as follows:
Size           1    1    2    2    3    3
Conj. class   C1   C2   C3   C4   C5   C6
χ1             1    1    1    1    1    1
χ2             1    1    1    1   −1   −1
χ3             1   −1   −1    1    i   −i
χ4             1   −1   −1    1   −i    i
The remaining two irreducible characters χ5, χ6 evidently satisfy

    χ5 + χ6 = (χreg − χ1 − χ2 − χ3 − χ4)/2 = ([12, 0, 0, 0, 0, 0] − [4, 0, 0, 4, 0, 0])/2 = [4, 0, 0, −2, 0, 0].

Write them as

    χ5 = [2, a, b, −1 + c, d, e],    χ6 = [2, −a, −b, −1 − c, −d, −e].
Tensoring with any one-dimensional character must either fix both χ5 and χ6 or swap them. In particular,
tensoring with χ2 = χ3 ⊗ χ3 must fix them (applying χ3 twice either fixes twice or swaps twice). But
χ5 ⊗ χ2 = [2, a, b, −1 + c, −d, −e], so d = −d and e = −e, i.e., d = e = 0, and our lives have just gotten
somewhat easier, as we can write

    ⟨χ1, χ5⟩G = (1/12)(2 + a + 2b + 2(−1 + c)) = 0,
    ⟨χ3, χ5⟩G = (1/12)(2 − a − 2b + 2(−1 + c)) = 0.

Adding these equations gives 4 + 4(−1 + c) = 0, or c = 0; subtracting them gives a = −2b. At this point
we know that the C3 column of the character table is (1, 1, −1, −1, b, −b). By Proposition 8.6.7 its squared
norm is |G|/|C3| = 6, i.e., 4 + 2b² = 6, so b = ±1; replacing b with −b merely swaps χ5 and χ6, so we may
take b = 1, hence a = −2. So the final character table is as follows:
Size           1    1    2    2    3    3
Conj. class   C1   C2   C3   C4   C5   C6
χ1             1    1    1    1    1    1
χ2             1    1    1    1   −1   −1
χ3             1   −1   −1    1    i   −i
χ4             1   −1   −1    1   −i    i
χ5             2   −2    1   −1    0    0
χ6             2    2   −1   −1    0    0
J
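The basic numerology above is quick to confirm in Sage, which knows the dicyclic groups as permutation
groups under the name DiCyclicGroup (a sketch; the three numbers are |G|, the number of conjugacy
classes, and |[G, G]|):

## Input
G = DiCyclicGroup(3)
G.order(), len(G.conjugacy_classes()), G.commutator().order()
## Output
(12, 6, 3)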
8.9 Restriction and induction

Let H ⊆ G be finite groups. Representations of G give rise to representations of H via an (easy) process
called restriction, and representations of H give rise to representations of G via a (somewhat more involved)
process called induction. These processes are sources of more characters to put in character tables, and the
two are related by an equation called Frobenius reciprocity.

The restriction Res_H^G(ρ) of a representation (ρ, V) of G is just the same map ρ, regarded as a representation
of H; in particular, restricting does not change the character on the level of group elements. On the other
hand, the restriction of an irreducible representation need not be irreducible. (Also, two elements conjugate
in G are not necessarily conjugate in a subgroup H.)
Example 8.9.1. Let Cλ denote the conjugacy class in Sn of permutations of cycle-shape λ. Recall that
the standard representation ρstd of G = S3 has character χstd = [2, 0, −1] on the classes (C111, C21, C3),
and ⟨χstd, χstd⟩G = 1 because ρstd is irreducible. On the other hand, let H = A3 ⊆ S3. This is an
abelian group isomorphic to Z/3Z, so the two-dimensional representation Res(ρstd) is not irreducible. In
fact H = C111 ∪ C3 as a subset of S3, so ⟨Res(χstd), Res(χstd)⟩H = (1 · 2² + 2 · (−1)²)/3 = 2. (We knew this
already, since if a 2-dimensional representation is not irreducible then it must be the direct sum of two
one-dimensional irreps.)
The group A3 is cyclic, so its character table is

            IdG    (1 2 3)    (1 3 2)
χtriv        1        1          1
χ1           1        ω          ω²           (8.12)
χ2           1        ω²         ω

where ω = e^{2πi/3}. (Note also that the conjugacy class C3 ⊆ S3 splits into two singleton conjugacy classes in
A3.) Now it is evident that Res(χstd) = [2, −1, −1] = χ1 + χ2. J
Now for induction. Let (ρ, W) be a representation of H, and let V be a direct sum of copies of W, one for
each (left) coset of H in G. For the sake of bookkeeping, pick a set of coset representatives B = {b1, . . . , bn} ⊆ G,
so that G = b1H ⊔ · · · ⊔ bnH; then we can think of V as a sum of copies of W indexed by B. (Here
n = [G : H] = |G|/|H|.) We will write the elements of V as tensors:

    V = CB ⊗ W = (b1 ⊗ W) ⊕ · · · ⊕ (bn ⊗ W).
To say how a group element g ∈ G acts on the summand bi ⊗ W , we want to write gbi in the form bj h, where
j ∈ [n] and h ∈ H. We then make g act by
g(bi ⊗ w) = bj ⊗ ρ(h)(w), (8.13)
extended linearly to all of V . Heuristically, this formula is justified by the equation
g(bi ⊗ w) = gbi ⊗ w = bj h ⊗ w = bj ⊗ hw = bj ⊗ ρ(h)(w).
In other words, g sends bi ⊗ W to bj ⊗ W, acting by h along the way. Thus we have a map Ind_H^G(ρ) that
sends each g ∈ G to the linear transformation V → V just defined. Alternative notations for Ind_H^G(ρ)
include Ind(ρ) (if G and H are clear from context) and ρ↑_H^G.
Example 8.9.2. Let G = S3 and H = A3 = {Id, (1 2 3), (1 3 2)}, and let (ρ, W) be a representation of H,
where W = C⟨e1, . . . , en⟩. Let B = {b1 = Id, b2 = (1 2)}, so that V = (b1 ⊗ W) ⊕ (b2 ⊗ W). To define the
induced representation, we need to solve the equations gbi = bjh: for each g ∈ G and each bi ∈ B, we
determine the unique pair bj ∈ B, h ∈ H satisfying the equation.
g           gb1 = bj h                gb2 = bj h
Id          Id = b1 Id                (1 2) = b2 Id
(1 2 3)     (1 2 3) = b1 (1 2 3)      (1 3) = b2 (1 3 2)
(1 3 2)     (1 3 2) = b1 (1 3 2)      (2 3) = b2 (1 2 3)
(1 2)       (1 2) = b2 Id             Id = b1 Id
(1 3)       (1 3) = b2 (1 2 3)        (1 2 3) = b1 (1 2 3)
(2 3)       (2 3) = b2 (1 3 2)        (1 3 2) = b1 (1 3 2)
Therefore, the representation Ind_H^G(ρ) sends the elements of S3 to the following block matrices. Each block
is of size n × n; the first block row/column corresponds to b1 ⊗ W and the second to b2 ⊗ W.

    Id ↦ [ ρ(Id) 0 ; 0 ρ(Id) ]        (1 2 3) ↦ [ ρ(1 2 3) 0 ; 0 ρ(1 3 2) ]     (1 3 2) ↦ [ ρ(1 3 2) 0 ; 0 ρ(1 2 3) ]

    (1 2) ↦ [ 0 ρ(Id) ; ρ(Id) 0 ]     (1 3) ↦ [ 0 ρ(1 2 3) ; ρ(1 2 3) 0 ]       (2 3) ↦ [ 0 ρ(1 3 2) ; ρ(1 3 2) 0 ]
For instance, if ρ is the 1-dimensional representation (= character) χ1 of (8.12), then the character of Ind(ρ)
is given on the conjugacy classes (C111, C21, C3) of S3 by

    Ind(χ1) = [2, 0, −1] = χstd. J

Example 8.9.3. Now let G = S3 and let H be the two-element subgroup {Id, (1 2)}, with coset
representatives B = {b1 = Id, b2 = (1 3), b3 = (2 3)}. For a representation (ρ, W) of H, solving gbi = bjh
as before shows that Ind_H^G(ρ) is as follows:

    Id ↦ [ ρ(Id) 0 0 ; 0 ρ(Id) 0 ; 0 0 ρ(Id) ]      (1 3) ↦ [ 0 ρ(Id) 0 ; ρ(Id) 0 0 ; 0 0 ρ(1 2) ]     (2 3) ↦ [ 0 0 ρ(Id) ; 0 ρ(1 2) 0 ; ρ(Id) 0 0 ]

    (1 2) ↦ [ ρ(1 2) 0 0 ; 0 0 ρ(1 2) ; 0 ρ(1 2) 0 ]   (1 2 3) ↦ [ 0 ρ(1 2) 0 ; 0 0 ρ(Id) ; ρ(1 2) 0 0 ]   (1 3 2) ↦ [ 0 0 ρ(1 2) ; ρ(1 2) 0 0 ; 0 ρ(Id) 0 ]

J
In fact, Ind(ρ) is a representation, and there is a general formula for its character. (That is a good thing,
because as you can see, computing the induced representation itself is a lot of work.)
Proposition 8.9.4. Let H be a subgroup of G and let (ρ, W) be a representation of H with character χ.
Then Ind_H^G(ρ) is a representation of G, with character given on g ∈ G by

    Ind_H^G(χ)(g) = (1/|H|) Σ_{k∈G: k⁻¹gk∈H} χ(k⁻¹gk).
(We know that characters are class functions, so why not write χ(g) instead of χ(k −1 gk)? Because χ is a
function on H, so the former expression is not well-defined in general.)
Proof. First, we verify that Ind(ρ) is a representation. Let g, g′ ∈ G and bi ⊗ w ∈ V. Then there are a
unique bk ∈ B and h ∈ H such that

    gbi = bkh        (8.15)

and in turn a unique bℓ ∈ B and h′ ∈ H such that

    g′bk = bℓh′.        (8.16)

We need to verify that g′ · (g · (bi ⊗ w)) = (g′g) · (bi ⊗ w). Indeed,

    g′ · (g · (bi ⊗ w)) = g′ · (bk ⊗ hw) = bℓ ⊗ h′hw,

while on the other hand (g′g)bi = g′(bkh) = (g′bk)h = bℓh′h, so (g′g) · (bi ⊗ w) = bℓ ⊗ h′hw as well.
Now that we know that Ind(ρ) is a representation of G on V, we calculate its character using (8.14):

    Ind(χ)(g) = Σ_{i=1}^{n} tr(Bi,i) = Σ_{i∈[n]: bi⁻¹gbi∈H} χ(bi⁻¹gbi)
              = Σ_{i∈[n]: bi⁻¹gbi∈H} (1/|H|) Σ_{h∈H} χ(h⁻¹bi⁻¹gbih)
              = (1/|H|) Σ_{k∈G: k⁻¹gk∈H} χ(k⁻¹gk)        (8.17)
as desired. Here k = bih runs over all elements of G as the indices of summation i, h on the previous sum
run over [n] and H respectively. (Also, k⁻¹gk = h⁻¹bi⁻¹gbih ∈ H if and only if bi⁻¹gbi ∈ H, simply because
H is a group.) Since Ind(χ) is independent of the choice of B, so is the isomorphism type of Ind(ρ).
Corollary 8.9.5. Let H ⊆ G and let χtriv be the character of the trivial representation of H. Then

    Ind_H^G(χtriv)(g) = #{k ∈ G : k⁻¹gk ∈ H} / |H|.
Corollary 8.9.6. Suppose H is a normal subgroup of G, and suppose the character χ of H is invariant under
conjugation by G (i.e., χ(k⁻¹hk) = χ(h) for all h ∈ H, k ∈ G — for instance, χ = χtriv). Then

    Ind_H^G(χ)(g) = (|G|/|H|) χ(g) if g ∈ H, and 0 otherwise.

Proof. Normality implies that k⁻¹gk ∈ H if and only if g ∈ H, independently of k. If g ∈ H then the sum
in (8.17) has |G| terms, all equal to χ(g) by the invariance assumption; otherwise, the sum is empty.
(Alternative proof: normality implies that left cosets and right cosets coincide, so when g ∈ H the blocks in
the block matrix Ind(ρ)(g) all lie on the main diagonal, the ith block being ρ(bi⁻¹gbi), which has trace χ(g)
by invariance; when g ∉ H all diagonal blocks are zero.)
Example 8.9.7. Let n ≥ 2, so that An is a normal subgroup of Sn of index 2. Either Corollary 8.9.5 or
Corollary 8.9.6 implies that

    Ind_{An}^{Sn}(χtriv)(g) = 2 for g ∈ An,  and  0 for g ∉ An,
which is the sum of the trivial and sign characters on Sn . J
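For n = 3 this is easy to confirm in Sage, whose GAP-backed ClassFunction knows about induction. The
values below are listed in the order of G.conjugacy_classes(), which here is identity, transpositions,
3-cycles (an assumption worth double-checking in a live session):

## Input
G = SymmetricGroup(3)
H = AlternatingGroup(3)
triv = ClassFunction(H, [1, 1, 1])
triv.induct(G).values()
## Output
[2, 0, 2]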
Example 8.9.8. Let G = S4 and let H be the subgroup {id, (1 2), (3 4), (1 2)(3 4)}. Note that H is
not a normal subgroup of G. Let ρ be the trivial representation of H and χ its character. We can calculate
ψ = Ind_H^G(χ) using Corollary 8.9.5:

    Cλ          C1111   C211   C22   C31   C4
    ψ(Cλ)         6       2      2     0     0

where, as usual, Cλ denotes the conjugacy class in S4 of permutations with cycle-shape λ. One computes
⟨ψ, ψ⟩G = 3, and the decomposition into irreducible characters is ψ = χtriv + χstd + χ, where χ is the
two-dimensional irreducible character of S4 (cf. Example 8.7.2). J
Theorem 8.9.9 (Frobenius Reciprocity). Let H ⊆ G be groups, let χ be a character of H, and let ψ be a
character of G. Then

    ⟨Ind_H^G(χ), ψ⟩G = ⟨χ, Res_H^G(ψ)⟩H.

Proof.

    ⟨Ind(χ), ψ⟩G = (1/|G|) Σ_{g∈G} Ind(χ)(g) · ψ(g)
     = (1/|G|) (1/|H|) Σ_{g∈G} Σ_{k∈G: k⁻¹gk∈H} χ(k⁻¹gk) · ψ(g)        (by Prop. 8.9.4)
     = (1/(|G||H|)) Σ_{h∈H} Σ_{k∈G} Σ_{g∈G: k⁻¹gk=h} χ(h) · ψ(k⁻¹gk)
     = (1/(|G||H|)) Σ_{h∈H} Σ_{k∈G} χ(h) · ψ(h)        (i.e., g = khk⁻¹)
     = (1/|H|) Σ_{h∈H} χ(h) · ψ(h) = ⟨χ, Res(ψ)⟩H.
But ψ is irreducible. Therefore, it must be the case that Ind(χ1 ) = ψ, and the corresponding representations
are isomorphic. The same is true if we replace χ1 with χ2 . J
8.10 Characters of the symmetric group
We have worked out the irreducible characters of S3 , S4 and S5 ad hoc (the last as an exercise). In fact,
we can do this for all n, exploiting a vast connection to the combinatorics of partitions and tableaux.
Recall (Defn. 1.2.4) that a partition of n is a sequence λ = (λ1, . . . , λℓ) of weakly decreasing positive integers
whose sum is n. We'll sometimes drop the parentheses and commas. We write λ ⊢ n or |λ| = n to indicate
that λ is a partition of n. The number ℓ = ℓ(λ) is the length of λ. The set of all partitions of n is Par(n),
and the number of partitions of n is p(n) = |Par(n)|. For example, Par(4) = {4, 31, 22, 211, 1111}, so
p(4) = 5. We'll write Par for the set of all partitions. (As a set this is the same as Young's lattice, which we
used to call Y.)
For each λ ` n, let Cλ be the conjugacy class in Sn consisting of all permutations with cycle shape λ. Since
the conjugacy classes are in bijection with Par(n), it makes sense to look for a set of representations indexed
by partitions.
Definition 8.10.1. Let µ = (µ1 , . . . , µ` ) ` n.
• The Ferrers diagram of shape µ is the top- and left-justified array of boxes with µi boxes in the ith
row.
• A (Young) tableau5 of shape µ is a Ferrers diagram with the numbers 1, 2, . . . , n placed in the
boxes, one number to a box.
• Two tableaux T, T 0 of shape µ are row-equivalent, written T ∼ T 0 , if the numbers in each row of T
are the same as the numbers in the corresponding row of T 0 .
• A (Young) tabloid of shape µ is an equivalence class of tableaux under row-equivalence. A tabloid
can be represented as a tableau without vertical lines.
• We write sh(T ) = µ to indicate that a tableau or tabloid T is of shape µ.
    1 3 6        1 3 6
    2 7          2 7
    4 5          4 5

(a tableau of shape (3, 2, 2), and the corresponding tabloid, drawn without vertical lines)
A Young tabloid can be regarded as an ordered set partition (T1, . . . , Tm) of [n] in which |Ti| = µi. The
order of the blocks Ti matters, but not the order of entries within each block. Thus the number of tabloids
of shape µ is the multinomial coefficient

    (n choose µ) = n! / (µ1! · · · µm!).
The symmetric group Sn acts on tabloids by permuting the numbers. This action gives rise to a permutation
representation (ρµ , Vµ ) of Sn , the µ-tabloid representation of Sn . Here Vµ is the vector space of all formal
C-linear combinations of tabloids of shape µ. The character of ρµ will be denoted τµ .
5 Terminology of tableaux is not consistent: some authors reserve the term “Young tableau” for a tableau in which the
numbers increase rightward and downward. In these notes, I will call such a tableau a “standard tableau”. For the moment, I
am not placing any restrictions on which numbers can go in which boxes: there are n! tableaux of shape µ for any µ ⊢ n.
Example 8.10.2. For n = 3, the characters of the tabloid representations ρµ are as follows.

              C111   C21   C3
    τ3          1     1     1
    τ21         3     1     0        (8.18)
    τ111        6     0     0
    |Cµ|        1     3     2
Some of these representations are familiar.

• There is just one tabloid of shape (n), namely the one with all of [n] in its single row, and every
  permutation fixes it. Therefore

    ρ(n) ≅ ρtriv.

• The tabloids of shape µ = (1, 1, . . . , 1) are just the permutations of [n]. Therefore

    ρ(1,1,...,1) ≅ ρreg.

• A tabloid of shape (n − 1, 1) is determined by the entry of its second row. Therefore

    ρ(n−1,1) ≅ ρdef.
In fact, all tabloid representations can be obtained by induction. Some notation first. For a partition
λ = (λ1, . . . , λℓ) ⊢ n, define

    λ[i] = λ1 + · · · + λi,
    Li = [λ[i−1] + 1, λ[i]],        (8.19)
    Sλ = {σ ∈ Sn : σ(Li) = Li ∀i},

so that Sλ ≅ Sλ1 × · · · × Sλℓ. This is called a Young subgroup. Note that Sλ is not in general a normal
subgroup of Sn (the exceptions are λ = (n) and λ = (1ⁿ)), since it is conjugate to any subgroup fixing
setwise each of some collection of subsets of sizes λ1, . . . , λℓ.
Proposition 8.10.3. Let λ = (λ1, . . . , λℓ) ⊢ n. Then Ind_{Sλ}^{Sn}(ρtriv) ≅ ρλ.

Proof. We will show that the characters of these representations are equal, i.e., that Ind_{Sλ}^{Sn}(χtriv) = τλ.
Assign labels 1, . . . , n to the cells of the Ferrers diagram of λ reading from left to right and top to bottom,
so that the cells in the ith row are labeled by Li . For every w ∈ Sn , let Tλ,w be the tableau of shape λ in
which cell k is filled with the number w(k).
Since both characters are class functions, fix a cycle-shape µ = (µ1, . . . , µk) ⊢ n and let g ∈ Cµ be the
permutation whose jth cycle has support Mj = [µ[j−1] + 1, µ[j]]. Then, by Corollary 8.9.5
(replacing w with w⁻¹ for brevity):

    Ind_{Sλ}^{Sn}(χtriv)(g) = (1/(λ1! · · · λℓ!)) #{w ∈ Sn : wgw⁻¹ ∈ Sλ}
     = (1/(λ1! · · · λℓ!)) #{w ∈ Sn : (w(1) · · · w(µ1)) · · · (w(n−µk+1) · · · w(n)) ∈ Sλ}
     = (1/(λ1! · · · λℓ!)) #{w ∈ Sn : ∀j ∃i : w(Mj) ⊆ Li}
     = (1/(λ1! · · · λℓ!)) #{tableaux Tλ,w with all elements of each Mj in the same row}
     = #{tabloids of shape λ with all elements of each Mj in the same row}
     = #{tabloids of shape λ fixed by g}
     = τλ(g) = τλ(Cµ).
For n = 3, the table in (8.18) is a triangular matrix. In particular, the characters τµ are linearly independent,
hence a basis of the vector space of class functions on S3. In fact, we will prove that this is the case for all
n. We first need to define two orders on the set Par(n).
Definition 8.10.4. Let λ, µ ∈ Par(n).

1. Lexicographic (lex) order on Par(n) is defined by λ < µ if λk < µk for the smallest index k with
λk ≠ µk. This is a total order; for example, on Par(5):

    (5) > (4, 1) > (3, 2) > (3, 1, 1) > (2, 2, 1) > (2, 1, 1, 1) > (1, 1, 1, 1, 1).

(“Lex-greater partitions are short and wide; lex-smaller ones are tall and skinny.”)

2. Dominance order on Par(n) is defined by λ ⊴ µ if λ[k] ≤ µ[k] for all k, and λ ⊲ µ if in addition
λ ≠ µ. In this case we say µ dominates λ.

Dominance is a partial order on Par(n). It first fails to be a total order for n = 6 (neither of 33 and 411
dominates the other). Lexicographic order is a linear extension of dominance order: if λ ⊲ µ then λ < µ.
Since the tabloid representations ρµ are permutation representations, we can calculate τµ by counting fixed
points. That is, for any permutation w ∈ Cλ, we have

    τµ(Cλ) = #{tabloids T : sh(T) = µ, wT = T}.        (8.21)

Proposition 8.10.5. Let λ, µ ⊢ n. Then τµ(Cµ) ≠ 0, and τµ(Cλ) ≠ 0 only if λ ⊴ µ (thus, only if λ ≤ µ in
lexicographic order).
Proof. First, let w ∈ Cµ. Take T to be any tabloid whose blocks are the cycles of w; then wT = T. For
example, if w = (1 3 6)(2 7)(4 5) ∈ S7, then T can be either of the following two tabloids:

    1 3 6        1 3 6
    2 7          4 5
    4 5          2 7
It follows from (8.21) that τµ(Cµ) ≠ 0. (In fact τµ(Cµ) = ∏j rj!, where rj is the number of occurrences of j
in µ.)
For the second assertion, observe that w ∈ Sn fixes a tabloid T of shape µ if and only if every cycle of w
is contained in a row of T. This is possible only if, for every k, the largest k rows of T are collectively big
enough to hold the k largest cycles of w. This is precisely the condition λ ⊴ µ.
Corollary 8.10.6. The characters {τµ : µ ⊢ n} form a basis for the space of class functions on Sn.

Proof. Make the characters into a p(n) × p(n) matrix X = [τµ(Cλ)]µ,λ⊢n with rows and columns ordered by
lex order on Par(n). By Proposition 8.10.5, X is a triangular matrix with nonzero entries on the diagonal,
so it is nonsingular.
We can transform the characters τµ into a list of irreducible characters χµ of Sn by applying the Gram-
Schmidt process with respect to the inner product ⟨·, ·⟩Sn. We'll start with µ = (n) and work our way down
in lexicographic order. What will happen is that each tabloid character τµ will decompose as

    τµ = χµ + Σ_{λ > µ} Kλ,µ χλ;        (8.22)
the multiplicities Kλ,µ are called Kostka numbers and we will have more to say about them in the next
chapter. For the time being, we will only be able to observe (8.22) for particular examples, including S3
and S4 , but we will eventually be able to prove it in general (Corollary 9.12.3).
Example 8.10.7. We will use tabloid representations to derive the tables of irreducible characters for S3
and S4.

Recall the table of characters (8.18) of the tabloid representations for G = S3. Here is how the Gram-Schmidt
process goes.

First, τ3 = [1, 1, 1] is itself an irreducible character, namely χtriv; we label it χ3.

Second, ⟨τ21, χ3⟩G = 1. Thus τ21 − χ3 = [2, 0, −1] is orthogonal to χ3, and in fact it is irreducible, so we
label it as χ21. (This is χstd.)

Third, ⟨τ111, χ3⟩G = 1 and ⟨τ111, χ21⟩G = 2, and τ111 − χ3 − 2χ21 = [1, −1, 1] is irreducible (it is χsign);
we label it χ111. In summary:
    [ τ3   ]   [ 1 1 1 ]   [ 1 0 0 ] [ 1  1  1 ]
    [ τ21  ] = [ 3 1 0 ] = [ 1 1 0 ] [ 2  0 −1 ]        (8.23)
    [ τ111 ]   [ 6 0 0 ]   [ 1 2 1 ] [ 1 −1  1 ]

where the middle factor is the matrix K of multiplicities and the last factor is the character table X, whose
columns correspond to C111, C21, C3 and whose rows are χ3, χ21, χ111.
For S4, the same procedure produces

    [ τ4    ]   [  1 1 1 1 1 ]   [ 1 0 0 0 0 ] [ 1  1  1  1  1 ]
    [ τ31   ]   [  4 2 0 1 0 ]   [ 1 1 0 0 0 ] [ 3  1 −1  0 −1 ]
    [ τ22   ] = [  6 2 2 0 0 ] = [ 1 1 1 0 0 ] [ 2  0  2 −1  0 ]
    [ τ211  ]   [ 12 2 0 0 0 ]   [ 1 2 1 1 0 ] [ 3 −1 −1  0  1 ]
    [ τ1111 ]   [ 24 0 0 0 0 ]   [ 1 3 2 3 1 ] [ 1 −1  1  1 −1 ]

where again the middle factor is K and the last factor is the character table X, with rows χ4, χ31, χ22,
χ211, χ1111 and columns C1111, C211, C22, C31, C4.
Note that the multiplicity of χλ in τλ is 1 in every case (i.e., that the matrix K has 1’s on the main
diagonal). J
At this point you should feel a bit dissatisfied, since I have not told you what the values of the irreducible
characters actually are in general, just that you can obtain them from the tabloid characters plus Gram-
Schmidt. That is a harder problem; the answer is given by the Murnaghan-Nakayama Rule (see §9.14),
which expresses the values of the irreducible characters as signed counts of certain tableaux.
Also, what are the multiplicities of the irreps in the tabloid representations, i.e., the numbers in the matrix
K? Note that for S3 and S4 these matrices are unitriangular, which says that the tabloid characters are not
just a vector space basis for the class functions, but more strongly a basis for the free abelian group generated
by the irreducible characters. We will prove this eventually (in Corollary 9.12.3), by which point we will
have a combinatorial description of the entries of K.
8.11 Exercises
In all exercises, unless otherwise specified, G is a finite group and (ρ, V ) and (ρ0 , V 0 ) are finite-dimensional
representations of G over C.
Exercise 8.1. Let χ be an irreducible character of G and let ψ be a one-dimensional character. Prove that
ω := χ ⊗ ψ is an irreducible character.
Exercise 8.2. Let n ≥ 2. Prove that the standard representation ρstd of Sn (see Example 8.3.4) is
irreducible. (Hint: Compute hχdef , χdef i and hχdef , χtriv i. The latter boils down to finding the expected
number of fixed points in a permutation selected uniformly at random; this is an old classic that uses what
is essentially linearity of expectation.)
Exercise 8.3. Let G be a group of order 63. Prove that G cannot have exactly 5 conjugacy classes. (You
are encouraged to use a computer for part of this problem.)
Exercise 8.4. Let X = {12|34, 13|24, 14|23} be the set of set partitions of [4] into two doubletons, and let
V = CX. The standard permutation action of S4 on {1, 2, 3, 4} induces an action on X. On the level of
representations, the defining representation ρdef induces a 3-dimensional representation (ρ, V ).
Exercise 8.6. Work out the character table of S5 without using any of the material in Section 8.10. (Hint:
To construct another irreducible character, start by considering the action of S5 on the edges of the complete
graph K5 induced by the usual permutation action on the vertices.)
Exercise 8.7. Work out the character table of the quaternion group Q; this is the group of order 8 whose
elements are {±1, ±i, ±j, ±k} with relations i2 = j 2 = k 2 = −1, ij = k, jk = i, ki = j.
Exercise 8.8. Work out the characters χλ of the Specht modules Spλ for all λ ` 4. (Start with the characters
of the tabloid representations, then do linear algebra. Feel free to use a computer algebra system to automate
the tedious parts.) Compare your result to the character table of S4 calculated ad hoc in Example 8.7.2.
Make as many observations or conjectures as you can about how the partition λ is related to the values of
the character χλ .
Exercise 8.9. Recall that the alternating group An consists of the n!/2 even permutations in Sn , that is,
those with an even number of even-length cycles.
(a) Show that the conjugacy classes in A4 are not simply the conjugacy classes in S4 . (Hint: Consider the
possibilities for the dimensions of the irreducible characters of A4 .)
(b) Determine the conjugacy classes in A4 , and the complete list of irreducible characters.
(c) Use this information to determine [A4 , A4 ] without actually computing any commutators.
Chapter 9

Symmetric Functions

9.1 Symmetric polynomials

Let R be a commutative ring. A polynomial in R[x1, . . . , xn] is symmetric if it is invariant under every
permutation of the variables. The symmetric polynomials that are homogeneous of degree d form a finitely
generated, free R-module Λd(R). For example, if n = 3, then up to scalar multiplication, the only symmetric
polynomial of degree 1 in x, y, z is x + y + z. In degree 2, here are two:

    x² + y² + z²,    xy + xz + yz.

Every other symmetric polynomial that is homogeneous of degree 2 is an R-linear combination of these
two, because the coefficients of x² and xy determine the coefficients of all other monomials. Similarly, the
polynomials
polynomials
x3 + y 3 + z 3 , x2 y + xy 2 + x2 z + xz 2 + y 2 z + yz 2 , xyz
are a basis for the space of degree 3 symmetric polynomials in R[x, y, z].
Each member of this basis is a sum of the monomials in a single orbit under the action of S3 . Accordingly,
we can index them by the partition whose parts are the exponents of one of its monomials. That is,
m3 (x, y, z) = x3 + y 3 + z 3 ,
m21 (x, y, z) = x2 y + xy 2 + x2 z + xz 2 + y 2 z + yz 2 ,
m111 (x, y, z) = xyz.
In general, we would like a monomial symmetric polynomial mλ(x1, . . . , xn) for every partition λ =
(λ1, . . . , λℓ), namely the sum of the monomials in the Sn-orbit of x1^{λ1} · · · xℓ^{λℓ}. But unfortunately, this is
zero if ℓ > n. So n variables are not enough! In other words, we need a countably infinite set of variables
{x1, x2, . . . }, which means that we need to work not with polynomials, but with. . .
9.2 Formal power series

Let R be an integral domain (typically Z or a field), and let x = {x1, x2, . . . } be a countably infinite set of
commuting indeterminates. A monomial is a product x^α = ∏_{i≥1} xi^{αi}, where αi ∈ N for all i and Σi αi
is finite (equivalently, all but finitely many of the αi are zero). The sequence α = (α1, α2, . . . ) is called the
exponent vector of the monomial; listing the nonzero entries of α in decreasing order gives a partition
λ(α). A formal power series is an expression

    Σ_α cα x^α

with cα ∈ R for all α. Equivalently, a formal power series can be regarded as a function from monomials
to R, mapping x^α to cα. We often use the notation [x^α]F for the coefficient cα of x^α in the formal power
series F.
The set R[[x]] of all formal power series is an abelian group under addition, and in fact an R-module,
namely the direct product of countably infinitely many copies of R.1 In fact, R[[x]] is a ring as well, with
multiplication given by

    (Σ_α cα x^α)(Σ_β dβ x^β) = Σ_γ ( Σ_{(α,β): α+β=γ} cα dβ ) x^γ.

This multiplication is well-defined because the inner sum on the right-hand side has only finitely many terms
for each γ, and is thus a well-defined element of R.
We are generally not concerned with whether (or where) a formal power series converges in the sense of
calculus, since we only rarely need to plug in real values for the indeterminates xi (and when we
do, analytic convergence is not usually an issue). All that matters is that every operation must produce
a well-defined power series, in the sense that each coefficient is given by a finite computation in the base
ring R. For example, multiplication of power series satisfies this criterion, as explained above.2
Familiar functions from analysis (like exp and log) can be regarded as formal power series, namely their
Taylor series. However, we will typically study them using combinatorial rather than analytic methods. For
instance, from this point of view, we would justify treating the function 1/(1 − x) as equal to the power
series 1 + x + x² + · · · not by calculating derivatives of 1/(1 − x), but rather by observing that the identity
(1 − x)(1 + x + x² + · · · ) = 1 holds in Z[[x]]. (That said, combinatorics also gets a lot of mileage out of working
with derivative operators — but treating them formally, as linear transformations that map monomials to
other monomials, rather than analytically.) Very often, analytic identities among power series can be
proved using combinatorial methods; see Exercise 9.4 for an example.
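This formal point of view is also how computer algebra systems handle power series. A quick Sage sketch
(the O(x^6) term is bookkeeping for the chosen working precision, not analytic convergence; default_prec
is just set low here to keep the output short):

## Input
R.<x> = PowerSeriesRing(ZZ, default_prec=6)
f = 1/(1 - x)
f
(1 - x)*f
## Output
1 + x + x^2 + x^3 + x^4 + x^5 + O(x^6)
1 + O(x^6)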
We can now define symmetric functions properly, as elements of the ring of formal power series C[[x]] =
C[[x1 , x2 , . . . ]].
1 As opposed to the polynomial ring R[x], which is a direct sum of countably infinitely many copies of R.
2 We would have a problem with multiplication if we allowed two-way-infinite series. For example, the square of Σ_{n∈Z} x^n
is not well-defined.
Definition 9.3.1. Let λ ⊢ n. The monomial symmetric function mλ is the power series

    mλ = Σ_{α: λ(α)=λ} x^α,

that is, the sum of all monomials whose exponent vector, sorted into weakly decreasing order, is λ.

The symmetric functions that are homogeneous of degree d form an R-module Λd = Λd(R), and we set
Λ = Λ(R) = ⊕_{d≥0} Λd. Each Λd is a finitely generated free R-module, with basis {mλ : λ ⊢ d}, and their
direct sum Λ is a graded R-algebra. If we let S∞ be the group whose members are the permutations of
{x1, x2, . . . } with only finitely many non-fixed points (equivalently, S∞ = ⋃_{n≥1} Sn), then Λ is the ring of
formal power series that have bounded degree and that are invariant under the action of S∞.
The monomial symmetric functions are the most natural basis for Λ from an algebraic point of view, but
there are many other bases that arise more frequently in combinatorics. Understanding symmetric functions
requires familiarity with these various bases and how they interact.
One piece of terminology: we say that a basis B of Λ is an integral basis if the symmetric functions with
integer coefficients are precisely the integer linear combinations of elements of B. Evidently, {mλ} is an
integral basis. This condition is stronger than being a vector space basis for Λ; for example, integral bases
are not preserved by scaling.
Definition 9.4.1. The kth elementary symmetric function ek is the sum of all squarefree monomials
of degree k, extended multiplicatively to partitions. That is,

    e0 = 1,
    ek = Σ_{S⊆N>0, |S|=k} ∏_{s∈S} xs = Σ_{0<i1<i2<···<ik} xi1 xi2 · · · xik = m_{1^k} for k > 0,
    eλ = eλ1 · · · eλℓ for λ = (λ1, . . . , λℓ) ∈ Par.

For example, for n = 3 one computes e3 = m111, e21 = 3m111 + m21, and e111 = 6m111 + 3m21 + m3.
Apparently {e3 , e21 , e111 } is an R-basis for Λ3 , because the transition matrix is unitriangular and therefore
invertible over every R. This works for n = 4 as well, where
    [ e1111 ]   [ 24 12 6 4 1 ] [ m1111 ]
    [ e211  ]   [ 12  5 2 1 0 ] [ m211  ]
    [ e22   ] = [  6  2 1 0 0 ] [ m22   ]
    [ e31   ]   [  4  1 0 0 0 ] [ m31   ]
    [ e4    ]   [  1  0 0 0 0 ] [ m4    ]

This matrix is again unitriangular (with respect to the antidiagonal), and interestingly is symmetric across
the northwest/southeast diagonal — that is, the coefficient of mµ in eλ equals the coefficient of mλ in eµ.
(Is that always true?) We can do these computations in Sage:
## Input
n = 3
e = SymmetricFunctions(QQ).elementary()
m = SymmetricFunctions(QQ).monomial()
for lam in Partitions(n):
    print(m(e[lam]))
## Output
m[1, 1, 1]
3*m[1, 1, 1] + m[2, 1]
6*m[1, 1, 1] + 3*m[2, 1] + m[3]
or even
## Input
n = 4
Matrix([[m(e[lam]).coefficient(mu) for mu in Partitions(n)]
        for lam in Partitions(n)])
## Output
[ 0  0  0  0  1]
[ 0  0  0  1  4]
[ 0  0  1  2  6]
[ 0  1  2  5 12]
[ 1  4  6 12 24]
Let ⊴ denote the dominance partial order on partitions (see Definition 8.10.4). Also, for a partition λ, let λ̃
be its conjugate, given by transposing the Ferrers diagram (see the discussion after Example 1.2.4).

Theorem 9.4.2. Let λ, µ ⊢ n, with ℓ = ℓ(λ) and k = ℓ(µ). Let bλ,µ be the coefficient of mµ in the
monomial-basis expansion of eλ, that is,

    eλ = Σ_µ bλ,µ mµ.

Then bλ,λ̃ = 1, and bλ,µ = 0 unless µ ⊴ λ̃. In particular, {eλ : λ ⊢ n} is an integral basis for Λn.
Proof. The coefficient bλ,µ is the number of λ-factorizations of x^µ: ways to write x^µ = x^{α1} · · · x^{αℓ},
where each x^{αi} is a squarefree monomial of degree λi (one term from each factor eλi). Represent such a
λ-factorization of x^µ by a tableau T of shape λ in which the ith row contains the variables in x^{αi}, in
increasing order. For example, suppose that µ = (3, 2, 2, 1, 1) and λ = (4, 2, 2, 1). One λ-factorization of
x^µ = x1³x2²x3²x4x5 and its associated tableau are

    (x1x2x3x4)(x1x2)(x1x3)(x5)        1 2 3 4
                                      1 2
                                      1 3
                                      5

Thus the entries of T correspond to variables, and the rows of T correspond to factors. Observe that all the
1's in T must be in the first column; all the 2's must be in the first or second column; etc. Thus, for every j,
there must be collectively enough boxes in the first j columns of T to hold all the entries of T corresponding
to the variables x1, . . . , xj. That is,

    λ̃1 + · · · + λ̃j ≥ µ1 + · · · + µj for all j,        (9.1)

which is precisely the condition λ̃ ⊵ µ. If this fails, then no λ-factorization of x^µ can exist and bλ,µ = 0.
Therefore, if we order partitions of n by any linear extension of dominance (such as lexicographic order),
then the matrix [bλ,µ ] will be upper unitriangular, hence invertible over any integral domain R. (This is the
same argument as in Corollary 8.10.6.) It follows that the R-module spanned by the eλ ’s is the same as that
spanned by the mµ ’s for any R, so {eλ } is an integral basis.
Corollary 9.4.3 (“Fundamental Theorem of Symmetric Functions”). The elementary symmetric functions
e1 , e2 , . . . are algebraically independent. Therefore, Λ = R[e1 , e2 , . . . ] as rings.
Proof. Given any nontrivial polynomial relation among the ei ’s, extracting the homogeneous pieces would
give a nontrivial linear relation among the eλ ’s, which does not exist.
Definition 9.5.1. The kth complete homogeneous symmetric function hk is the sum of all monomials
of degree k, extended multiplicatively to partitions:

    h0 = 1,
    hk = Σ_{0<i1≤i2≤···≤ik} xi1 xi2 · · · xik = Σ_{λ⊢k} mλ for k > 0,
    hλ = hλ1 · · · hλℓ for λ = (λ1, . . . , λℓ) ∈ Par.

We could expand the hλ in the monomial basis directly, as we did for the eλ, but instead we will take a
different approach that exploits the close relation between the two families.
Consider the generating functions

    E(t) = Σ_{k≥0} t^k ek,    H(t) = Σ_{k≥0} t^k hk.

We regard E(t) and H(t) as formal power series in t whose coefficients are themselves formal power series
in {xi}. Observe that

    E(t) = ∏_{i≥1} (1 + txi),    H(t) = ∏_{i≥1} 1/(1 − txi).        (9.2)

In the formula for H(t), each factor in the infinite product is a geometric series 1 + txi + t²xi² + · · ·, so
[t^k]H(t) is the sum of all monomials of degree k. It is immediate from (9.2) that

    H(t)E(−t) = 1,

and extracting the coefficients of positive powers of t gives the Jacobi-Trudi relations:

    Σ_{k=0}^{n} (−1)^k ek hn−k = 0 for all n ≥ 1.        (9.3)
That is,

    h1 − e1 = 0,    h2 − e1h1 + e2 = 0,    h3 − e1h2 + e2h1 − e3 = 0,    . . .

(where we have plugged in h0 = e0 = 1). The Jacobi-Trudi equations can be used iteratively to solve for the
ek in terms of the hk:

    e1 = h1,
    e2 = e1h1 − h2 = h1² − h2,
    e3 = e2h1 − e1h2 + h3 = h1(h1² − h2) − h2h1 + h3 = h1³ − 2h1h2 + h3,        (9.4)
    e4 = e3h1 − e2h2 + e1h3 − h4 = h1⁴ − 3h1²h2 + h2² + 2h1h3 − h4,
etc. Since the Jacobi-Trudi relations are symmetric in the letters h and e, so are the equations (9.4).
Therefore, the elementary and homogeneous symmetric functions generate the same ring.
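Sage will happily redo the eliminations (9.4); for instance, here is e3 expanded in the h basis, matching the
third line above:

## Input
e = SymmetricFunctions(QQ).elementary()
h = SymmetricFunctions(QQ).homogeneous()
h(e[3])
## Output
h[1, 1, 1] - 2*h[2, 1] + h[3]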
Here is another way to see that the h’s are an integral basis, which again exploits the symmetry of the
Jacobi-Trudi relations. Define a ring endomorphism ω : Λ → Λ by
ω(ei ) = hi (9.5)
for all i, so that ω(eλ ) = hλ . This map, sometimes known as the Hall transformation 3 but more usually
just referred to as ω, is well-defined since the elementary symmetric functions are algebraically independent.
Now Corollary 9.5.2 follows from the following result:
Proposition 9.5.3. ω is an involution: ω(ω(f )) = f for all f ∈ Λ. In particular, ω is a ring automorphism.
3 As of November 2020, do not Google the phrase “Hall transformation.” You have been warned.
Proof. Applying ω to the Jacobi-Trudi relations (9.3), we see that for every n ≥ 1,

    0 = Σ_{k=0}^{n} (−1)^{n−k} ω(ek) ω(hn−k) = Σ_{k=0}^{n} (−1)^{n−k} hk ω(hn−k)
      = Σ_{k=0}^{n} (−1)^{k} hn−k ω(hk)        (by replacing k with n − k)
      = (−1)^n Σ_{k=0}^{n} (−1)^{n−k} hn−k ω(hk),

and comparing this last expression with the original Jacobi-Trudi relations gives ω(hk) = ek (e.g., because
solving for ω(hk) in terms of the hk's gives exactly (9.4), with the ek's replaced by ω(hk)'s).
Definition 9.6.1. The kth power-sum symmetric function pk is the sum of the kth powers of all
variables, extended multiplicatively to partitions:

    pk = Σ_{i=1}^{∞} xi^k = m(k),
    pλ = pλ1 · · · pλℓ for λ = (λ1, . . . , λℓ) ∈ Par.
We have seen this transition matrix before: its columns are characters of tabloid representations! (See (8.23).)
This is the first explicit connection we can observe between representations of Sn and symmetric functions,
and it is the tip of an iceberg. It is actually not hard to prove.
Theorem 9.6.2. For λ ⊢ n, we have

    pλ = Σ_{µ⊢n} τµ(Cλ) mµ,

where τµ(Cλ) means the character of the tabloid representation of shape µ on the conjugacy class Cλ of
cycle-shape λ, as in §8.10.

Proof. Let λ = (λ1, . . . , λℓ) and µ = (µ1, . . . , µk). We adopt the notation (8.19). As in Theorem 9.4.2, let
x^µ = ∏i xi^{µi}. We calculate the coefficient

    [x^µ]pλ = [x^µ] pλ1 · · · pλℓ = # of λ-factorizations x^µ = x_{c1}^{λ1} · · · x_{cℓ}^{λℓ},

that is, choices of one variable x_{ci} from each factor pλi whose product is x^µ.
Here we will represent each such choice by a tabloid T in which the factor x_{ci}^{λi} contributes the labels Li to
the ci-th row, so that T has shape µ. Thus the rows of T correspond to variables, while the entries correspond
to positions in the factorization (in contrast to the construction of Theorem 9.4.2).

For example, let λ = (2, 1, 1, 1) and µ = (3, 2). Then [x^µ]pλ = [x1³x2²]pλ = 4; the four λ-factorizations of
x^µ are (x1²)(x1)(x2)(x2), (x1²)(x2)(x1)(x2), (x1²)(x2)(x2)(x1), and (x2²)(x1)(x1)(x1), each with its
corresponding tabloid.

The tabloids that arise in this way are precisely those in which each interval Li is contained in a single row,
and these are precisely the tabloids fixed by the permutation given in cycle notation as

    w = (1 2 · · · λ[1])(λ[1]+1 · · · λ[2]) · · · (λ[ℓ−1]+1 · · · n),

whose cycle-shape is λ. (Compare equation (8.20). In the example above, w is the transposition (1 2).) In
particular, the number of such tabloids is by definition τµ(Cλ).
Corollary 9.6.3. {pλ } is a basis for the symmetric functions (although not an integral basis).
Proof. By Proposition 8.10.5, the transition matrix [τµ (Cλ )] from the monomial symmetric functions to the
power-sums is triangular, hence invertible (although not unitriangular).
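For n = 3, the theorem is visible in a Sage session: the coefficients below reproduce the columns of the
tabloid character table (8.18).

## Input
p = SymmetricFunctions(QQ).powersum()
m = SymmetricFunctions(QQ).monomial()
for lam in Partitions(3):
    print(m(p[lam]))
## Output
m[3]
m[2, 1] + m[3]
6*m[1, 1, 1] + 3*m[2, 1] + m[3]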
The definition of Schur symmetric functions is very different from the m’s, e’s, h’s and p’s. It is not even
clear at first that they are symmetric. But in fact the Schur functions turn out to be essential in the study
of symmetric functions and in several ways are the “best” basis for Λ.
Definition 9.7.1. A column-strict tableau T of shape λ, or λ-CST for short, is a labeling of the boxes
of the Ferrers diagram of λ with integers (not necessarily distinct) that is

• weakly increasing from left to right along each row, and
• strictly increasing from top to bottom down each column.

The partition λ is called the shape of T, and the set of all column-strict tableaux of shape λ is denoted
CST(λ). The content of a CST is the sequence α = (α1, α2, . . . ), where αi is the number of boxes labelled i,
and the weight of T is the monomial x^T = x^α = x1^{α1} x2^{α2} · · · (the same information as the content, but in
monomial form). For example, the first two of the following tableaux of shape (3, 2) are column-strict, while
the third is not (its first column repeats a 1):

    1 1 3      1 1 1      1 2 3
    2 3        4 8        1 4

The terminology is not entirely standardized; column-strict tableaux are often called “semistandard tableaux”
(as in, e.g., [Sta99]).
Definition 9.7.2. The Schur function corresponding to a partition λ is

    sλ = Σ_{T∈CST(λ)} x^T.

Example 9.7.3. Suppose that λ = (n) is the partition with a single part. Then a CST of shape λ is just a
weakly increasing sequence of n labels, so every monomial of degree n arises exactly once, and s(n) = hn.

At the other extreme, suppose that λ = (1, 1, . . . , 1) is the partition with n singleton parts, so that the
corresponding Ferrers diagram has a single column. To construct a CST of this shape, we need n distinct
labels, which can be arbitrary. Therefore

    s(1,1,...,1) = Σ_{0<i1<···<in} xi1 · · · xin = en. J
Example 9.7.4. Let λ = (2, 1). We will express sλ as a sum of the monomial symmetric functions
m3 , m21 , m111 .
First, no tableau of shape λ can have three equal entries, so the coefficient of m3 is 0.
Second, for weight xa xb xc with a < b < c, there are two possibilities, shown below.
    a b      a c
    c        b
Third, for every a ≠ b ∈ N>0, there is one tableau of shape λ and weight xa²xb: the one on the left if a < b,
or the one on the right if a > b.

    a a      b a
    b        a

Therefore sλ = 2m111 + m21. J
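A quick Sage confirmation of this expansion:

## Input
s = SymmetricFunctions(QQ).schur()
m = SymmetricFunctions(QQ).monomial()
m(s[2,1])
## Output
2*m[1, 1, 1] + m[2, 1]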
It should be evident at this point that the Schur functions are quasisymmetric, i.e., that for every monomial
x_{i1}^{a1} · · · x_{ik}^{ak} (where i1 < · · · < ik), its coefficient in sλ depends only on the ordered sequence (a1, . . . , ak). To
see this, observe that if j1 < · · · < jk, then replacing is with js for all s gives a bijection from λ-CSTs with
weight x_{i1}^{a1} · · · x_{ik}^{ak} to λ-CSTs with weight x_{j1}^{a1} · · · x_{jk}^{ak}.
In fact, the Schur functions are symmetric. Here is an elementary proof. It is enough to show that sλ is
invariant under exchanging xi and xi+1 for every i ∈ N>0, since those transpositions generate S∞. Let
T ∈ CST(λ) and consider the entries equal to i or i + 1, ignoring columns that contain both an i and an
i + 1. In any single row, the remaining set of such entries consists of some number a of i's followed by some
number b of (i + 1)'s. Replace them with b instances of i followed by a instances of i + 1; doing this in
every row gives a shape-preserving bijection between tableaux of weight · · · xi^a xi+1^b · · · and those of
weight · · · xi^b xi+1^a · · ·, as desired.
An important generalization of a Schur function involves a generalization of the underlying Ferrers diagram
of a tableau.
Definition 9.7.5. Let λ, µ be partitions with µ ⊆ λ, i.e., λi ≥ µi for all i. There is then an associated
skew partition or skew shape λ/µ, defined via its skew Ferrers diagram, in which the ith row has
boxes in columns µi + 1, . . . , λi . A skew tableau of shape λ/µ is a filling of the skew Ferrers diagram
with numbers.
Some examples of skew shapes are 4421/21, 4421/321, and 4421/322; note that disconnected skew shapes
are possible (for instance, the diagram of 4421/322 has its last-row box detached from the rest).
The notion of a column-strict tableau carries over without change to skew shapes. Here is a CST of shape
λ/µ, where λ = (8, 6, 6, 5, 4, 2) and µ = (5, 3, 3, 3, 2):

              2 2 3
        1 1 3
        2 3 4
        4 4
      2 6
    1 1
The definition of Schur functions (Definition 9.7.1) can also be adapted to skew shapes.
Definition 9.7.6. Let CST(λ/µ) denote the set of all column-strict skew tableaux of shape λ/µ, and as
before weight each tableau T ∈ CST(λ/µ) by the monomial x^T = ∏_i xi^{αi(T)}, where αi(T) is the number of
i's in T. The skew Schur function is then

    sλ/µ = Σ_{T∈CST(λ/µ)} x^T.
The elementary proof of symmetry of Schur functions carries over literally to skew Schur functions.
We are next going to establish a formula for the Schur function sλ as a determinant of a matrix whose entries
are hn's or en's (which also proves their symmetry). This takes more work, but the proof, based on the ideas
of Lindström, Gessel, and Viennot, is drop-dead gorgeous, and the formula has many other useful consequences
(such as what the involution ω does to Schur functions). This exposition follows closely that of [Sag01, §4.5].

[Figure 9.1: A lattice path P from (1, 0) to (6, ∞) with weight xP = x1x3²x4x6.]
Theorem 9.8.1. For any λ = (λ1, . . . , λℓ) we have

    sλ = det [hλi−i+j]i,j=1,...,ℓ        (9.8)

and

    sλ̃ = det [eλi−i+j]i,j=1,...,ℓ.        (9.9)

In particular, the Schur functions are symmetric.

For example, using the conventions h0 = 1 and hk = 0 for k < 0,

    s311 = det [ h3 h4 h5 ; h0 h1 h2 ; h−1 h0 h1 ] = det [ h3 h4 h5 ; 1 h1 h2 ; 0 1 h1 ] = h311 + h5 − h41 − h32.
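A quick Sage check of this identity, converting the right-hand side into the Schur basis:

## Input
s = SymmetricFunctions(QQ).schur()
h = SymmetricFunctions(QQ).homogeneous()
s(h[3,1,1] + h[5] - h[4,1] - h[3,2])
## Output
s[3, 1, 1]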
Proof. We prove (9.8) in detail, and then discuss how the proof can be modified to prove (9.9).

Step 1: Express hn as a generating function for lattice paths.

We will consider lattice paths P that start at some point on the x-axis in Z² and move north or east one unit
at a time. For every path that we consider, the number of eastward steps must be finite, but the number of
northward steps is infinite; thus the “ending point” is (x, ∞) for some x ∈ N. Label each eastward step e
of P by the number L(e) that is its y-coordinate plus one. The weight of P is the monomial xP = ∏_e x_{L(e)}.
An example is shown in Figure 9.1.
The monomial xP determines the path P up to horizontal shifting, and xP can be any monomial. Thus we
have a bijection, and it follows that for any a ∈ N,

    hn = Σ_{paths P from (a,0) to (a+n,∞)} xP = Σ_{paths P with fixed starting point and n east steps} xP.        (9.10)
Step 2: Express the generating function for families of lattice paths in terms of the hk's.

For a partition λ of length ℓ, a λ-path family P = (π, P1, . . . , Pℓ) consists of the following data:

• A permutation π ∈ Sℓ;
• Two sets of points U = {u1, . . . , uℓ} and V = {v1, . . . , vℓ}, defined by ui = (ℓ − i, 0) and
  vi = (λi + ℓ − i, ∞);
• Lattice paths P1, . . . , Pℓ, where Pi runs from u_{π(i)} to vi.

Figure 9.2 shows a λ-path family with λ = (3, 3, 2, 1) and π = 3124. (In general the paths in a family are
allowed to share edges, although that is not the case in this example.)
[Figure 9.2: A λ-path family with λ = (3, 3, 2, 1) and π = 3124.]
Note that for each i ∈ [ℓ], the number of east steps in the path Pi from u_{π(i)} to vi is

    (λi + ℓ − i) − (ℓ − π(i)) = λi − i + π(i).
Now the first miracle occurs: the signed generating function for path families is the determinant of a matrix
whose entries are complete homogeneous symmetric functions! One key observation is that any collection of
paths P1 , . . . , P` in which Pi contains λi − i + π(i) east steps gives rise to a λ-path family (π, P1 , . . . , P` ). In
other words, if we know what π is, then Pi can be any path with the appropriate number of east steps.
For a path family P = (π, P1, . . . , Pℓ), let xP = ∏_{i=1}^{ℓ} x_{Pi}, and let (−1)^P = ε(π) be the sign of π. Then:
    Σ_{P=(π,P1,...,Pℓ)} (−1)^P xP = Σ_{π∈Sℓ} ε(π) Σ_{λ-path families P=(π,P1,...,Pℓ)} x_{P1} · · · x_{Pℓ}
     = Σ_{π∈Sℓ} ε(π) ∏_{i=1}^{ℓ} ( Σ_{paths Pi with λi−i+π(i) east steps} x_{Pi} )        (by the key observation above)
     = Σ_{π∈Sℓ} ε(π) ∏_{i=1}^{ℓ} hλi−i+π(i)        (by (9.10))
     = det [hλi−i+j]i,j=1,...,ℓ.        (9.11)
Step 3: Cancel out the bad families.

Call a path family good if no two of its paths meet in a common vertex, and bad otherwise. Note that if P
is good, then π must be the identity permutation, and in particular (−1)^P = 1. We define an involution
P ↦ P♯ on bad path families as follows:

1. Of all the lattice points contained in two or more paths in P, choose the point α with the lex-greatest
pair of coordinates.
2. Of all the half-paths from α to some vi, choose the two with the largest i. Interchange them. Call the
resulting path family P♯.

An example is shown in Figure 9.3.
[Figure 9.3: A bad path family P and its image P♯ under the involution; α is the lex-greatest point of
intersection.]

Then:

• (P♯)♯ = P, and P♯ is again bad;
• x_{P♯} = xP (the multiset of labeled east steps is unchanged);
• (−1)^{P♯} = −(−1)^P (because the two permutations are related by a transposition).

Therefore, the bad summands in (9.11) cancel in pairs, leaving

    det [hλi−i+j]i,j=1,...,ℓ = Σ_{good λ-path families P} xP.        (9.12)

Step 4: Biject good path families to column-strict tableaux.
For each good path family, label the east steps of each path by height as before. The labels weakly increase
as we move north along each path. Moreover, for every j the jth east step of the path Pi occurs one unit
east of that of Pi+1 , so it must also occur strictly north of it (otherwise, the paths would cross). Therefore,
we can construct a column-strict tableau of shape λ by reading off the labels of each path, and this gives
a bijection between good λ-path families and column-strict tableaux of shape λ. An example is shown in
Figure 9.4.
[Figure 9.4: The bijection between good path families and column-strict tableaux.]
Consequently, (9.12) implies that det [hλi−i+j]i,j=1,...,ℓ = sλ, which is (9.8). Is that amazing or what?
The proof of (9.9) is similar. The key difference is that instead of labeling each east step with its height,
we number all the steps (north and east) consecutively, ignoring the first i − 1 steps of Pi (those below the
line y = x + ℓ − 1, which must all be northward anyway). The weight of a path is still the product of
the variables corresponding to its east steps. This provides a bijection between lattice paths with k east
steps and squarefree monomials of degree k, giving an analogue of (9.10) with hn replaced by en. Bad path
families cancel out by the same involution as before, and each good path family now gives rise to a tableau
of shape λ in which rows strictly increase but columns weakly increase (see Figure 9.5). Transposing gives
a column-strict tableau of shape λ̃, and (9.9) follows.
Corollary 9.8.2. For every partition λ, the involution ω interchanges sλ and sλ̃ .
Proof. We know that ω interchanges hλ and eλ , so it interchanges the RHS’s, hence the LHS’s, of (9.8) and
(9.9).
[Figure 9.5: The dual bijection between good path families and row-strict tableaux.]
The next step is to prove that the Schur functions are a basis for the symmetric functions. Now that we
know they are symmetric, they can be expressed in the monomial basis as

    sλ = Σ_{µ⊢n} Kλ,µ mµ.        (9.13)

Thus Kλ,µ is the number of column-strict tableaux T with shape λ and content µ. These are called the
Kostka numbers.

Theorem 9.8.3. The Schur functions {sλ : λ ⊢ n} are a Z-basis for ΛZ.
Theorem 9.8.3. The Schur functions {sλ : λ ` n} are a Z-basis for ΛZ .
Proof. Here comes one of those triangularity arguments. First, if λ = µ, then there is exactly one possibility
for T: fill the ith row full of i's. Therefore

    Kλ,λ = 1 for all λ ⊢ n.        (9.14)

Second, observe that if T is a CST of shape λ and content µ (so in particular Kλ,µ > 0), then λ ⊵ µ:
column-strictness forces every entry ≤ j to lie in the first j rows, so λ1 + · · · + λj ≥ µ1 + · · · + µj for every j.
Ordering Par(n) by any linear extension of dominance therefore makes the matrix [Kλ,µ] unitriangular, hence
invertible over Z, and as in Theorem 9.4.2 it follows that {sλ} is an integral basis.

The lattice-path proof of Theorem 9.8.1 generalizes to skew shapes (although I haven't yet figured out exactly
how) to give Jacobi-Trudi determinant formulas for skew Schur functions:

    sλ/µ = det [hλi−µj−i+j]i,j=1,...,ℓ,    sλ̃/µ̃ = det [eλi−µj−i+j]i,j=1,...,ℓ.        (9.15)
9.9 The Cauchy kernel and the Hall inner product

The next step in studying the ring of symmetric functions Λ will be to define an inner product structure on
it. This will come from considering the Cauchy kernel and the dual Cauchy kernel, which are formal
power series in two sets of variables x = {x1, x2, . . . }, y = {y1, y2, . . . }, defined as the following infinite
products:

    Ω = ∏_{i,j≥1} (1 − xiyj)⁻¹,    Ω* = ∏_{i,j≥1} (1 + xiyj).

The power series Ω and Ω* are well-defined because the coefficient of any monomial x^α y^β is the number
of ways of factoring it into monomials of the form xiyj, which is clearly finite (in particular it is zero unless
|α| = |β|). Moreover, they are evidently bisymmetric4, i.e., symmetric with respect to each of the variable
sets x = {x1, x2, . . . } and y = {y1, y2, . . . }. Thus we can write Ω and Ω* as power series in some basis for
Λ(x) and ask which elements of Λ(y) show up as coefficients.

For later use, we observe that Ω and Ω* can be viewed as generating functions for infinite matrices, as follows.
Let A = [aij : i, j ∈ N>0] be a matrix with countably infinitely many rows and columns, both indexed by
positive integers, such that all but finitely many values of aij are zero. Define wt(A) = ∏_{i,j} (xiyj)^{aij}. Then
expanding the geometric-series factors in the Cauchy kernel and dual Cauchy kernel gives

    Ω = Σ_{A: aij∈N} ∏_{i,j} (xiyj)^{aij},    Ω* = Σ_{A: aij∈{0,1}} ∏_{i,j} (xiyj)^{aij}.
For λ ⊢ n, let rj denote the number of parts of λ equal to j, and define

    zλ = ∏_{j≥1} j^{rj} rj!    and    ελ = (−1)^{n−ℓ(λ)}.

For example, if λ = (3, 3, 2, 1, 1, 1) then zλ = (1³ 3!)(2¹ 1!)(3² 2!) = 216 and ελ = −1.

Proposition 9.9.1. Let λ ⊢ n and let Cλ be the corresponding conjugacy class in Sn. Then |Cλ| = n!/zλ,
and ελ is the sign of each permutation in Cλ.
Theorem 9.9.2. We have

    Ω = Σ_λ hλ(x) mλ(y) = Σ_λ pλ(x) pλ(y)/zλ,        (9.17)
    Ω* = Σ_λ eλ(x) mλ(y) = Σ_λ ελ pλ(x) pλ(y)/zλ.        (9.18)

Proof. Recall from (9.2) that ∏_{i≥1} (1 − xit)⁻¹ = Σ_{k≥0} hk(x) t^k. Therefore

    ∏_{i,j≥1} (1 − xiyj)⁻¹ = ∏_{j≥1} [ ∏_{i≥1} (1 − xit)⁻¹ ]_{t=yj}
     = ∏_{j≥1} Σ_{k≥0} hk(x) yj^k
     = Σ_{α=(α1,α2,...)∈N^N: Σi αi<∞} y^α ∏_{i≥1} hαi(x)
     = Σ_α y^α hλ(α)(x)
     = Σ_λ hλ(x) mλ(y)

as desired. (Regard λ(α) as the partition whose parts are the nonzero αi, sorted in weakly decreasing order.)
For the second equality in (9.17), recall the standard power series expansions

    log(1 + q) = Σ_{n≥1} (−1)^{n+1} q^n/n,    log(1 − q) = −Σ_{n≥1} q^n/n,    exp(q) = Σ_{n≥0} q^n/n!.        (9.19)

These are formal power series that obey the rules you would expect; for instance, log(∏_i qi) = Σ_i log(qi)
and exp(log(q)) = q. (The proof of the second of these is left to the reader as Exercise 9.4.) In particular,
    log Ω = log ∏_{i,j≥1} (1 − xiyj)⁻¹ = −Σ_{i,j≥1} log(1 − xiyj)
     = Σ_{i,j≥1} Σ_{n≥1} xi^n yj^n / n        (by (9.19))
     = Σ_{n≥1} (1/n) (Σ_{i≥1} xi^n)(Σ_{j≥1} yj^n) = Σ_{n≥1} pn(x) pn(y)/n,
and now exponentiating both sides and applying the power series expansion for exp, we get

    Ω = exp( Σ_{n≥1} pn(x)pn(y)/n ) = Σ_{k≥0} (1/k!) ( Σ_{n≥1} pn(x)pn(y)/n )^k
     = Σ_{k≥0} (1/k!) Σ_{λ: ℓ(λ)=k} ( k!/(r1! r2! · · ·) ) ∏_{i=1}^{k} pλi(x)pλi(y)/λi
     = Σ_λ pλ(x) pλ(y)/zλ.

The proofs of the identities for the dual Cauchy kernel are analogous, and are left to the reader as Exercise 9.5.
As a first benefit, we can express the homogeneous and elementary symmetric functions in the power-sum
basis.

Corollary 9.9.3. For all n, we have:

(a) hn = Σ_{λ⊢n} pλ/zλ;
(b) en = Σ_{λ⊢n} ελ pλ/zλ;
(c) ω(pλ) = ελ pλ (where ω is the involution of §9.5).
Proof. For (a), set y1 = t and yk = 0 for all k > 1 in (9.17). This kills all terms on the left side for which λ
has more than one part, leaving only those where λ = (n), while on the right side pλ(y) specializes to t^{|λ|},
so we get

    Σ_{n≥0} hn(x) t^n = Σ_λ pλ(x) t^{|λ|}/zλ,

and extracting the coefficient of t^n gives (a). The same substitution in (9.18) gives (b).
(c) Let ω act on symmetric functions in x while fixing those in y. Using (9.17) and (9.18), we obtain

    Σ_λ pλ(x)pλ(y)/zλ = Σ_λ hλ(x) mλ(y) = ω( Σ_λ eλ(x) mλ(y) ) = ω( Σ_λ ελ pλ(x)pλ(y)/zλ )
     = Σ_λ ελ ω(pλ(x)) pλ(y)/zλ,

and equating the coefficients of pλ(y)/zλ yields the desired result.
Definition 9.9.4. The Hall inner product on symmetric functions is defined by declaring {hλ} and {mλ}
to be dual bases. That is, we define

    ⟨hλ, mµ⟩Λ = δλµ

and extend by linearity to all of Λ.
Thus the Cauchy kernel can be regarded as a generating function for pairs (hλ , mµ ), weighted by their inner
product. In fact it can be used more generally to compute Hall inner products:
Proposition 9.9.5. The Hall inner product has the following properties:

(a) If {uλ} and {vµ} are graded bases for Λ indexed by partitions, such that Ω = Σλ uλ(x) vλ(y), then
they are dual bases with respect to the Hall inner product; i.e., ⟨uλ, vµ⟩ = δλµ.

(b) In particular, {pλ} and {pλ/zλ} are dual bases, and {pλ/√zλ} is self-dual, i.e., orthonormal.

(c) ⟨·, ·⟩ is a genuine inner product (in the sense of being a nondegenerate bilinear form).

(d) The involution ω is an isometry with respect to the Hall inner product, i.e., ⟨f, g⟩ = ⟨ω(f), ω(g)⟩ for
all f, g ∈ Λ.
Proof. Assertion (a) is a matter of linear algebra, and is left to the reader (Exercise 9.1). Assertion (b)
follows from (a) together with (9.17), and (c) from the fact that ΛR admits an orthonormal basis. The
quickest proof of (d) uses the power-sum basis: by Corollary 9.9.3(c), we have

    ⟨ω(pλ), ω(pµ)⟩ = ελ εµ ⟨pλ, pµ⟩ = δλµ ελ εµ zλ = δλµ zλ = ⟨pλ, pµ⟩.
The orthonormal basis {pλ/√zλ} is not particularly nice from a combinatorial point of view, because it
involves irrational coefficients. It turns out that there is a better orthonormal basis: the Schur functions!
Our next goal will be to prove that

    Ω = ∏_{i,j≥1} 1/(1 − xiyj) = Σ_λ sλ(x) sλ(y),        (9.20)

which by Proposition 9.9.5(a) will show that the sλ form an orthonormal basis. The main tool is the RSK
correspondence, which we build up to via standard tableaux.
Recall from Example 1.2.4 that a standard [Young] tableau of shape λ is a filling of the Ferrers diagram
of λ with the numbers 1, 2, . . . , n that is increasing left-to-right and top-to-bottom. We write SYT(λ) for the
set of all standard tableaux of shape λ, and set f λ = |SYT(λ)| (this symbol f λ is traditional).
For example, if λ = (3, 3), then f λ = 5; the members of SYT(λ) are as follows.
1 3 5 1 3 4 1 2 5 1 2 4 1 2 3
2 4 6 2 5 6 3 4 6 3 5 6 4 5 6
The fundamental operation of the RSK algorithm is row-insertion of a number x into a tableau T, denoted
T ← x and defined as follows:

• If T = ∅, then T ← x is the single-box tableau containing x.
• If x ≥ u for all entries u in the top row of T , then append x to the end of the top row.
• Otherwise, find the leftmost entry u such that x < u. Replace u with x, and then insert u into the
subtableau consisting of the second and succeeding rows. In this case we say that x bumps u.
• Repeat until the bumping stops.
Example 9.10.2. Let w = 57214836 ∈ S8 . Start with a pair (P, Q) of empty tableaux.
Step 1: Row-insert w1 = 5 into P . We do this in the obvious way. Since it is the first cell added, we add a
cell containing 1 to Q.
P = 5 Q= 1 (9.21a)
Step 2: Row-insert w2 = 7 into P . Since 5 < 7, we can do this by appending the new cell to the top row,
and adding a cell labeled 2 to Q to record where we have put the new cell in P .
P = 5 7 Q= 1 2 (9.21b)
Step 3: Row-insert w3 = 2 into P . This is a bit trickier. We cannot just append a 2 to the first row of P ,
because the result would not be a standard tableau. The 2 has to go in the top left cell, but that already
contains a 5. Therefore, the 2 “bumps” the 5 out of the first row into a new second row. Again, we record
the location of the new cell by adding a cell labeled 3 to Q.
    P = 2 7      Q = 1 2        (9.21c)
        5            3
Step 4: Row-insert w4 = 1 into P . This time, the new 1 bumps the 2 out of the first row. The 2 has to go
into the second row, but again we cannot simply append it to the right. Instead, the 2 bumps the 5 out of
the second row into the (new) third row.
    P = 1 7      Q = 1 2        (9.21d)
        2            3
        5            4
Step 5: Row-insert w5 = 4 into P . The 4 bumps the 7 out of the first row. The 7, however, can comfortably
fit at the end of the second row, without any more bumping.
    P = 1 4      Q = 1 2        (9.21e)
        2 7          3 5
        5            4
Step 6: Row-insert w6 = 8 into P . The 8 just goes at the end of the first row.
    P = 1 4 8    Q = 1 2 6      (9.21f)
        2 7          3 5
        5            4
Step 7: Row-insert w7 = 3 into P. The 3 bumps the 4 out of the first row; the 4 bumps the 7 out of the
second row; and the 7 comes to rest at the end of the third row.

    P = 1 3 8    Q = 1 2 6      (9.21g)
        2 4          3 5
        5 7          4 7

Step 8: Row-insert w8 = 6 into P. The 6 bumps the 8 out of the first row, and the 8 comes to rest at the
end of the second row.

    P = 1 3 6    Q = 1 2 6      (9.21h)
        2 4 8        3 5 8
        5 7          4 7
J
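Sage implements exactly this row-insertion algorithm; the output below is the pair (P, Q) of (9.21h), printed
row by row:

## Input
RSK([5, 7, 2, 1, 4, 8, 3, 6])
## Output
[[[1, 3, 6], [2, 4, 8], [5, 7]], [[1, 2, 6], [3, 5, 8], [4, 7]]]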
A crucial feature of the RSK correspondence is that it can be reversed. That is, given a pair (P, Q), we can
recover the permutation that gave rise to it.
Example 9.10.3. Suppose that we are given the pair of tableaux in (9.21h). What was the previous step?
To get the previous Q, we just delete the 8. As for P , the last cell added must be the one containing 8.
This is in the second row, so somebody must have bumped 8 out of the first row. That somebody must be
the largest number less than 8, namely 6. So 6 must have been the number inserted at this stage, and the
previous pair of tableaux must have been those in (9.21g). J
Example 9.10.4. Let P be the standard tableau (with 19 boxes) shown on the left below. Suppose that we
know that the cell labeled 16 was the last one added (because the corresponding cell in Q contains a 19).
Then the “bumping path” must consist of the entries 10, 12, 13, 15, 16, one in each row. (That is, the 16
was bumped by the 15, which was bumped by the 13, and so on; each number in the bumping path is the
rightmost entry in its row that is less than the number in the path one row below.) The previous tableau in
the RSK algorithm can now be found by “unbumping”: push every number in the bumping path up one row
and toss out the top one (here 10), to obtain the tableau on the right.

    1  2  5  8 10 18          1  2  5  8 12 18
    3  4 11 12 19             3  4 11 13 19
    6  7 13                   6  7 15
    9 15 17                   9 16 17
   14 16                     14

Iterating this procedure allows us to recover w from the pair (P, Q). J
Example 9.10.7. The SYT's with n = 3 boxes are as follows:

    1 2 3      1 2      1 3      1
               3        2        2
                                 3

So f(3) = 1, f(2,1) = 2, f(1,1,1) = 1, and the sum of the squares of these numbers is 1 + 4 + 1 = 6 = 3!.
Similarly, for n = 4 one can check that

    f(4) = 1,  f(3,1) = 3,  f(2,2) = 2,  f(2,1,1) = 3,  f(1,1,1,1) = 1,

and the sum of the squares of these numbers is 24 = 4!. J

We have seen these numbers before — they are the dimensions of the irreps of S3 and S4, as calculated in
Examples 8.7.1 and 8.7.2. Hold that thought!
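Both computations are easy to automate in Sage; here is the n = 4 case:

## Input
[(lam, StandardTableaux(lam).cardinality()) for lam in Partitions(4)]
sum(StandardTableaux(lam).cardinality()^2 for lam in Partitions(4))
## Output
[([4], 1), ([3, 1], 3), ([2, 2], 2), ([2, 1, 1], 3), ([1, 1, 1, 1], 1)]
24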
Theorem 9.10.8. The RSK correspondence w ↦ (P, Q) is a bijection from Sn to the set of pairs of standard
tableaux of the same shape with n boxes. In particular, Σ_{λ⊢n} (f^λ)² = n!.

Theorem 9.10.9. If RSK(w) = (P, Q), then RSK(w⁻¹) = (Q, P).

The proof is in [Sta99, §7.13]; I hope to understand and write it up some day. It is certainly not obvious
from the standard RSK algorithm, where it looks like P and Q play inherently different roles. In fact, they
are more symmetric than they look. There are alternative descriptions of RSK from which the symmetry is
more apparent, also described in [Sta99, §7.13] and in [Ful97, §4.2].
The RSK correspondence can be extended to more general tableaux. This turns out to be the key to
expanding the Cauchy kernel in terms of Schur functions.
Definition 9.10.10. A generalized permutation of length n is a 2 × n array

    w = (q; p) = ( q1 q2 · · · qn ; p1 p2 · · · pn )        (9.22)

where q = (q1, . . . , qn), p = (p1, . . . , pn) ∈ N^n_{>0}, and the pairs (q1, p1), . . . , (qn, pn) are in lexicographic
order. (That is, q1 ≤ · · · ≤ qn, and if qi = qi+1 then pi ≤ pi+1.) The weight of w is the monomial
x^P y^Q = xp1 · · · xpn yq1 · · · yqn. The set of all generalized permutations will be denoted GP, and the set of all
generalized permutations of length n will be denoted GP(n).

If qi = i for all i and the pi's are pairwise distinct elements of [n], then w = (q; p) is just an ordinary
permutation in Sn, written in two-line notation.
The generalized RSK algorithm (gRSK) is defined in exactly the same way as original RSK, except
that the inputs are now allowed to be generalized permutations rather than ordinary permutations. At the
ith stage, we row-insert pi in the insertion tableau P and place qi in the recording tableau Q in the new cell
added.
Example 9.10.11. Consider the generalized permutation

    w = (q; p) = ( 1 1 2 4 4 4 5 5 5 ; 2 4 1 1 3 3 2 2 4 ) ∈ GP(9).

The result of the gRSK algorithm is as follows. The unlabeled tableau on the right records the step in which
each box was added.

    P = 1 1 2 2 4    Q = 1 1 4 4 5      1 2 5 6 9
        2 3 3            2 4 5          3 4 8
        4                5              7

J
The tableaux P, Q arising from gRSK will always have the same shape as each other, and will be weakly
increasing eastward and strictly increasing southward — that is, they will be column-strict tableaux, precisely
the things for which the Schur functions are generating functions. Column-strictness of P follows from the
definition of insertion. As for Q, it is enough to show that no label k appears more than once in the same
column. Indeed, all instances of k in q occur consecutively (say as qi , . . . , qj ), and the corresponding entries
of p are weakly increasing, so none of them will bump any other (in fact their bumping paths will not cross),
which means that each k appears to the east of all previous k’s.
This observation also suffices to show that the generalized permutation w can be recovered from the pair
(P, Q): the rightmost instance of the largest entry in Q must have been the last box added. Hence the
corresponding box of P can be “unbumped” to recover the previous P and thus the last column of w.
Iterating this process allows us to recover w. Therefore, generalized RSK gives a bijection
RSK
[
GP(n) −−−−→ {(P, Q) : P, Q ∈ CST(λ)} (9.23)
λ`n
q
maps to a pair of tableaux P, Q with weight monomials xP and yQ .
in which a generalized permutation p
On the other hand, a generalized permutation w = pq ∈ GP(n) is determined by the number of occurrences
of every column pqii . Therefore, it can be specified by an infinite matrix M = [mij ]i,j∈N>0 with finitely many
P
nonzero entries, in which mij is the number of occurrences of (qi , pj ) in w (so n = mij ). For example, the
generalized permutation w ∈ GP(9) of Example 9.10.11 corresponds to the integer matrix
0 1 0 1 0 ···
1 0 0 0 0 · · ·
0 0 0 0 0 · · ·
1 0 2 0 0 · · ·
.
0 2 0 1 0 · · ·
0 0 0 0 0 · · ·
.. .. .. .. .. . .
. . . . . .
Rewrite this and the previous discussion of the Cauchy kernel to get rid of the matrices; it’s
as easy to work with generalized permutations. Let M denote the set of all such Q matrices, so that
we have a bijection GP ↔ M. Under this bijection, the weight monomial xP yQ equals i,j (xi yj )mij . Any
given weight monomial arises from only finitely many matrices, so the generating function for matrices by
weights is a well-defined power series. In fact, it is the Cauchy kernel, because
∞
X Y Y X Y 1
(xi yj )mij = (xi yj )mij = = Ω.
1 − xi yj
M =[mij ]∈M i,j i,j≥1 mij =0 i,j≥1
189
On the other hand,
X Y
Ω = (xi yj )mij
M =[mij ] i,j
X X
= xp1 · · · xpn yq1 · · · yqn (by the bijection GP ↔ M)
n∈N (q)∈GP(n)
p
X X
= xP y Q (by gRSK)
λ P,Q ∈ CST(λ)
X X X
= xP yQ
λ P ∈CST(λ) Q∈CST(λ)
X
= sλ (x)sλ (y).
λ
In fact, the Schur functions are the only orthonormal, integral basis for Λ that is positive in the underlying
variables x.
Corollary 9.10.13. For every µ ∈ Par we have
X X
hµ = Kλµ sλ and eµ = Kλ̃µ sλ .
λ λ
for every λ. So the two blue expressions are equal. Applying ω to both sides (using Prop. 9.9.5(d)) we get
* +
X
hsλ , eµ i = ω(sλ ), ω(eµ ) = hsλ̃ , hµ i = Kλ̃,µ = sλ , Kλ̃µ sλ (9.25)
λ
For this section, we will work with polynomials rather than power series, for a reason that will quickly
become apparent.
190
The definition implies that every alternant is divisible by xj − xi for each i < j, hence by the Vandermonde
determinant
1 x1 x21 · · · xn−1 1
Y 1 x2 x22 · · · xn−1 2
(xj − xi ) = . .. .. .. .
.. . . .
1≤i<j≤n
1 xn x2n ··· xn−1
n
(Why does this equality hold? Interchanging xi with xj swaps two rows of the determinant, hence changes its
sign. Therefore the determinant
is divisible by the product on the left. On the other hand both polynomials
are homogeneous of degree n2 = 0 + 1 + · · · + (n − 1), and the coefficients of x01 x12 · · · xn−1
n are both ±1,
so equality must hold.) This is whyQwe are working with polynomials (since the infinite analogue of the
Vandermonde determinant, namely, i<j∈N xj xi , is not a well-defined power series).
More generally, we can construct an alternant by changing the powers of variables that occur in each column
of the Vandermonde determinant: for α = (α1 , . . . , αn ) ∈ Nn , we define
α n
X
aα = aα (x1 , . . . , xn ) = xi j i,j=1 = ε(w)w(xα ).
w∈Sn
Note that aα = 0 if (and only if) α contains some entry more than once. Moreover, the entries might as well
be listed in decreasing order. Therefore we can write α = λ + δ, where λ = (λ1 ≥ · · · ≥ λn ≥ 0) ∈ Par and
δ = (n − 1, n − 2, . . . , 1, 0), and addition is componentwise: αj = λj + δj = λj + n − j. That is,
n
λ +n−j
aλ+δ = xi j .
i,j=1
Proof. In light of the second assertion of Corollary 9.10.13 and the invertibility of the matrix [Kλµ ], it is
equivalent to show that for every µ = (µ1 , . . . , µk ) we have
X
eµ = Kλ̃µ aλ+δ /aδ
λ
or equivalently X
aδ eµ = Kλ̃µ aλ+δ .
λ
Both sides of the equation are alternating, so it is enough to show that for every λ, the monomial xλ+δ has the
same coefficient on both sides of this equation. On the RHS this coefficient is Kλ̃µ since the monomial only
appears in the λ summand. On the LHS, the coefficient [xλ+δ ]aδ eµ is the sum of ε(w) over all factorizations
1 k 1 k
xλ+δ = w(xδ ) xβ · · · xβ = x0w(1) x1w(2) · · · xn−1
w(n) x
β
· · · xβ .
i
where each xβ is a squarefree monomial of degree µi . Denote such a factorization by f (w, β) = f (w, β 1 , . . . , β k ),
and denote by F the set of all such factorizations. Thus we are trying to prove that
X
ε(w) = Kλ̃µ . (9.26)
f (w,β)∈F
191
1 j
Let f (w, β)j denote the partial product w(xδ )xβ · · · xβ . For a monomial M , let powxi (M ) denote the
power of xi that appears in M .
We now describe a sign-reversing involution on the set F . Suppose that f (w, β) is a factorization such that
for some j ∈ [k] and some a 6= b
powa (f (w, β)j ) = powb (f (w, β)j ).
i
Choose (j, {a, b}) to be lexicographically minimal). Then interchanging xa and xb in every xβ and multi-
plying w by the transposition (a b) produces another element of F and preserves the equality condition and
the pair (j, {a, b}), while flipping the sign of w.
For example, let n = 3, λ = (2, 2, 1), α = (4, 3, 1), µ = (2, 2, 1). There are eight factorizations, including
three cancelling pairs:
1 2 3
w ε(w) w(xδ ) xβ xβ xβ j, {a, b}
123 1 x21 x2 x1 x2 x1 x2 x3 −
123 1 x21 x2 x1 x2 x1 x3 x2 −
123 1 x21 x2 x1 x3 x1 x2 x2 1, {2, 3}
132 −1 x21 x3 x1 x2 x1 x2 x2 1, {2, 3}
123 1 x21 x2 x1 x2 x2 x3 x1 2, {1, 2}
213 −1 x22 x1 x1 x2 x1 x3 x1 2, {1, 2}
123 1 x21 x2 x2 x3 x1 x2 x1 1, {1, 2}
213 −1 x22 x1 x1 x3 x1 x2 x1 1, {1, 2}
The uncanceled factorizations f (w, β) are those for which every partial factorization involves distinct powers
of variables. It follows that w = Id. Otherwise, there is some pair a, b for which
powa (xδ ) = powa (f (w, β)0 ) = powb (f (w, β)0 ) = powb (xδ ) but
powa (xδ+λ ) = powa (f (w, β)k ) = powb (f (w, β)k ) = powb (xδ+λ )
i
but since the xβ are all squarefree, there must be some j such that
(by the intermediate value theorem). Therefore, the coefficient [xλ+δ ]aδ eµ we are looking for is the number
1 k
of factorizations of xλ into squarefree monomials xβ , . . . , xβ of degrees µ1 , . . . , µk so that for all j ≤ k we
have 1 j 1 j 1 j
pow1 (xβ · · · xβ ) ≥ pow2 (xβ · · · xβ ) ≥ · · · ≥ pown (xβ · · · xβ ). (9.27)
i
Thus each variable xj must show up in λi of the monomials xβ . We record the list of monomials by a
i
tableau of content µ whose entries correspond to monomials xβ and whose columns correspond to variables
xj : column j contains an i if xj occurs in xαi . Thus the tableau will have shape λ̃, and we can arrange each
column in increasing order. There are no repeats in columns because no variable occurs more than once in
i
any xβ . Moreover, if some row has a strict decrease a > b between the jth and (j + 1)st columns, this means
that there are more xj+1 ’s then xj in the first b monomials, which contradicts (9.27). Hence the tableau is
column-strict. Moreover, ever column-strict tableau of shape λ̃ and content µ gives rise to a factorization
that contributes 1 to the coefficient [xλ+δ ]aδ eµ . We conclude that the coefficient is Kλ̃µ as desired.
192
9.12 The Frobenius characteristic
As in Section 8.6, denote by C`(Sn ) the vector space of C-valued class functions on the symmetric group Sn ;
also, let C`(S0 ) = C. Define a graded vector space
M
C`(S) = C`(Sn )
n≥0
We now want to make C`(S) into a graded ring. To start, we declare that the elements of C`(S0 ) behave
like scalars. For n1 , n2 ∈ N>0 and fi ∈ C`(Sni ), we would like to define a product f1 f2 ∈ C`(Sn ), where
n = n1 + n2 . First, define a function f1 × f2 : Sn1 × Sn2 → C by
this is a class function because the conjugacy classes in G × H are just the Cartesian products of conjugacy
classes in G with those in H (this is a general fact about products of groups). The next step is to lift to
Sn . Identify Sn1 × Sn2 with the Young subgroup Sn1 ,n2 ⊆ Sn fixing each of the sets {1, 2, . . . , n1 } and
{n1 + 1, n1 + 2, . . . , n1 + n2 }. (See (8.19).) We now define the product f1 f2 ∈ C`(Sn ) by the formula for
induced characters (Proposition 8.9.4):
1 X
f1 f2 = IndSn
Sn (f1 × f2 ) = (f1 × f2 )(g −1 wg).
1 ,n2 n1 ! n2 !
g∈Sn :
g −1 wg∈Sn1 ,n2
There is no guarantee that f1 f2 is a character of Sn (unless f1 and f2 are characters), but at least this
operation is a well-defined map on class functions, and it makes C`(S) into a commutative graded C-
algebra. (It is pretty clearly bilinear and commutative; it is nontrivial but not hard to check that it is
associative.)
For a partition λ ` n, let 1λ ∈ C`(Sn ) be the indicator function on the conjugacy class Cλ ⊆ Sn , and let
For a permutation w ∈ Sn , let λ(w) denote the cycle-shape of w (so λ(w) is a partition). Define a function
ψ : Sn → Λn by
ψ(w) = pλ(w) . (9.28)
Note that ψ is a class function (albeit with values in Λ rather than in C).
Definition 9.12.1. The Frobenius characteristic is the map
ch : C`C (S) → ΛC
defined on f ∈ C`(Sn ) by
1 X X pλ
ch(f ) = hf, ψiSn = f (w) pλ(w) = f (Cλ )
n! zλ
w∈Sn λ`n
1. ch(1λ ) = pλ /zλ .
193
2. ch is an isometry, i.e., it preserves inner products:
3. ch is a ring isomorphism.
4. ch(IndSSλ χtriv ) = ch(τλ ) = hλ .
n
Sn
5. ch(IndSλ χsign ) = eλ .
6. Let χ be any character of Sn and let χsign be the sign character on Sn . Then ch(χ⊗χsign ) = ω(ch(χ)),
where ω is the involution of 9.5.
7. ch restricts to an isomorphism C`V (S) → ΛZ , where C`V (S) is the Z-module generated by irreducible
characters (i.e., the space of virtual characters).
8. The irreducible characters of Sn are {ch−1 (sλ ) : λ ` n}.
Proof. (1): Immediate from the definition. It follows that ch is (at least) a graded C-vector space isomor-
phism, since {1λ : λ ` n} and {pλ /zλ : λ ` n} are C-bases for C`(Sn ) and Λn respectively.
where the penultimate equality is (9.17) (from expanding the Cauchy kernel in the power-sum bases).
(3): Let n = j + k and let f ∈ C`(S[j] ) and g ∈ C`(S[j+1,n] ) (so that elements of these two groups commute,
and the cycle-type of a product is just the multiset union of the cycle-types). Then:
D E
ch(f g) = IndS n
Sj ×Sk (f × g), ψ (where ψ is defined as in (9.28))
S
D E n
= f × g, ResS Sj ×Sk ψ
n
(by Frobenius reciprocity)
Sj ×Sk
1 X
= f × g(w, x) · pλ(wx)
j! k!
(w,x)∈Sj ×Sk
!
1 X 1 X
= f (w) pλ(w) g(x) pλ(x) (because the power-sum basis is multiplicative)
j! k!
w∈Sj x∈Sk
= ch(f ) ch(g).
(4), (5): Denote by χntriv and χnsign the trivial and sign characters on Sn . We calculate in parallel:
1 X 1 X
= pλ(w) = ελ(w) pλ(w) (by def’n of ψ and h·, ·iSn )
n! n!
w∈Sn w∈Sn
X |Cλ | X |Cλ |
= pλ = ελ pλ
n! n!
λ`n λ`n
X pλ X pλ
= = ελ
zλ zλ
λ`n λ`n
= hn = en (by Corollary 9.9.3).
194
Now
` ` `
!
Y Y Y
hλ = hλi = ch(χλtriv
i
) = ch χλtriv
i
= ch(IndS n
Sλ χtriv )
n
(7), (8): Each of (4) and (5) says that ch−1 (ΛZ ) is contained in the space of virtual characters, because {hλ }
and {eλ } are Z-module bases for ΛZ , and their inverse images under ch are genuine characters. On the other
hand, {sλ } is also a Z-basis, so each σλ := ch−1 (sλ ) is a character. Moreover, since ch is an isometry we
have
hσλ , σµ iSn = hsλ , sµ iΛ = δλµ
which must mean that {σλ : λ ` n} is a Z-basis for C`V (Sn ), and that each σλ is either an irreducible
character or its negative. Thus, up to sign changes and permutations, the class functions σλ are just the
characters χλ of the Specht modules indexed by λ (see §8.10). That is, σλ = ±χπ(λ) , where π is a permutation
of Par preserving size.
In fact, we claim that σλ = χλ for all λ. First, we confirm that the signs are positive. We can write each
Schur function as X pµ
sλ = bλ,µ (9.29)
zµ
µ`n
−1
for some integers bλ,µ . Applying ch gives
X
σλ = bλ,µ 1µ , (9.30)
µ`n
so that bλ,µ = ±χπ(λ) (Cµ ). In particular, taking µ = (1n ), the cycle-shape of the identity permutation, we
have
bλ,(1n ) = ± dim χπ(λ) . (9.31)
On the other hand, the only power-sum symmetric function that contains the squarefree monomial x1 x2 · · · xn
is p(1n ) (with coefficient z(1n ) = n!). Extracting the coefficients of that monomial on both sides of (9.29)
gives
f λ = bλ,µ . (9.32)
In particular, comparing (9.31) and (9.32), we see that the sign ± is positive for every λ. (We also have a
strong hint that π is the identity permutation, because dim χπ(λ) = f λ .)
195
Proof. We calculate the multiplicity of each irrep in the tabloid representation using characters:
The Frobenius characteristic allows us to translate back and forth between symmetric functions and charac-
ters of symmetric groups. In particular, many questions about representations of Sn can now be answered
in terms of tableau combinatorics. Here are a few fundamental things we would like to know at this point.
1. Irreducible characters. What is the value of the irreducible character χλ = ch−1 (sλ ) on the conjugacy
class Cµ ? In other words, what is the character table of Sn ? We have worked out some examples (e.g.,
n = 3, n = 4) and know that the values are all integers, since the Schur functions are an integral basis for
Λn . A precise combinatorial formula is given by the Murnaghan-Nakayama Rule.
2. Dimensions of irreducible characters. A special case of the Murnaghan-Nakayama Rule is that the
irreducible representation with character χλ has dimension f λ , the number of standard tableaux of shape λ.
What are the numbers f λ ? There is a beautiful interpretation called the hook-length formula of Frame,
Robinson and Thrall, which again has many, many proofs in the literature.
3. Littlewood-Richardson numbers. Now that we know how important the Schur functions are from
a representation-theoretic standpoint, how do we multiply them? That is, suppose that µ, ν are partitions
with |µ| = q, |ν| = r. Then sµ sν ∈ Λq+r , so it has a unique expansion as a linear combination of Schur
functions: X
sµ sν = cλµ,ν sλ , cλµ,ν ∈ Z. (9.33)
λ
The cλµ,ν ∈ Z are called the Littlewood-Richardson numbers. They are the structure coefficients for Λ,
regarded as an algebra generated as a vector space by the Schur functions. The cλµ,ν must be integers, because
sµ sν is certainly a Z-linear combination of the monomial symmetric functions, and the Schur functions are
a Z-basis.
cλµ,ν = IndS
Sq ×Sr (χµ ⊗ χν ), χλ
n
Sn
= χµ ⊗ χν , ResS
Sq ×Sr (χλ )
n
Sq ×Sr
Any combinatorial interpretation for the numbers cλµν is called a Littlewood-Richardson rule; there are
many of them.
196
4. Transition matrices. What are the coefficients of the transition matrices between different bases of
Λn ? We have worked out a few cases using the Cauchy kernel, and we have defined the Kostka numbers to
be the transition coefficients from the m’s to the s’s (this is just the definition of the Schur functions).
We know from Theorem 9.12.2 that the irreducible characters of Sn are χλ = ch−1 (sλ ) for λ ` n. We want
to compute these numbers. Via the Frobenius characteristic, this problem boils down to finding relations
between the Schur functions and the power-sums. It turns out that the key is to express a product sν pr as
a linear combination of Schur functions (equation (9.35)).
We first state the result, then prove it. The relevant combinatorial objects are ribbons and ribbon tableau.
A ribbon (or border strip or rim hook) is a connected skew shape R with no 2×2 block. (Here “connected”
means “connected with respect to sharing edges, not just diagonals”; for example, the skew shape 21/1 =
is not considered to be connected. The height h(R) is the number of rows in the ribbon.
For each i, the cells labeled i form a ribbon. The sorted list µ of sizes of the ribbons is the content of the
ribbon tableau; here µ = (7, 7, 6, 4, 4, 3). Note that the size of Ri need not weakly decrease as i increases. Let
RT (λ, µ) denote the set of ribbon tableaux of shape λ and content µ, and for T = (R1 , . . . , Rk ) ∈ RT (λ, µ)
put
Yk
(−1)T = (−1)1+ht(Ri ) .
i=1
For example, the heights of R1 , . . . , R6 in the ribbon tableau T shown above are 4,3,3,3,2,4. There are an
odd number of even heights, so (−1)T = −1.
Proof. For the first assertion, we have λi ≤ λi−1 = νi−1 ; on the other hand, λi is obtained by adding at least
one box to νi . (In particular this interval cannot be empty — it is possible to add at least one box in the
ith row of ν without changing the (i − 1)st row, so it must be the case that νi−1 > νi .)
The second assertion asserts that the last box in the kth row of λ must be one column east and one column
south of the last box in the (k − 1)st row of ν. Indeed, any further west and R would not be connected; any
further east and it would have a 2 × 2 block.
197
Now we can state and prove the main result.
Theorem 9.14.2 (Murnaghan-Nakayama Rule (1937)). For all λ, µ ` n, the value of the irreducible char-
acter χλ on the conjugacy class Cµ is
X
χλ (Cµ ) = (−1)T .
T ∈RT (λ,µ)
where j is the sequence with a 1 in position j and 0s elsewhere. If two entries of α + rj are equal, then
the corresponding alternant is zero. Otherwise, there is some i < j such that (α + rj )j lies strictly between
αi−1 and αi , i.e.,
νi−1 + n − (i − 1) > νj + n − j + r > νi + n − i.
Therefore, sorting the parts α in descending order means moving the j th part down to position i, which
means permuting by a (j − i + 1)-cycle, which has sign (−1)j−i . That is, aα+rj = (−1)j−i aλ+δ , where
But by Lemma 9.14.1, these partitions λ are precisely the ones for which λ/ν is a ribbon with r squares
spanning rows i, . . . , j — isn’t that convenient? Combining this observation with (9.34) we get
n
X
aν+δ pr = aα pr = aα+rj
j=1
X
= (−1)ht(R)+1 aλ+δ
R,λ
where the sum runs over ribbons R of length r that can be added to ν to obtain a partition λ. Dividing
both sides by aδ and applying Theorem 9.11.2 gives
X
sν pr = (−1)ht(R)+1 sλ . (9.35)
R,λ
(This is valid on the level of power series as well as for polynomials, since it remains valid under increasing
the number of variables, so the coefficient of every monomial in the power series is equal on both sides.)
X k
Y
sν pµ = (−1)ht(Ri )+1 sλ (9.36)
R1 ,...,Rk ,λ i=1
where the sum runs over k-tuples of ribbons of lengths given by the parts of µ (in some order) that can be
added to ν to obtain λ. In particular, if ν = ∅, then this is simply the statement that T = (R1 , . . . , Rk ) is a
ribbon tableau of shape λ and content µ, and the sign is (−1)T , so we get
X X
pµ = (−1)T sλ (9.37)
λ T ∈RT (λ,µ)
198
so that
(−1)T = hpµ , sλ iΛ = hch−1 (pµ ), ch−1 (sλ )iSn (since ch−1 is an isometry)
X
T ∈RT (λ,µ)
As a first consequence, we can expand the Schur functions in the power-sum basis:
Corollary 9.14.3. For all λ ` n we have
χλ (w) pµ χλ (w) εµ pµ .
X X
sλ = and sλ̃ =
µ
zµ µ
zµ
P
Proof. Write sλ in the p-basis as µ bλµ pµ . Taking the Hall inner product of both sides with pµ gives
hsλ , pµ i = bλµ zµ , or bλµ = zµ−1 hsλ , pµ i, implying the first equality. Applying ω and invoking Corollaries 9.8.2
and 9.9.3(c) gives the second equality.
An important special case of the Murnaghan-Nakayama rule is when µ = (1, 1, . . . , 1), since then χλ (Cµ ) =
χλ (IdSn ), is just the dimension of the irreducible character χλ . On the other hand, a ribbon tableau of
content µ is just a standard tableau. So the Murnaghan-Nakayama Rule implies the following:
Corollary 9.14.4. dim χλ = f λ , the number of standard tableaux of shape λ.
Let λ ` n, let ` = `(λ), and let SYT(λ) the set of standard tableaux of shape λ, so f λ = |SYT(λ)|. In what
follows, we label the rows and columns of a tableau starting at 1. If c = (i, j) is the cell in the ith row and
jth column of a tableau T , then T (c) or T (i, j) denotes the entry in that cell.
The hook H(c) defined by a cell c = (i, j) consists of itself together with all the cells due east or due south
of it. The number of cells in the hook is the hook length, written h(c) or h(i, j). (In this section, the letter
h always refers to hook lengths, never to the complete homogeneous symmetric function.) In the following
example, h(c) = h(2, 3) = 6.
199
1 2 3 4 5
1
2 c
3
4
5
6
9 7 6 3 1
7 5 4 1
5 3 2
4 2 1
1
Before getting started, here is how not to prove the hook-length formula. Consider the discrete probability
space of all n! fillings of the Ferrers diagram of λ with the numbers 1, . . . , n. Let S be the event that a
uniformly chosen filling T is a standard tableau,
T and for each cell, let Xc be the event that T (c) is the
smallest Q
number in the hook H(c). Then S = c Xc , and Pr[Xc ] = 1/h(c). We would like to conclude that
Pr[S] = c 1/h(c), which would imply the hook-length formula. However, that inference would require that
the events Xc are mutually independent, which they certainly are not! Still, this is a nice heuristic argument
(attributed by Wikipedia to Knuth) that one can at least remember.
There are many proofs of the hook-length formula in the literature. This one is due to Greene, Nijenhuis
and Wilf [GNW79].
Proof of Theorem 9.15.1. First, observe that for every T ∈ SYT(λ), the cell c ∈ T containing the number
n = |λ| must be a corner of λ (i.e., the rightmost cell in its row and the bottom cell in its column). Deleting c
produces a standard tableau of size n − 1; we will call the resulting partition λ − c. This construction gives
a collection of bijections
{T ∈ SYT(λ) : T (c) = n} → SYT(λ − c)
for each corner c.
200
We will prove by induction on n that f λ = F (λ). The base case n = 1 is clear. For the inductive step, we
wish to show that
X X F (λ − c)
F (λ) = F (λ − c) or equivalently =1 (9.38)
corners c corners c
F (λ)
since by the inductive hypothesis together with the bijections just described, the right-hand side of this
equation equals f λ .
Let c = (x, y) be a corner cell. Removing c decreases by 1 the sizes of the hooks H(c0 ) for cells c0 strictly
north or west of c, and leaves all other hook sizes unchanged. Therefore,
x−1 y−1
F (λ − c) (n − 1)! Y h(i, y) Y h(x, j)
=
F (λ) n! i=1
h(i, y) − 1 j=1
h(x, j) − 1
x−1 y−1
Y
1 Y 1 1
= 1+ 1+
n i=1 h(i, y) − 1 j=1 h(x, j) − 1
!
1 X Y 1 Y 1
= . (9.39)
n h(i, y) − 1 h(x, j) − 1
A⊆[x−1] i∈A j∈B
B⊆[y−1]
Consider the following random process (called a hook walk). First choose a cell (a0 , b0 ) uniformly from λ.
Then for each t = 1, 2, . . . , move to a cell (at , bt ) chosen uniformly from all other cells in H(at−1 , bt−1 ). The
process
P stops when it reaches a corner; let pc be the probability of reaching a particular corner c. Evidently
c pc = 1. Our goal now becomes to show that
F (λ − c)
pc = (9.40)
F (λ)
Consider a hook walk starting at (a, b) = (a1 , b1 ) and ending at (am , bm ) = (x, y). Let A = {a1 , . . . , am }
and B = {b1 , . . . , bm } be the sets of rows and columns encountered (removing duplicates); call these sets the
horizontal and vertical projections of W . Let
p(A, B a, b)
denote the probability that a hook walk starting at (a, b) has projections A and B. We claim that
Y 1 Y 1
p(A, B a, b) = . (9.41)
h(i, y) − 1 h(x, j) − 1
i∈A\x j∈B\y
| {z }
Φ
We prove this by induction on m. If m = 1, then either A = {a} = {x} and B = {b} = {y}, and the equation
201
reduces to 1 = 1 (the RHS is the empty product), or else it reduces to 0 = 0. If m > 1, then
p(A \ a1 , B a2 , b1 ) p(A, B \ b1 a1 , b2 )
p(A, B a, b) = +
h(a, b) − 1 h(a, b) − 1
| {z } | {z }
first move south to (a2 , b1 ) first move east to (a1 , b2 )
1
= (h(a, y) − 1)Φ + (h(x, b) − 1)Φ (by induction)
h(a, b) − 1
h(a, y) − 1 + h(x, b) − 1
= Φ. (9.42)
h(a, b) − 1
To see that the parenthesized expression in (9.42) is 1, consider the following diagram, with the hooks at
(a, y) and (x, b) shaded in red and blue respectively, with the corner (x, y) omitted so that there are a total
of h(a, y) − 1 + h(x, b) − 1 shaded cells. Pushing some red cells north and some blue cells to the left produces
the hook at (a, b) with one cell omitted, as on the right.
(a, b)
(x, b)
(a, y)
(x, y)
This proves (9.41). Now we compute pc , the probability that a walk ends at a particular corner c = (x, y).
Equivalently, x ∈ A and y ∈ B; equivalently, A ⊆ [x] and B ⊆ [y]. Therefore, summing over all possible
starting positions, we have
1 X
pc = p(A, B a, b)
n
(A,B,a,b):
A⊆[x], B⊆[y]
a=min A, b=min B
x=max A, y=max B
1 X Y 1 Y 1
= (by (9.41))
n h(i, y) − 1 h(x, j) − 1
(A,B,a,b) i∈A\x j∈B\y
as above
!
1 X Y 1 Y 1
=
n h(i, y) − 1 h(x, j) − 1
A⊆[x−1] i∈A j∈B
B⊆[y−1]
which is precisely (9.39). This establishes that pc = F (λ − c)/F (λ) (9.40) and completes the proof.
UNDER CONSTRUCTION
202
Recall that the Littlewood-Richardson coefficients cλµν are the structure coefficients for Λ as an algebra with
vector space basis {sλ : λ ∈ Par}: that is,
X
sµ sν = cλµν sλ .
λ
Proof. Consider column-strict tableaux of shape λ with labels taken from the alphabet 1 < 2 < · · · < 10 <
20 < · · · , and let the weight of such a tableau T be xα yβ , where αi (resp., βi ) is the number of cells filled
with i (resp., i0 ). Then the left-hand side is the generating function for all schools tableaux by weight. On
the other hand, such a tableau consists of a CST of shape µ filled with 1, 2, . . . (for some µ ⊆ λ) together
with a CST of shape λ/µ filled with 10 , 20 , . . . , so the RHS enumerates the same set of tableaux.
Theorem 9.16.2. For all partitions λ, µ, ν, we have
c̃λ/µ,ν = cλµ,ν = cλν,µ .
Equivalently,
hsµ sν , sλ iΛ = hsν , sλ/µ iΛ .
Proof. We need three countably infinite sets of variables x, y, z for this. Consider the “double Cauchy kernel”
Y Y
Ω(x, z)Ω(y, z) = (1 − xi zj )−1 (1 − yi zj )−1 .
i,j i,j
On the one hand, expanding both factors in terms of Schur functions and then applying the definition of the
Littlewood-Richardson coefficients to the z terms gives
! !
X X X
Ω(x, z)Ω(y, z) = sµ (x)sµ (z) sν (y)sν (z) = sµ (x)sν (y)sµ (z)sν (z)
µ ν µ,ν
X X
= sµ (x)sν (y) cλµ,ν sλ (z). (9.43)
µ,ν λ
203
(The first equality is perhaps clearer in reverse; think about how to express the right-hand side as an infinite
product over the variable sets x ∪ y and z. The second equality uses Proposition 9.16.1.) Now the theorem
follows from the equality of (9.43) and (9.44).
There are a lot of combinatorial interpretations of the Littlewood-Richardson numbers. Here is one. A
ballot sequence (or Yamanouchi word, or lattice permutation) is a sequence of positive integers such
that each initial subsequence contains at least as many 1’s as 2’s, at least as many 2’s as 3’s, et cetera.
Theorem 9.16.3 (Littlewood-Richardson Rule). cλµ,ν equals the number of column-strict tableaux T of shape
λ/µ, and content ν such that the word obtained by reading the entries of T row by row, right to left, top to
bottom, is a ballot sequence.
Include a proof. There are a lot of them but they tend to be hard.
Important special cases are the Pieri rules, which describe how to multiply by the Schur function corre-
sponding to a single row or column (i.e., by an h or an e.)
Theorem 9.16.4 (Pieri Rules). Let (k) denote the partition with a single row of length k, and let (1k )
denote the partition with a single column of length k. Then
X
sµ s(k) = sµ hk = sλ
λ
where λ ranges over all partitions obtained from µ by adding k boxes, no more than one in each column; and
X
sµ s(1k ) = sµ ek = sλ
λ
where λ ranges over all partitions obtained from µ by adding k boxes, no more than one in each row.
where λ ranges over all partitions obtained from µ by adding a single box. Via the Frobenius characteristic,
this gives a “branching rule” for how the restriction of an irreducible character of Sn splits into a sum of
irreducibles when restricted:
ResSS
n
(χλ ) = ⊕µ χµ
n−1
where now µ ranges over all partitions obtained from λ by deleting a single box. Details?
Definition 9.17.1. Let b, b0 be finite ordered lists of positive integers (or “words in the alphabet N>0 ”).
We say that b, b0 are Knuth equivalent, written b ∼ b0 , if one can be obtained from the other by a
sequence of transpositions as follows:
(Here the notation · · · xzy · · · means a word that contains the letters x, z, y consecutively.)
204
For example, 21221312 ∼ 21223112 by Rule 1, and 21223112 ∼ 21221312 by Rule 2 (applied in reverse).
This definition looks completely unmotivated at first, but hold that thought!
We now define an equivalence relation on column-strict skew tableaux, called jeu de taquin7 . The rule is
as follows:
• y x≤y
−−−−→ x y • y x>y
−−−−→ y •
x • x x
That is, for each inner corner of T — that is, an empty cell that has numbers to the south and east, say x
and y — then we can either slide x north into the empty cell (if x ≤ y) or slide y west into the empty cell
(if x > y). It is not hard to see that any such slide (hence, any sequence of slides) preserves the property of
column-strictness.
For example, the following is a sequence of jeu de taquin moves. The bullets • denote the inner corner that
is being slid into.
• 1 4 → 1 1 4 → 1 1 4 → 1 1 4 → 1 1 4
1 2 • 2 2 • • 2 4 2 2 4
2 3 4 2 3 4 2 3 4 2 3 • 3
(9.45)
→ • 1 1 4 → 1 • 1 4 → 1 1 • 4 → 1 1 4 4
2 2 4 2 2 4 2 2 4 2 2
3 3 3 3
If two skew tableaux T, T 0 can be obtained from each other by such slides (or by their reverses), we say that
they are jeu de taquin equivalent, denoted T ≈ T 0 . Note that any skew column-strict tableau T is jeu de
taquin equivalent to an ordinary CST (called the rectification of T ); see, e.g., the example (9.45) above.
In fact, the rectification is unique; the order in which we choose inner corners does not matter.
Definition 9.17.2. Let T be a column-strict skew tableau. The row-reading word of T , denoted row(T ),
is obtained by reading the rows left to right, bottom to top.
For example, the reading words of the skew tableaux in (9.45) are
If T is an ordinary (not skew) tableau, then it is determined by its row-reading word, since the “line breaks”
occur exactly at the strict decreases of row(T ). For skew tableaux, this is not the case. Note that some of
the slides in (9.45) do not change the row reading word; as a simpler example, the following skew tableaux
both have reading word 122:
1 2 2 2 2 2
1 2 1
On the other hand, it’s not hard to se that rectifying the second or third tableau will yield the first; therefore,
they are all jeu de taquin equivalent.
For a word b on the alphabet N>0 , let P(b) denote its insertion tableau under the RSK algorithm. (That is,
construct a generalized permutation bq in which q is any word; run RSK; and remember only the tableau
P , so that the choice of q does not matter.)
7 French for “sliding game”, roughly; it refers to the 15-square puzzle with sliding tiles that used to come standard on every
205
Theorem 9.17.3. (Knuth–Schützenberger) For two words b, b0 , the following are equivalent:
1. P (b) = P (b0 ).
2. b ∼ b0 .
3. T ≈ T 0 , for any (or all) column-strict skew tableaux T, T 0 with row-reading words b, b0 respectively.
This is sometimes referred to (e.g., in [Ful97]) as the equivalence of “bumping” (the RSK algorithm as
presented in Section 9.10) and “sliding” (jeu de taquin).
Fix w ∈ Sn . Start by drawing an n × n grid, numbering columns west to east and rows south to north. For
each i, place an X in the i-th column and wi -th row. We are now going to label each of the (n + 1) × (n + 1)
intersections of the grid lines with a partition, such that the partitions either stay the same or get bigger as
we move north and east. We start by labeling each intersection on the west and south sides with the empty
partition ∅.
×
×
×
×
×
×
×
×
For each box whose SW, SE and NW corners have been labeled λ, µ, ν respectively, label the NE corner ρ
according to the following rules:
Rule 2: If λ ( µ = ν and the box doesn’t contain an X, then it must be the case that µi = λi + 1 for some i.
Obtain ρ from µ by incrementing µi+1 .
Rule 3: If µ 6= ν, then set ρ = µ∨ν (where ∨ means the join in Young’s lattice: i.e., take the componentwise
maximum of the elements of µ and ν).
Rule X: If there is an X in the box, then it must be the case that λ = µ = ν. Obtain ρ from λ by incre-
menting λ1 .
Note that the underlined assertions need to be proved; this can be done by induction.
Example 9.18.1. Let n = 8 and w = 57214836. In Example 9.10.2, we found that RSK(w) = (P, Q), where
206
P = 1 3 6 and Q= 1 2 6 .
2 4 8 3 5 8
5 7 4 7
The following extremely impressive figure shows what happens when we run the alternate RSK algorithm
on w. The partitions λ are shown in red. The numbers in parentheses indicate which rules were used.
Observe that:
• Rule 1 is used exactly in those squares that have no X either due west or due south.
• For all squares s, |ρ| is the number of X’s in the rectangle whose northeast corner is s. In particular,
the easternmost partition λ(k) in the kth row, and the northernmost partition µ(k) in the kth column,
both have size k.
• It follows that the sequences
∅ = λ(0) ⊆ λ(1) ⊆ · · · ⊆ λ(n) ,
∅ = µ(0) ⊆ µ(1) ⊆ · · · ⊆ µ(n)
correspond to SYT’s of the same shape (in this case 332).
• These SYT’s are the P and Q of the RSK correspondence!
207
9.19 Quasisymmetric functions
Definition 9.19.1. A quasisymmetric function is a formal power series F ∈ C[[x1 , x2 , . . . ]] with the
following property: if i1 < · · · < ir and j1 < · · · < jr are two sets of indices in strictly increasing order and
α1 , . . . , αr ∈ N, then
[xα αr α1 αr
i1 · · · xir ]F = [xj1 · · · xjr ]F
1
Symmetric functions are automatically quasisymmetric, but not vice versa. For example,
X
x2i xj
i<j
is quasisymmetric but not symmetric (in fact, it is not preserved by any permutation of the variables). On
the other hand, the set of quasisymmetric functions forms a graded ring QSym ⊆ C[[x]]. We now describe
a vector space basis for QSym.
A composition α is a sequence (α1 , . . . , αr ) of positive integers, called its parts. Unlike a partition, we do
not require that the parts be in weakly decreasing order. If α1 + · · · + αr = n, we write α |= n; the set of all
compositions of n will be denoted Comp(n). Sorting the parts of a composition in decreasing order produces
a partition of n, denoted by λ(α).
Compositions are much easier to count than partitions. Consider the set of partial sums
The map α 7→ S(α) is a bijection from compositions of n to subsets of [n−1]; in particular, | Comp(n)| = 2n−1 .
We can define a partial order on Comp(n) via S by setting α β if S(α) ⊆ S(β); this is called refinement.
The covering relations are merging two adjacent parts into one part.
i1 <···<ir
Just as for the monomial symmetric functions, every monomial appears in exactly one Mα , and Defini-
tion 9.19.1 says precisely that a power series f is quasisymmetric if all monomials appearing in the same Mα
have the same coefficient in f . Therefore, the set {Mα } is a graded basis for QSym.
Example 9.19.2. Let M be a matroid on ground set E of size n. Consider weight functions f : E → N>0 ;
one of the definitions of a matroid (see the problem set) is that a smallest-weight basis of M can be chosen
via the following greedy algorithm (list E in weakly increasing order by weight e1 , . . . , en ; initialize B = ∅;
for i = 1, . . . , n, if B + ei is independent, then replace B with B + ei ). The Billera-Jia-Reiner invariant
of M is the formal power series X
W (M) = xf (1) xf (2) · · · xf (n)
f
208
where the sum runs over all weight functions f for which there is a unique smallest-weight basis. The
correctness of the greedy algorithm implies that W (M) is quasisymmetric.
For example, let E = {e1 , e2 , e3 } and M = U2 (3). The bases are e1 e2 , e1 e3 , and e2 e3 . Then E has a unique
smallest-weight basis iff f has a unique maximum; it doesn’t matter if the two smaller weights are equal or
not. If the weights are all distinct then they can be assigned to E in 3! = 6 ways; if the two smaller weights
are equal then there are three choices for the heaviest element of E. Thus
X X
W (U2 (3)) = 6xi xj xk + 3xi x2j = 6M111 + 3M12 .
i<j<k i<j
9.20 Exercises
(a) For w ∈ Sn , let (P (w), Q(w)) be the pair of tableaux produced by the RSK algorithm from w. Denote
by w∗ the reversal of w in one-line notation (for instance, if w = 57214836 then w∗ = 63841275). Prove
that P (w∗ ) = P (w)T (where T means transpose).
(b) (Open problem) For which permutations does Q(w∗ ) = Q(w)? Computation indicates that the number
of such permutations is (n−1)/2
2 (n − 1)!
if n is odd,
((n − 1)/2)!2
0 if n is even,
209
Exercise 9.9. Let G = (V, E) be a finite simple graph with vertex set V . Let C(G) denote the set of
proper colorings of G: functions κ : V → N>0 such that κ(v) 6= κ(w) whenever v, w are adjacent in G.
Define a formal power series in indeterminates x1 , x2 , . . . , by
X Y
XG = xκ(v) .
κ∈C(G) v∈V
| {z }
xκ
(a) Show that XG is a symmetric function (this is not too hard). It is known as the chromatic symmetric
function, and was introduced by Stanley [Sta95];
(b) Determine XG for (i) Kn ; (ii) Kn (i.e., the graph with n vertices and no edges); (iii) the four simple
graphs on 3 vertices; (iv) the two trees on 4 vertices.
(c) Explain how to recover the chromatic polynomial pG (k) (see Example 2.3.5) from XG . Does pG (k)
determine XG ?
(d) For a set A ⊆ E, let λ(A) denote the partition whose parts are the sizes of the components of the
subgraph G|A induced by A (so λ ` |V (G)| and `(λ) is the number of components). Prove [Sta95,
Thm. 2.5] that the expansion of XG in the power-sum basis is
X
XG = (−1)|A| pλ(A) .
A⊆E
210
Chapter 10
For many combinatorial structures, there is a natural way of taking apart one object into two, or combining
two objects into one.
• Let G = (V, E) be a (simple, undirected) graph. For any W ⊆ V , we can break G into the two pieces
G|W and G|V \W . On the other hand, given two graphs, we can form their disjoint union G ∪· H.
• Let M be a matroid on ground set E. For any A ⊆ E, we can break M into the restriction M |A
(equivalently, the deletion of E \ A) and the contraction M/A. Two matroids can be combined into
one by taking the direct sum.
• Let P be a ranked poset. For any x ∈ P , we can extract the intervals [0̂, x] and [x, 1̂]. (Of course, we
don’t get every element of the poset this way.) Meanwhile, two graded posets P, Q can be combined
into one poset in many ways, such as Cartesian product (see Definition 1.1.12).
• Let α = (α1 , . . . , α` ) |= n. For 0 ≤ k ≤ `, we can break α up into two sub-compositions α(k) =
(α1 , . . . , αk ), α(k) = (αk+1 , . . . , α` ). Of course, two compositions can be combined by concatenating
them.
In all these operations, there are lots of ways to split, but only one way to combine. Moreover, all the
operations are graded with respect to natural size functions on the objects: for instance, matroid direct sum
is additive on size of ground set and on rank.
Splitting Combining
|V (G|W )| + |V (G|V \W )| = |V (G)| |V (G ∪· H)| = |V (G)| + |V (H)|
|E(M |A )| + |E(M/A)| = |E(M ) |E(M ⊕ M 0 )| = |E(M )| + E(M 0 )|
r([0̂, x]) + r([x, 1̂]) = r(P ) r(P ⊕ Q) = r(P ) + r(Q)
|α(k) | + |α(k) | = |α| r(αβ) = r(α) + r(β)
A Hopf algebra is a vector space H (over C, say) with two additional operations, a product µ : H⊗H → H
(which represents combining) and a coproduct ∆ : H → H⊗H which represents splitting. These operations
211
are respectively associative and coassociative, and they are compatible in a certain way. Technically, all
this data defines the slightly weaker structure of a bialgebra; a Hopf algebra is a bialgebra with an additional
map S : H → H, called the antipode. Most bialgebras that arise in combinatorics have a unique antipode
and thus a unique Hopf structure.
What is a C-algebra? It is a C-vector space A equipped with a ring structure. Its multiplication can be
thought of as a C-bilinear map
µ:A⊗A→A
that is associative, i.e., µ(µ(a, b), c) = µ(a, µ(b, c)). Associativity can be expressed as the commutativity of
the diagram
µ⊗Id
A⊗A⊗A A⊗A a⊗b⊗c ab ⊗ c
Id ⊗µ µ (10.1)
µ
A⊗A A a ⊗ bc abc
where I denotes the identity map. (Diagrams like this rely on the reader to interpret notation such as µ ⊗ I
as the only thing it could be possibly be; in this case, “apply µ to the first two tensor factors and tensor
what you get with [I applied to] the third tensor factor”.)
What then is a C-coalgebra? It is a C-vector space Z equipped with a C-linear comultiplication map
∆:Z →Z ⊗Z
that is coassociative, a condition defined by reversing the arrows in the previous diagram:
∆⊗Id
Z ⊗Z ⊗Z Z ⊗Z
Id ⊗∆ ∆ (10.2)
∆
Z ⊗Z Z
Just as an algebra has a unit, a coalgebra has a counit. To say what this is, let us diagramify the defining
property of the multiplicative unit 1A in an algebra A: it is the image of 1C under a map u : C → A such
that the diagram on the left commutes (where the top diagonal maps take a ∈ A to 1 ⊗ a or a ⊗ 1). Thus
a counit of a coalgebra is a map ε : Z → C such that the diagram on the right commutes (where the top
diagonal maps are projections).
A Z
u⊗Id Id ⊗u ε⊗Id Id ⊗ε
A⊗A Z ⊗Z
(10.3)
A bialgebra is a vector space B that has both a multiplication and a comultiplication, and such that
multiplication is a coalgebra morphism and comultiplication is an algebra morphism. Both of these conditions
212
are expressible by commutativity of the diagram
∆⊗∆
B⊗B B⊗B⊗B⊗B
µ µ13 ⊗µ24 (10.4)
∆
B B⊗B
where µ13 ⊗ µ24 means the map that sends a ⊗ b ⊗ c ⊗ d to ac ⊗ bd (the subscripts refer to the positions of
the tensor factors).
Comultiplication takes some getting used to. As explained above, in combinatorial settings, one should
generally think of multiplication as putting two objects together, and comultiplication as taking an object
apart into two subobjects. A unit is a trivial object (putting it together with another object has no effect),
and the counit is the linear functional that picks off the coefficient of the unit.
Example 10.1.1 (The polynomial Hopf algebra). A simple example of a Hopf algebra is the polynomial
ring C[x]. It is an algebra in the usual way, and can be made into a coalgebra by the counit ε(f (x)) = f (0)
(equivalently, mapping every polynomial to its constant term) and the coproduct ∆(x) = 1 ⊗ x + x ⊗ 1.
Checking the bialgebra axioms is left as an exercise. J
Example 10.1.2 (The graph Hopf algebra). For n ≥ 0, let Gn be the set of formal C-linear combinations of
unlabeled simple graphs on n vertices (or ifLyou prefer, of isomorphism classes [G] of simple graphs G, but it
is easier to drop the brackets), and let G = n≥0 Gn . Thus G is a graded vector space, which we make into a
C-algebra by defining µ(G⊗H) = G∪· H, where ∪· denotes union under the assumption V (G)∩V (H) = ∅.The
unit is the unique graph K0 with no vertices (or, technically, the map u : C → G0 sending c ∈ C to cK0 ).
Comultiplication in G is defined by
X
∆(G) = G|A ⊗ G|B .
·B
A,B: V (G)=A∪
As an illustration of how the compatibility condition (10.4) works, we’ll check it for G. To avoid “overfull
hbox” errors, set µ̃ = µ13 ⊗ µ24 . Then
X X
µ̃(∆ ⊗ ∆(G1 ⊗ G2 )) = µ̃ G1 |A1 ⊗ G1 |B1 ⊗ G2 |A2 ⊗ G2 |B2
A1 ∪
· B1 =V (G1 ) A2 ∪
· B2 =V (G2 )
X
= µ̃
G1 |A1 ⊗ G1 |B1 ⊗ G2 |A2 ⊗ G2 |B2
A1 ∪
· B1 =V (G1 )
A2 ∪
· B2 =V (G2 )
X
= (G1 |A1 ∪· G2 |A2 ) ⊗ (G1 |B1 ∪· G2 |B2 )
A1 ∪
· B1 =V (G1 )
A2 ∪
· B2 =V (G2 )
X
= (G1 ∪· G2 )|A ⊗ (G1 ∪· G2 )|B
· B=V (G1 ∪
A∪ · G2 )
= ∆(µ(G1 ⊗ G2 )).
213
expressed by the diagrams
sw sw
B⊗B B⊗B B⊗B B⊗B
µ µ ∆ ∆
B B
So cocommutativity means that ∆(G) is symmetric under switching; for the graph algebra this is clear
because A and B are interchangeable in the definition. J
Example 10.1.3 (Rota’s Hopf algebra of posets). For n ≥ 0, let Pn be the vector space of formal C-linear
combinations of isomorphism classes [P ] of finite graded posets P of rank n. Thus P0 and P1 are L one-
dimensional (generated by the chains of lengths 0 and 1), but dim Pn = ∞ for n ≥ 2. We make P = n Pn
into a graded C-algebra by defining µ([P ]⊗[Q]) = [P ×Q], where × denotes Cartesian product; thus u(1) = •.
Comultiplication is defined by X
∆[P ] = [0̂, x] ⊗ [x, 1̂].
x∈P
Coassociativity is checked by the following calculation, which should remind you of the proof of associativity
of convolution in the incidence algebra of a poset (Prop. 2.1.2):
!
X
∆ ⊗ I(∆(P )) = ∆ ⊗ I [0̂, x] ⊗ [x, 1̂]
x∈P
X
= ∆([0̂, x]) ⊗ [x, 1̂]
x∈P
X X
= [0̂, y] ⊗ [y, x] ⊗ [x, 1̂]
x∈P y∈[0̂,x]
X
= [0̂, y] ⊗ [y, x] ⊗ [x, 1̂]
x≤y∈P
X X
= [0̂, y] ⊗ [y, x] ⊗ [x, 1̂]
y∈P x∈[y,1̂]
X
= [0̂, y] ⊗ ∆([y, 1̂]) = I ⊗ ∆(∆(P )).
y∈P
This Hopf algebra is commutative, but not cocommutative; the switching map does not fix ∆(P ) unless P
is self-dual. J
Example 10.1.4 (The Hopf algebra of matroids). For n ≥ 0, let Mn be the vector space of formal C-
L [M ] of finite matroids M on n elements. Here dim P0 = 1 and
linear combinations of isomorphism classes
dim Pn < ∞ for every n. We make M = n Mn into a graded C-algebra by defining µ([P ] ⊗ [Q]) = [P ⊗ Q].
The trivial matroid (with empty ground set) is the multiplicative identity. Note that multiplication is
commutative. Letting E denote the ground set of M , we define comultiplication by
X
∆[M ] = M |A ⊗ M/A.
A⊆E
Coassociativity is essentially a consequence of the compatibility of deletion and contraction (Prop. 3.8.2).
Note that the coproduct is not cocommutative. J
214
This is a good place to introduce what is known as Sweedler notation. Often, it is highly awkward to
notate all the summands in a coproduct, particularly if we are trying to prove general facts about Hopf
algebra. The Sweedler notation for a coproduct is
X
∆(h) = h1 ⊗ h2
which should be read as “the coproduct of h is a sum of a bunch of tensors, each of which has a first element
and a second element.” This notation looks dreadfully abusive at first, but in fact it is incredibly convenient,
is unambiguous if used properly, and one soon discovers that any other way of doing things would be worse
(imagine having to conjure an index set out of thin air and deal with a lot of double subscripts just to write
down a coproduct). Sweedler notation iterates well; for example, we could write
X
∆2 (h) = (Id ⊗∆)(∆(h)) = (∆ ⊗ Id)(∆(h)) = h1 ⊗ h2 ⊗ h3
Example 10.1.5. The ring Λ of symmetric functions is a coalgebra in the following way. Recall that Λ
is a subring of the ring of formal power series C[[x]] = C[[x1 , x2 , . . . ]]. First, the counit is just the map
that takes a formal power series to its constant term. To define the coproduct of F ∈ Λ, we first apply
the “Hilbert Hotel substitution”: replace x1 , x2 , x3 , x4 , . . . with x1 , y1 , x2 , y2 , . . . to obtain a power series
F (x, y) ∈ C[[x, y]] = C[[x]] ⊗ C[[y]]. This power series is symmetric in each of the variable sets x and y, i.e.,
For example, clearly ∆(c) = c = c ⊗ 1 = 1 ⊗ c for any scalar c. Moreover, for every k, we have
k
X k
X
hk (x, y) = hj (x)hk−j (y), ek (x, y) = ej (x)ek−j (y)
j=0 j=0
and therefore
k
X k
X
∆(hk ) = hj ⊗ hk−j , ∆(ek ) = ej ⊗ ek−j .
j=0 j=0
Definition 10.1.6. A Hopf algebra is a bialgebra H with a antipode S : H → H, which satisfies the
commutative diagram
S⊗Id
H⊗H H⊗H
∆ µ
H H (10.5)
ε u
P
In other words, to calculate the antipode of something, comultiply Pit to get ∆g = g1 ⊗ g2 . Now hit every
first tensor factor with S and then multiply it out again to obtain S(g1 ) · g2 . If you started with the unit
215
then this should be 1, while if you started with any other homogeneous object then you get 0. This enables
calculating the antipode recursively. For example, in QSym:
Lemma 10.1.7 (Humpert, Prop 1.4.4). Let B be a bialgebra that is graded and connected, i.e., the 0th
graded piece has dimension 1 as a vector space. Let n > 0 and let h ∈ Hn . Then
X
∆(h) = h ⊗ 1 + h1 ⊗ h2 + 1 ⊗ h
where the Sweedler-notation sum contains only elements of degrees strictly between 0 and n.
Proof. Refer to the diagrams for the unit and counit (10.3). In particular, the right-hand triangle gives
h1 ⊗ ε(h2 ) = h. So certainly one of the summands must have h1 ∈ Hn , but then h2 ∈ H0 . Since H0 ∼=C
P
we may as well group all those summands together; they must sum to h ⊗ 1. Meanwhile, the left-hand
triangle says that grouping together all the summands of bidegree 0, n gives 1 ⊗ h.
Proposition 10.1.8. Let B be a connected and graded bialgebra. Then the commutative diagram (10.5)
defines a unique antipode S : B → B, and thus B can be made into a Hopf algebra in a unique way.
Combinatorics features lots of graded connected bialgebras (such as all those we have seen so far), so this
proposition gives us a Hopf algebra structure “for free”.
There is a general recipe for the antipode, known as Takeuchi’s formula [Tak71]. Let π : H → H be the map
that kills H0 and fixes each positive graded piece pointwise. Then
X
S = uε + (−1)k µk−1 π ⊗k ∆k−1 , (10.6)
k≥1
i.e., X X
S(h) = u(ε(h)) − π(h) + π(h1 )π(h2 ) − π(h1 )π(h2 )π(h3 ) + · · ·
However, there is a lot of cancellation in this sum, making it impractical for looking at specific Hopf algebras.
Therefore, one of the first things one wants in studying a particular Hopf algebra is to find a cleaner formula
for the antipode. An excellent example is the Hopf algebra of symmetric functions: our calculation of ∆(hk )
says that
k
(
X 1 if k = 0,
S(hj )hk−j =
j=0
0 if k > 0
and comparing with the Jacobi-Trudi relations (see §9.5) we see that S(hk ) = (−1)k ek , i.e., S = (−1)k ω.
216
10.2 Characters
A character on a Hopf algebra H is a C-linear map ζ : H → C that is multiplicative, i.e., ζ(1H ) = 1C and
ζ(h · h0 ) = ζ(h)ζ(h0 ). For example, if H is the graph Hopf algebra, then we can define a character by
(
1 if G has no edges,
ζ(G) = (10.7)
0 if G has one or more edges,
for a graph G, and then extending by linearity to all of G. This map is multiplicative (because G · H has an
edge iff either G or H does); it also looks kind of like a silly map. However, the reason this is interesting is
that characters can be multiplied
P together. The multiplication is called convolution product, defined as
follows: if h ∈ H and ∆(h) = h1 ⊗ h2 in Sweedler notation, then
X
(ζ ∗ η)(h) = ζ(h1 )η(h2 ).
One can check that convolution is associative; the calculation resembles checking that the incidence algebra
of a poset is an algebra. The counit ε is a two-sided identity for convolution, i.e., ζ ∗ ε = ε ∗ ζ = ζ for all
characters ζ. Moreover, the definition (10.5) of the antipode implies that
ζ ∗ (ζ ◦ S) = ε
(check this too). Therefore, the set of all characters forms a group.
Why would you want to convolve characters? Consider the graph Hopf algebra with the character ζ, and let
k ∈ N. The kth convolution power of ζ is given by
X
ζ k (G) = ζ ∗ · · · ∗ ζ (G) = ζ(G|V1 ) · · · ζ(G|Vk )
| {z }
k times V (G)=V1 ∪
· ···∪
· Vk
(
X 1 if V1 , . . . , Vk are all cocliques,
=
V (G)=V ∪· ···∪
·V
0 otherwise.
1 k
(recall that a coclique is a set of vertices of which no two are adjacent). In other words, ζ n (G) counts
the number of functions f : V → [k] so that f (x) 6= f (y) whenever x, y are adjacent. But such a thing is
precisely a proper k-coloring! I.e.,
ζ n (G) = p(G; k)
where p is the chromatic polynomial (see Section 4.4). This turns out to be true as a polynomial identity
in k — for instance, ζ −1 (G) is the number of acyclic orientations. One can even view the Tutte polynomial
k
T (G; x, y) as a character τx,y (G) with parameters x, y; it turns out that τx,y (G) is itself a Tutte polynomial
evaluation — see Brandon Humpert’s Ph.D. thesis [Hum11].
A combinatorial Hopf algebra, or CHA, is a pair (H, ζ), where H is a graded connected Hopf algebra
Φ
→ (H0 , ζ 0 ) that is an algebra and coalgebra
and ζ is a character. A morphism of CHA’s is a map (H, ζ) −
0 0
morphism and satisfies ζ ◦ Φ = Φ ◦ ζ .
Example 10.2.1. The binomial Hopf algebra is the ring of polynomials C[x], equipped with the coprod-
uct generated by ∆(x) = x ⊗ 1 + 1 ⊗ x. To justify the name, note that
n
X n
∆(xn ) = ∆(x)n = (x ⊗ 1 + 1 ⊗ x)n = xk ⊗ xn−k .
k
k=0
217
This is extended linearly, so that ∆(f (x)) = f (∆(x)) for any polynomial f . The counit is ε(f ) = f (0), and
the antipode is given by S(xk ) = (−1)k xk (check this). We make it into a CHA by endowing it with the
character ε1 (f ) = f (1).
Pζ : (H, ζ) → (C[x], ε1 )
For example, if H is the graph algebra and ζ the characteristic function of edgeless graphs (10.7), then Pζ is
the chromatic polynomial. J
Example 10.2.2. The ring QSym of quasisymmetric functions can be made into a Hopf algebra as follows.
Let α = (α1 , . . . , αk ) be a composition; then
k
X
∆Mα = M(α1 ,...,αj ) ⊗ M(αj+1 ,...,αk ) .
j=0
One can check (Exercise 10.2) that the Hopf algebra of symmetric functions described in Example 10.1.5 is
a Hopf subalgebra of QSym; that is, this coproduct restricts to the one defined earlier on Λ. We then endow
QSym with the character ζQ defined on the level of power series by ζQ (x1 ) = 1 and ζQ (xj ) = 0 for j ≥ 2;
equivalently, (
1 if α has at most one part,
ζQ (Mα ) =
0 otherwise.
One of the main theorems about CHAs, due to Aguiar, Bergeron and Sottile [ABS06], is that (QSym, ζQ )
is a terminal object in the category of CHAs, i.e., every CHA (H, ζ) admits a canonical morphism
to (QSym, ζ). For the graph algebra, this morphism is the chromatic symmetric function; for the matroid
algebra, it is the Billera-Jia-Reiner invariant. J
Hopf monoids are a more recent area of research. One exhaustive reference is the book by Aguiar and
Mahajan [AM10]; more accessible introductions (and the main sources for these notes) include Klivans’ talk
slides [Kli] and the preprint by Aguiar and Ardila [AA17]. One of the ideas behind Hopf monoids is to work
with labeled rather than unlabeled objects.
First, we need a set H[I] for every finite set I. One should think of H[I] as the vector space spanned by
combinatorial objects of a certain ilk, with I as the labeling set. (For example, graphs with vertices I,
matroids with ground set I, linear orderings of I, polyhedra in RI , etc.) Every bijection π : I → I 0 should
induce a linear isomorphism H[π] : H[I] → H[I 0 ], which should be thought of as relabeling, and the association
of H[π] with π is functorial2 . A functor H with these properties is called a vector species. Moreover, we
require that dim H[∅] = 1, and we identify a particular nonzero element of H[∅] as the “trivial object”.
2 This is a fancy way of saying that it obeys some completely natural identities: H[Id ] = Id
I H[I] and H[π ◦ σ] = H[π] ◦ H[σ].
Don’t worry too much about it.
218
Then, we need to have multiplication and comultiplication maps for every decomposition I = A ∪· B:
µA,B ∆A,B
H[A] ⊗ H[B] −−−→ H[I] and H[I] −−−→ H[A] ⊗ H[B]. (10.8)
These are subject to a whole lot of conditions. The most important of these are labeled versions of associa-
tivity, coassociativity, and compatibility:
µI,J ⊗IdK
H[I] ⊗ H[J] ⊗ H[K] H[I] ⊗ H[J ∪· K]
∆I,J ⊗IdK
H[I] ⊗ H[J] ⊗ H[K] H[I] ⊗ H[J ∪· K]
∆I,J ⊗∆K,L
H[I ∪· J] ⊗ H[K ∪· L] H[I] ⊗ H[J] ⊗ H[K] ⊗ H[L]
µI∪
· J,K∪
·L (µI,K ⊗µJ,L )◦τ (compatibility), (10.11)
∆I∪
· K,J∪
·L
H[I ∪· J ∪· K ∪· L] H[I ∪· K] ⊗ H[J ∪· L]
where τ interchanges the second and third tensor factors.
Note that instead of defining a single coproduct as the sum over all possible decompositions A, B (as in the
Hopf algebra setup), we are keeping the different decompositions separate.
In many cases, the operations can be defined on the level of individual combinatorial objects. In other
words, we start with a set species h — a collection of sets h[I] indexed by finite sets I, subject to the
conditions that any bijection I → I 0 naturally induces a bijection h[I] → h[I 0 ], define multiplication and
comultiplication operations
µA,B ∆A,B
h[A] × h[B] −−−→ h[I] and h[I] −−−→ h[A] × h[B]
(in contrast to eqrefvector-species-product-coproduct, these are Cartesian products of sets rather than tensor
products of vector spaces), then define a vector species H by setting H[I] = kh[I], and define multiplication
and comultiplication on H by linear extension. Such a Hopf monoid is called linearized. This is certainly
a very natural kind of Hopf monoid, but not all the Hopf monoids we care about come from a set species in
this way.
Example 10.3.1. Let ℓ[I] denote the set of linear orders on a finite set I, which we can think of as bijections
w : [n] → I (and represent by the sequence w(1), . . . , w(n)). Given a decomposition I = A ∪· B, the most
obvious way to define product and coproduct on the set species ℓ is by concatenation and restriction:

    µA,B(w, w′) = ww′ (the concatenation),   ∆A,B(w) = (w|A, w|B) (the restrictions).

For instance, if A = {a, b, c} and B = {p, q, r, s}, then µA,B(cab, rpqs) = cabrpqs, while ∆A,B(parbqcs) = (abc, prqs).
Linearizing this setup produces the Hopf monoid of linear orders L = kℓ. J
Example 10.3.2. Let m[I] denote the set of matroids with ground set I, with product and coproduct
defined setwise by
    µA,B(M1, M2) = M1 ⊕ M2,   ∆A,B(M) = (M|A, M/A).
The linearized Hopf monoid M = km is a labeled analogue of the matroid Hopf algebra M described in
Example 10.1.4. J
Multiplication and comultiplication can be iterated. For any set composition A (i.e., an ordered list A =
A1 | . . . |An whose disjoint union is I), there are maps
    µA : H[A1] ⊗ · · · ⊗ H[An] → H[I]   and   ∆A : H[I] → H[A1] ⊗ · · · ⊗ H[An]
that are well defined by associativity and coassociativity. (For set species, replace tensor products with
Cartesian products.) For example, if A = (I, J, K), then we can define µA as either of the two composites
in (10.9) — we get the same answer in both cases.
The antipode in a Hopf monoid H is the following collection of maps SI : H[I] → H[I], given by the Takeuchi
formula: for x ∈ H[I],

    S(x) = SI(x) = x   if I = ∅,
    S(x) = SI(x) = Σ_{A⊨I} (−1)^{|A|} µA(∆A(x))   if I ≠ ∅.   (10.12)

Here A ⊨ I means that A runs over all set compositions of I with nonempty parts, and |A| denotes the number of parts of A (in particular, there are only
finitely many summands). As in the Hopf algebra setting, this formula typically has massive cancellation,
so in order to study a particular Hopf monoid it is desirable to find a cancellation-free formula.
Example 10.3.3. Let us calculate some antipodes in L. The trivial ordering on ∅ is trivially fixed by S,
while for a singleton set I = {a} we have S(a) = −a (the Takeuchi formula has only one term, corresponding
to the set composition of I with one part). For ab ∈ L[{a, b}] we have
S(ab) = −µ12 (∆12 (ab)) + µ1|2 (∆1|2 (ab)) + µ2|1 (∆2|1 (ab))
= −ab + (a)(b) + (b)(a)
= −ab + ab + ba = ba,
while for abc ∈ L[I] the antipode is calculated by the following table:
    A        |A|   ∆A(abc)    (−1)^{|A|} µA(∆A(abc))
    123       1    abc        −abc
    1|23      2    a, bc      +abc
    2|13      2    b, ac      +bac
    3|12      2    c, ab      +cab
    12|3      2    ab, c      +abc
    13|2      2    ac, b      +acb
    23|1      2    bc, a      +bca
    1|2|3     3    a, b, c    −abc
    1|3|2     3    a, c, b    −acb
    2|1|3     3    b, a, c    −bac
    2|3|1     3    b, c, a    −bca
    3|1|2     3    c, a, b    −cab
    3|2|1     3    c, b, a    −cba
    Total                     −cba
It is starting to look suspiciously as though SI = (−1)^{|I|} rev, where rev denotes the map that reverses ordering.
In fact this is the case (proof left as an exercise). J
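The Takeuchi sum is small enough to compute by brute force, which makes a nice sanity check on the table above. The following Python sketch (the helper names are mine, not from any library) computes S(w) for a linear order w, encoded as a string:

    from itertools import combinations

    def set_compositions(s):
        """Yield all set compositions of s: ordered tuples of pairwise
        disjoint nonempty blocks whose union is s."""
        s = frozenset(s)
        if not s:
            yield ()
            return
        elts = sorted(s)
        for r in range(1, len(elts) + 1):
            for first in combinations(elts, r):
                for rest in set_compositions(s - set(first)):
                    yield (frozenset(first),) + rest

    def antipode(w):
        """Takeuchi's formula (10.12) in L: Delta_A restricts w to each
        block (preserving relative order), mu_A concatenates, and the
        sign is (-1)^(number of blocks)."""
        result = {}
        for A in set_compositions(w):
            word = "".join("".join(c for c in w if c in B) for B in A)
            result[word] = result.get(word, 0) + (-1) ** len(A)
        return {u: coeff for u, coeff in result.items() if coeff != 0}

    print(antipode("ab"))    # {'ba': 1}:   S(ab) = ba
    print(antipode("abc"))   # {'cba': -1}: S(abc) = -cba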
Material to be written: duality, L∗, generalized permutahedra and the Aguiar-Ardila antipode calculation,
...
10.4 Exercises
Exercise 10.1. Confirm that the polynomial Hopf algebra (Example 10.1.1) satisfies (10.2) and (10.4), and
determine its antipode.
Exercise 10.2. Confirm that the symmetric functions Λ are a Hopf subalgebra of the quasisymmetric
functions QSym, as asserted in Example 10.2.2.
Exercise 10.3. Let E(M) denote the ground set of a matroid M, and call |E(M)| the order of M. Let
Mn be the vector space of formal C-linear combinations of isomorphism classes [M] of matroids M of
order n. Let M = ⊕_{n≥0} Mn. Define a graded multiplication on M by [M][M′] = [M ⊕ M′] and a graded
comultiplication by

    ∆[M] = Σ_{A⊆E(M)} [M|A] ⊗ [M/A]
where M |A and M/A denote restriction and contraction respectively. Check that these maps make M into
a graded bialgebra, and therefore into a Hopf algebra by Proposition 10.1.8.
Exercise 10.4. Prove that the Billera–Jia–Reiner invariant defines a Hopf algebra morphism M → QSym.
Exercise 10.5. Prove that the antipode in L is indeed given by SI = (−1)^{|I|} rev, as in Example 10.3.3.
Chapter 11
More Topics
11.1 The Max-Flow/Min-Cut Theorem

The main theorem of this section is the Max-Flow/Min-Cut Theorem of Ford and Fulkerson. Strictly speak-
ing, it probably belongs to graph theory or combinatorial optimization rather than algebraic combinatorics,
but it is a wonderful theorem and has applications to posets and algebraic graph theory, so I can’t resist
including it.
Definition 11.1.1. A network N consists of a directed graph (V, E), two distinct vertices s, t ∈ V (called
the source and sink respectively), and a capacity function c : E → R≥0 .
Throughout this section, we will fix the symbols V , E, s, t, and c for these purposes. We will assume that
the network has no edges into the source or out of the sink.
A network is supposed to model the flow of “stuff”—data, traffic, liquid, electrical current, etc.—from s to t.
The capacity of an edge is the maximum amount of stuff that can flow through it (or perhaps the amount of
stuff per unit time). This is a general model that can be specialized to describe cuts, connectivity, matchings
and other things in directed and undirected graphs. This interpretation is why we exclude edges into s or
out of t; we will see later why this assumption is in fact justified.
If c(e) ∈ N for all e ∈ E, we say the network is integral. In what follows, we will only consider integral
networks.
Figure 11.1: A network with source s and sink t; the edges sa, sc, ab, cd, bt, dt have capacity 1, and the edge ad has capacity 2.
Definition 11.1.2. A flow on N is a function f : E → N that satisfies the capacity constraints

    0 ≤ f(e) ≤ c(e) for every e ∈ E,   (11.1)

and the conservation constraints

    f−(v) = f+(v) for every v ∈ V \ {s, t},   (11.2)

where

    f−(v) = Σ_{e = u→v} f(e),   f+(v) = Σ_{e = v→w} f(e).

The value of a flow f is |f| = f+(s), the total flow leaving the source.
The number f (e) represents the amount of stuff flowing through e. That amount is bounded by the capacity
of that edge, hence the constraints (11.1). Meanwhile, the conservation constraints say that stuff cannot
accumulate at any internal vertex of the network, nor can it appear out of nowhere.
The max-flow problem is to find a flow of maximum value. The dual problem is the min-cut problem,
which we now describe.
Definition 11.1.3. Let N be a network. Let S, T ⊆ V with S ∪ T = V , S ∩ T = ∅, s ∈ S, and t ∈ T . The
corresponding cut is
    [S, T] = {x→y ∈ E : x ∈ S, y ∈ T}

and the capacity of that cut is

    c(S, T) = Σ_{e∈[S,T]} c(e).
A cut can be thought of as a bottleneck through which all stuff must pass. For example, in the network of
Figure 11.1, we could take S = {s, a, c}, T = {b, d, t}, so that [S, T] = {ab, ad, cd}, and c(S, T) = 1 + 2 + 1 = 4.
The min-cut problem is to find a cut of minimum capacity. This problem is certainly feasible, since there
are only finitely many cuts and each one has finite capacity.
For A ⊆ V, define

    f−(A) = Σ_{e∈[Ā,A]} f(e),   f+(A) = Σ_{e∈[A,Ā]} f(e).

Proposition 11.1.4. Let f be a flow and let [S, T] be any cut. Then

    |f| = f+(s) = f−(t),   (11.3a)
    |f| = f+(S) − f−(S),   (11.3b)
    |f| ≤ c(S, T).   (11.3c)

The proof (which requires little more than careful bookkeeping) is left as an exercise.
The inequality (11.3c) is known as weak duality; it says that the maximum value of a flow is less than or
equal to the minimum capacity of a cut. (Strong duality would say that equality holds.)
Suppose that there is a path P from s to t in which no edge is being used to its full capacity. Then we
can increase the flow along every edge on that path, and thereby increase the value of the flow by the same
amount. As a simple example, we could start with the zero flow f0 on the network of Figure 11.1 and increase
flow by 1 on each edge of the path sadt; see Figure 11.2.
Figure 11.2: Left: the zero flow f0, with |f0| = 0. Right: the flow f1 of value 1, obtained from f0 by pushing one unit of flow along each edge of the path sadt. (Each edge is labeled with its capacity and flow.)
The problem is that there can exist flows that cannot be increased in this elementary way — but nonetheless
are not maximum. The flow f1 of Figure 11.2 is an example. In every path from s to t, there is some edge
e with f(e) = c(e). However, it is easy to construct a flow of value 2:
[Figure: the flow f2, with one unit of flow along each of the paths sabt and scdt and none on the edge ad; |f2| = 2.]
Figure 11.3: A better flow that cannot be obtained from f1 in the obvious way.
Fortunately, there is a more general way to increase the value of a flow. The key idea is that flow along an
edge x→y can be regarded as negative flow from y to x. Accordingly, all we need is a path from s to t in which
each edge e is either pointed forward and has f (e) < c(e), or is pointed backward and has f (e) > 0. Then,
increasing flow on the forward edges and decreasing flow on the backward edges will increase the value of
the flow. This is called an augmenting path for f .
The Ford-Fulkerson Algorithm is a systematic way to construct a maximum flow by looking for aug-
menting paths. The wonderful feature of the algorithm is that if a flow f has no augmenting path, the
algorithm will automatically find a cut of capacity equal to |f | — thus certifying immediately that the flow
is maximum and that the cut is minimum.
Figure 11.4: Exploiting the augmenting path scdabt for f1 . The flow is increased by 1 on each of the
“forward” edges sc, cd, ab, bt and decreased by 1 on the “backward” edge da to obtain the improved flow f2 .
In detail, the algorithm starts with the zero flow and repeats the following steps:

1. Look for an augmenting path for the current flow f: a path

    P : x0 = s, e1, x1, e2, x2, . . . , x_{n−1}, en, xn = t

in which every forward edge ei satisfies f(ei) < c(ei) and every backward edge satisfies f(ei) > 0. If no augmenting path exists, stop and output f.

2. Define the tolerance of a forward edge e of P to be c(e) − f(e), and the tolerance of a backward edge to be f(e). Increase (resp. decrease) the flow on each forward (resp. backward) edge of P by the minimum tolerance along P, and return to Step 1.
By integrality and induction, all tolerances are integers and all flows are integer-valued. In particular, each
iteration of the loop increases the value of the best known flow by at least 1. Since the value of every flow is bounded
by the minimum capacity of a cut (by weak duality), the algorithm is guaranteed to terminate in a finite
number of steps. (By the way, Step 1 of the algorithm can be accomplished efficiently by a slight modification
of, say, breadth-first search.)
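For concreteness, here is a minimal Python sketch of the algorithm (the dictionary encoding of networks is my own, not from any particular library). Augmenting paths are found by breadth-first search on the residual network, and the test run uses the network of Figure 11.1:

    from collections import deque

    def max_flow(capacity, s, t):
        """Ford-Fulkerson on an integral network.  `capacity` maps
        directed edges (u, v) to nonnegative integers.  Returns the
        pair (value, flow).  A residual edge may be crossed forward
        when f < c, or backward when f > 0."""
        flow = {e: 0 for e in capacity}
        nbrs = {}                                  # residual adjacency
        for (u, v) in capacity:
            nbrs.setdefault(u, set()).add(v)
            nbrs.setdefault(v, set()).add(u)

        def residual(u, v):
            return (capacity.get((u, v), 0) - flow.get((u, v), 0)
                    + flow.get((v, u), 0))

        while True:
            parent, queue = {s: None}, deque([s])  # Step 1: BFS for an augmenting path
            while queue and t not in parent:
                u = queue.popleft()
                for v in nbrs.get(u, ()):
                    if v not in parent and residual(u, v) > 0:
                        parent[v] = u
                        queue.append(v)
            if t not in parent:                    # no augmenting path: flow is maximum
                return sum(flow[e] for e in flow if e[0] == s), flow
            path, v = [], t                        # Step 2: augment along the path
            while parent[v] is not None:
                path.append((parent[v], v))
                v = parent[v]
            k = min(residual(u, v) for (u, v) in path)   # minimum tolerance
            for (u, v) in path:
                back = min(k, flow.get((v, u), 0))       # cancel backward flow first
                if back:
                    flow[(v, u)] -= back
                if k > back:
                    flow[(u, v)] = flow.get((u, v), 0) + (k - back)

    cap = {('s','a'): 1, ('s','c'): 1, ('a','b'): 1, ('a','d'): 2,
           ('c','d'): 1, ('b','t'): 1, ('d','t'): 1}     # Figure 11.1
    print(max_flow(cap, 's', 't')[0])                    # 2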
The next step is to prove that this algorithm actually works. That is, when it terminates, it will have
computed a flow of maximum possible value.
Proposition 11.1.5. Suppose that f is a flow that has no augmenting path. Let

    S = {s} ∪ {v ∈ V : there is a path from s to v in which every forward edge e has f(e) < c(e) and every backward edge has f(e) > 0},   T = V \ S.

Then s ∈ S, t ∈ T, and c(S, T) = |f|. In particular, f is a maximum flow and [S, T] is a minimum cut.
Proof. Note that t ∉ S precisely because f has no augmenting path. Applying (11.3b) gives

    |f| = f+(S) − f−(S) = Σ_{e∈[S,S̄]} f(e) − Σ_{e∈[S̄,S]} f(e) = Σ_{e∈[S,S̄]} f(e),

where the second sum vanishes because f(e) = 0 for every e ∈ [S̄, S] (otherwise the tail of e could be reached from s by a path ending in a backward edge, so it would belong to S). But f(e) = c(e) for every e ∈ [S, T] (otherwise S would be bigger than what it actually is), so this last
quantity is just c(S, T). The final assertion follows by weak duality.
We have proven:
Theorem 11.1.6 (Max-Flow/Min-Cut Theorem for Integral Networks ("MFMC")). For every integral
network N, the maximum value of a flow equals the minimum capacity of a cut.
In light of this, we will call the optimum of both the max-flow and min-cut problems the value of N , written
|N |. In fact MFMC holds for non-integral networks as well, although the Ford-Fulkerson algorithm may not
work in that case (the flow value might converge to |N | without ever reaching it.)
Definition 11.1.7. Let N be a network. A flow f in N is acyclic if, for every directed cycle C in N (i.e.,
every set of edges x1 → x2 → · · · → xn → x1 ), there is some e ∈ C for which f (e) = 0. The flow f is
partitionable if there is a collection of s, t-paths P1 , . . . , P|f | such that for every e ∈ E,
f (e) = #{i : e ∈ Pi }.
(Here “s, t-path” means “path from s to t”.) In this sense f can be regarded as the “sum” of the paths Pi ,
each one contributing a unit of flow.
Proposition 11.1.8. Let N be a network. Then:
1. For every flow in N, there exists an acyclic flow with the same value. In particular, N admits an acyclic flow of value |N|.
2. Every acyclic integral flow is partitionable.
Proof. Suppose that some directed cycle C has positive flow on every edge. Let k = min{f(e) : e ∈ C}.
Define f̃ : E → N by

    f̃(e) = f(e) − k if e ∈ C,   f̃(e) = f(e) if e ∉ C.

Then f̃ is again a flow (both terms of each conservation constraint at a vertex of C decrease by k), and |f̃| = |f| since no directed cycle can pass through s or t. Repeating this procedure eventually produces an acyclic flow of the same value, proving (1).

For (2): given a nonzero acyclic flow f, find an s, t-path P1 along which all flow is positive. Decrement the flow
on each edge of P1; doing this will also decrement |f|. Now repeat this for an s, t-path P2, etc. When the
resulting flow is zero, we will have partitioned f into a collection of s, t-paths of cardinality |f|.
Remark 11.1.9. This discussion justifies our earlier assumption that there are no edges into the source or
out of the sink, since every acyclic flow must be zero on all such edges. Therefore, deleting those edges from
a network does not change the value of its maximum flow.
This result has many applications in graph theory: Menger's theorems, the König-Egerváry theorem, etc.

11.2 Min-max theorems on posets

The basic result in this area is Dilworth's Theorem, which resembles the Max-Flow/Min-Cut Theorem (and
can indeed be derived from it; see the exercises).

Definition 11.2.1. A chain cover of a poset P is a collection of chains whose union is P. The minimum
size of a chain cover is called the width of P. We write m(P) for the maximum size of an antichain in P.
Theorem 11.2.2 (Dilworth’s Theorem). Let P be a finite poset. Then
width(P ) = m(P ).
Proof. The “≥” direction is clear, because if A is an antichain, then no chain can meet A more than once,
so P cannot be covered by fewer than |A| chains.
For the more difficult “≤” direction, we induct on n = |P |. The result is trivial if n = 1 or n = 2.
Let Y be the set of all minimal elements of P , and let Z be the set of all maximal elements. Note that Y
and Z are both antichains. First, suppose that no set other than Y or Z is a maximum¹ antichain; dualizing
if necessary, we may assume |Y| = m(P). Let y ∈ Y and z ∈ Z with y ≤ z. Let P′ = P \ {y, z}; then
m(P′) = |Y| − 1. By induction, width(P′) ≤ |Y| − 1, and taking a chain cover of P′ and tossing in the chain
{y, z} gives a chain cover of P of size |Y |.
Otherwise, let A be a maximum antichain other than Y and Z, and define

    P+ = {x ∈ P : x ≥ a for some a ∈ A},   P− = {x ∈ P : x ≤ a for some a ∈ A}.

Then

• P+, P− ≠ P (otherwise A equals Y or Z, respectively).
• P+ ∪ P− = P (otherwise A is contained in some larger antichain).
• P+ ∩ P− = A (otherwise A isn't an antichain).

So P+ and P− are posets smaller than P, each of which contains A as a maximum antichain. By induction,
each P± has a chain cover of size |A|. So for each a ∈ A, there is a chain Ca+ ⊆ P+ and a chain Ca− ⊆ P−
with a ∈ Ca+ ∩ Ca−. Since a is the minimum of Ca+ and the maximum of Ca−, the union Ca+ ∪ Ca− is a chain, and

    {Ca+ ∪ Ca− : a ∈ A}

is a chain cover of P of size |A| = m(P).
If we switch “chain” and “antichain”, then Dilworth’s theorem remains true and becomes a much easier
result.
Proposition 11.2.3 (Mirsky's Theorem). In any finite poset, the minimum size of an antichain cover equals
the maximum size of a chain.
Proof. For the ≥ direction, if C is a chain and A is an antichain cover, then no antichain in A can contain
more than one element of C, so |A| ≥ |C|. On the other hand, for i ≥ 1 let

    Ai = {x ∈ P : the longest chain of P with maximum element x has exactly i elements};

then {Ai} is an antichain cover whose cardinality equals the maximum number of elements in a chain of P.
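This proof is completely effective, as the following Python sketch illustrates (the encoding is mine; the example poset is the set of divisors of 24, ordered by divisibility):

    from functools import lru_cache

    def mirsky_cover(P, less):
        """Antichain cover realizing Mirsky's theorem: element x goes
        in block A_i, where i is the number of elements in the longest
        chain of P whose maximum element is x.  `less` is the strict
        order relation."""
        P = list(P)

        @lru_cache(maxsize=None)
        def height(x):
            return 1 + max((height(y) for y in P if less(y, x)), default=0)

        cover = {}
        for x in P:
            cover.setdefault(height(x), []).append(x)
        return cover

    divisors = [d for d in range(1, 25) if 24 % d == 0]
    print(mirsky_cover(divisors, lambda x, y: y % x == 0 and x != y))
    # {1: [1], 2: [2, 3], 3: [4, 6], 4: [8, 12], 5: [24]} -- five
    # antichains, matching the five-element chain 1 < 2 < 4 < 8 < 24.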
There is a marvelous common generalization of Dilworth’s and Mirsky’s Theorems due to Curtis Greene and
Daniel Kleitman [GK76, Gre76]. An excellent source on this topic, including multiple proofs, is the survey
article [BF01] by Thomas Britz and Sergey Fomin.
¹I.e., an antichain of size m(P) — not merely an antichain that is maximal with respect to inclusion, which might have smaller
cardinality.
Theorem 11.2.4 (Greene-Kleitman). Let P be a finite poset. Define two sequences of positive integers
    λ = (λ1, λ2, . . . , λℓ),   µ = (µ1, µ2, . . . , µm)

by

    λ1 + · · · + λk = max{|C1 ∪ · · · ∪ Ck| : Ci ⊆ P chains},
    µ1 + · · · + µk = max{|A1 ∪ · · · ∪ Ak| : Ai ⊆ P disjoint antichains}.
Then:
1. λ and µ are both partitions of |P |, i.e., weakly decreasing sequences whose sum is |P |.
2. λ and µ are conjugates (written µ = λ̃): the row lengths of λ are the column lengths in µ, and vice
versa.
Note that Dilworth's Theorem is just the special case µ1 = ℓ. As an example, a certain nine-element poset [Hasse diagram omitted] has

    λ = (3, 2, 2, 2) and µ = (4, 4, 1) = λ̃.
11.3 Group actions and Pólya theory

How many different necklaces can you make with four blue, two green, and one red bead? [Figure of four such necklaces omitted.]

It depends what "different" means. The second necklace can be obtained from the first by rotation, and the
third by reflection, but the fourth one is honestly different from the first two.
If we just wanted to count the number of ways to permute four blue, two green, and one red beads, the
answer would be the multinomial coefficient

    \binom{7}{4, 2, 1} = 7!/(4! 2! 1!) = 105.
However, what we are really trying to count is orbits under a group action.
Let G be a group and X a set. An action of G on X is a group homomorphism α : G → SX , the group of
permutations of X.
Equivalently, an action can also be regarded as a map G × X → X, sending (g, x) to gx, such that
• IdG x = x for every x ∈ X (where IdG denotes the identity element of G);
• g(hx) = (gh)x for every g, h ∈ G and x ∈ X.
To go back to the necklace problem, we now see that “same” really means “in the same orbit”. In this
case, X is the set of all 105 necklaces, and the group acting on them is the dihedral group D7 (the group of
symmetries of a regular heptagon). The number we are looking for is the number of orbits of D7 .
For x ∈ X, write Ox = {gx : g ∈ G} for the orbit of x and Sx = {g ∈ G : gx = x} for its stabilizer, which is a subgroup of G.

Lemma 11.3.1. Let x ∈ X. Then |Ox| |Sx| = |G|.

Proof. The element gx depends only on which coset of Sx contains g, so |Ox| is the number of cosets, which
is |G|/|Sx|.
Proposition 11.3.2 (Burnside's Theorem). The number of orbits of the action of G on X equals the average
number of fixed points:

    (1/|G|) Σ_{g∈G} #{x ∈ X : gx = x}.
Proof. For a sentence P, let χ(P) = 1 if P is true, or 0 if P is false (the "Garsia chi function"). Then

    Number of orbits = Σ_{x∈X} 1/|Ox| = (1/|G|) Σ_{x∈X} |Sx|
                     = (1/|G|) Σ_{x∈X} Σ_{g∈G} χ(gx = x)
                     = (1/|G|) Σ_{g∈G} Σ_{x∈X} χ(gx = x) = (1/|G|) Σ_{g∈G} #{x ∈ X : gx = x}.
Example 11.3.3. Return to the necklace problem. The identity element of D7 fixes all 105 necklaces; no nontrivial rotation fixes any necklace, since the beads are not all the same color; and each of the seven reflections fixes exactly 3 necklaces (the bead on the axis of reflection must be the red one, and one of the three transposed pairs must be green). Therefore, the number of orbits is

    (105 + 7 · 3)/|D7| = 126/14 = 9,

which is much more pleasant than trying to count them directly. J
Example 11.3.4. Suppose we wanted to find the number of orbits of 7-bead necklaces with k colors (for instance k = 3), without
specifying how many times each color is to be used. A similar analysis gives the answer

    (k^7 + 7k^4 + 6k)/14.   (11.4)

J
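Both of these counts are easy to confirm by brute force, averaging fixed points over the 14 elements of D7. Here is a Python sketch (my own encoding of the dihedral group as permutations of bead positions):

    from itertools import product

    def necklace_orbits(n, k):
        """Number of orbits of k-colorings of an n-bead necklace under
        the dihedral group D_n, computed via Burnside's formula."""
        rotations = [[(i + r) % n for i in range(n)] for r in range(n)]
        reflections = [[(r - i) % n for i in range(n)] for r in range(n)]
        group = rotations + reflections              # |D_n| = 2n
        fixed = 0
        for g in group:
            fixed += sum(1 for c in product(range(k), repeat=n)
                         if all(c[g[i]] == c[i] for i in range(n)))
        return fixed // len(group)

    print(necklace_orbits(7, 3))    # 198 = (3**7 + 7*3**4 + 6*3) // 14

Restricting the same loop to colorings using four beads of one color, two of a second, and one of a third recovers the count of 9 from Example 11.3.3.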
As this example indicates, it is helpful to look at the cycle structure of the elements of G, or more precisely
of their images α(g) ∈ SX.
Proposition 11.3.5. Let X be a finite set, and let α : G → SX be a group action. Color the elements of
X with k colors, so that G also acts on the colorings.
1. For g ∈ G, the number of fixed points of the action of g is k^{ℓ(g)}, where ℓ(g) is the number of cycles in
the disjoint-cycle representation of α(g).

2. Therefore,

    #(equivalence classes of colorings) = (1/|G|) Σ_{g∈G} k^{ℓ(g)}.   (11.5)
Let's rephrase Example 11.3.4 in this notation. The identity has cycle-shape 1111111 (so ℓ = 7); each of the
seven reflections has cycle-shape 2221 (so ℓ = 4); and each of the six nontrivial rotations has cycle-shape 7 (so ℓ = 1).
Thus (11.4) is an example of the general formula (11.5).
Example 11.3.6. How many ways are there to k-color the vertices of a tetrahedron, up to moving the
tetrahedron around in space?
Here X is the set of four vertices, and the group G acting on X is the alternating group on four elements.
This is the subgroup of S4 consisting of the identity, of cycle-shape 1111; the eight permutations of cycle-shape 31; and the three permutations of cycle-shape 22. Therefore, the number of colorings is

    (k^4 + 11k^2)/12.
J
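Formula (11.5) is just as easy to check by machine. The following Python sketch (function names are mine) realizes G as the even permutations of the four vertices and averages k^{ℓ(g)}:

    from itertools import permutations

    def cycle_count(p):
        """Number of cycles in the disjoint-cycle form of the
        permutation p, given as a tuple with p[i] the image of i."""
        seen, count = set(), 0
        for i in range(len(p)):
            if i not in seen:
                count += 1
                while i not in seen:
                    seen.add(i)
                    i = p[i]
        return count

    def is_even(p):
        n = len(p)   # even permutation = even number of inversions
        return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

    def tetrahedron_colorings(k):
        G = [p for p in permutations(range(4)) if is_even(p)]   # A_4
        return sum(k ** cycle_count(p) for p in G) // len(G)

    print([tetrahedron_colorings(k) for k in (1, 2, 3)])
    # [1, 5, 15], agreeing with (k**4 + 11*k**2) // 12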
11.4 Grassmannians
A standard reference for everything in this and the following section is Fulton [Ful97].
One motivation for the combinatorics of partitions and tableaux comes from classical enumerative geometric
questions like this:
Problem 11.4.1. Let there be given four lines L1 , L2 , L3 , L4 in R3 in general position. How many lines M
meet each of L1 , L2 , L3 , L4 nontrivially?
To a combinatorialist, “general position” means “all pairs of lines are skew, and their direction vectors are as
linearly independent as possible — that is, the matroid they represent is U3 (4).” To a probabilist, it means
“choose the lines randomly according to some reasonable measure on the space of all lines.” So, what does
the space of all lines look like?
In general, if V is a vector space over a field F (which we will henceforth take to be R or C), and 0 ≤ k ≤
dim V , then the space of all k-dimensional vector subspaces of V is called the Grassmannian Gr(k, V ).
(Warning: this notation varies considerably from source to source.) As we will see, Gr(k, V ) has many nice
properties:
• It is a cell complex whose cells (the Schubert cells) are naturally indexed by the partitions λ whose Ferrers diagrams fit in a k × (n − k) rectangle; these form an interval Yk,n in Young's lattice, where ≤ is the usual partial order on Young's lattice (i.e., containment of Ferrers diagrams).

• When F = C, the Poincaré polynomial of Gr(k, C^n), i.e., the Hilbert series of the cohomology ring² of
Gr(k, C^n), is the rank-generating function for the graded poset Yk,n, namely the q-binomial coefficient \binom{n}{k}_q.
To accomplish all this, we need some way to describe points of the Grassmannian. For as long as possible,
we won’t worry about the ground field.
Let W ∈ Gr(k, Fn ); that is, W is a k-dimensional subspace of V = Fn . We can describe W as the column
²If these terms don't make sense, here is a sketch of what you need to know. The cohomology ring H*(X) = H*(X; Q)
of a space X is just some ring that is a topological invariant of X. If X is a reasonably civilized space — say, a compact
finite-dimensional real or complex manifold, or a finite simplicial complex — then H*(X) is a graded ring H^0(X) ⊕ H^1(X) ⊕
· · · ⊕ H^d(X), where d = dim X, and each graded piece H^i(X) is a finite-dimensional Q-vector space. The Poincaré polynomial
records the dimensions of these vector spaces as a generating function:

    Poin(X, q) = Σ_{i=0}^{d} dim_Q H^i(X) q^i.

For lots of spaces, this polynomial has a nice combinatorial formula. For instance, take X = CP^d (complex projective d-space). It
turns out that H*(X) ≅ Q[z]/(z^{d+1}), with z in degree 2. Each graded piece H^{2i}(X), for 0 ≤ i ≤ d, is a 1-dimensional Q-vector space (generated
by z^i), and Poin(X, q) = 1 + q^2 + q^4 + · · · + q^{2d} = (1 − q^{2d+2})/(1 − q^2). In general, if X is a compact orientable
manifold, then Poincaré duality implies (among other things) that Poin(X, q) is a palindrome.
space of an n × k matrix M of full rank:

    M = [ m11 · · · m1k ]
        [  ⋮   ⋱    ⋮  ]
        [ mn1 · · · mnk ]
However, the Grassmannian is not simply the space Z of all such matrices, because many different matrices
can have the same column space. Specifically, any invertible column operation on M leaves its column space
unchanged. On the other hand, every matrix whose column space is W can be obtained from M by some
sequence of invertible column operations; that is, by multiplying on the right by some invertible k ×k matrix.
Accordingly, it makes sense to write
Gr(k, Fn ) = Z/GLk (F). (11.6)
That is, the k-dimensional subspaces of Fn can be identified with the orbits of Z under the action of the
general linear group GLk (F). In fact, as one should expect from (11.6),
dim Gr(k, Fn ) = dim Z − dim GLk (F) = nk − k 2 = k(n − k)
where “dim” means dimension as a manifold over F; note that dim Z = nk because Z is a dense open subset
of Fn×k . (Technically, this dimension calculation does not follow from (11.6) alone; you need to know that
the action of GLk (F) on Z is suitably well-behaved. Nevertheless, we will soon be able to calculate the
dimension of Gr(k, Fn ) more directly.)
We now want to find a canonical representative for each GLk (F)-orbit. In other words, given W ∈ Gr(k, Fn ),
we want the “nicest” matrix whose column space is W . How about the reduced column-echelon form? Basic
linear algebra says that we can pick any matrix with column space W and perform Gauss-Jordan elimination
on its columns, ending up with a uniquely determined matrix M = M (W ) with the following properties:
• colspace M = W .
• The bottom nonzero entry of each column of M (the pivot in that column) is 1.
• Let pi be the row in which the ith column has its pivot. Then 1 ≤ p1 < p2 < · · · < pk ≤ n.
• Every entry below a pivot of M is 0, as is every entry to the right of a pivot.
• The remaining entries of M (i.e., other than the pivots and the 0s just described) can be anything
whatsoever, depending on what W was in the first place.
For example, if n = 4 and k = 2, then M will have one of the following six forms:
    [1 0]  [1 0]  [1 0]  [? ?]  [? ?]  [? ?]
    [0 1]  [0 ?]  [0 ?]  [1 0]  [1 0]  [? ?]
    [0 0]  [0 1]  [0 ?]  [0 1]  [0 ?]  [1 0]   (11.7)
    [0 0]  [0 0]  [0 1]  [0 0]  [0 1]  [0 1]
Note that there is only one subspace W for which M ends up with the first form. At the other extreme, if
the ground field F is infinite and you choose the space W randomly (for a suitable definition of “random”;
consult your local probabilist), then you will almost always end up with a matrix M of the last form.
Definition 11.4.2. Let 0 ≤ k ≤ n and let p = {p1 < · · · < pk} ∈ \binom{[n]}{k} (i.e., p1, . . . , pk are distinct elements
of [n], ordered least to greatest). The Schubert cell Ωp is the set of all elements W ∈ Gr(k, F^n) such that,
for every i, the ith column of M(W) has its pivot in row pi.
Theorem 11.4.3. 1. Every W ∈ Gr(k, F^n) belongs to exactly one Schubert cell; that is, Gr(k, F^n) is the
disjoint union of the subspaces Ωp, for p ∈ \binom{[n]}{k}.

2. For every p ∈ \binom{[n]}{k}, there is a bijection (in fact, a diffeomorphism)

    Ωp → F^{|p|},

where |p| = (p1 − 1) + (p2 − 2) + · · · + (pk − k) = p1 + p2 + · · · + pk − \binom{k+1}{2}.
3. Define a partial order on \binom{[n]}{k} as follows: for p = {p1 < · · · < pk} and q = {q1 < · · · < qk}, set p ≥ q
if pi ≥ qi for every i. Then

    p ≥ q ⟹ Ω̄p ⊇ Ωq.   (11.8)

4. The poset \binom{[n]}{k} is isomorphic to the interval Yk,n in Young's lattice.

5. Gr(k, F^n) is a compactification of the Schubert cell Ω(n−k+1, n−k+2, ..., n) ≅ F^{k(n−k)}. In particular,
dim_F Gr(k, F^n) = k(n − k).
For (2), the map Ωp → F^{|p|} is given by reading off the ?s in the reduced column-echelon form of M(W).
(For instance, let n = 4 and k = 2. Then the matrix representations in (11.7) give explicit diffeomorphisms
of the Schubert cells of Gr(k, Fn ) to F0 , F1 , F2 , F2 , F3 , F4 respectively.) The number of ?s in the i-th column
is pi − i (pi − 1 entries above the pivot, minus i − 1 entries to the right of previous pivots), so the total
number of ?s is |p|.
For (3): This is best illustrated by an example. Consider the second matrix in (11.7):

    M = [1 0]
        [0 z]
        [0 1]
        [0 0]

where I have replaced the entry labeled ? by a parameter z. Here's the trick: multiply the second column
of this matrix by the scalar 1/z. Doing this doesn't change the column span, i.e.,

    colspace M = colspace [1  0 ]
                          [0  1 ]
                          [0 1/z]
                          [0  0 ].

Now you can see that

    lim_{|z|→∞} colspace M = colspace lim_{|z|→∞} [1  0 ]     [1 0]
                                                  [0  1 ]  =  colspace [0 1]
                                                  [0 1/z]              [0 0]
                                                  [0  0 ]              [0 0],
which is the first matrix in (11.7). Therefore, the Schubert cell Ω1,2 is in the closure of the Schubert cell
Ω1,3 . In general, decrementing a single element of p corresponds to taking a limit of column spans in this
way, so the covering relations in the poset \binom{[n]}{k} give containment relations of the form (11.8).
Assertion (4) is purely combinatorial. The elements of Yk,n are partitions λ = (λ1, . . . , λk) such that
n − k ≥ λ1 ≥ · · · ≥ λk ≥ 0. The desired poset isomorphism is p ↦ λp = (pk − k, pk−1 − (k − 1), . . . , p1 − 1).
For example, the six matrix forms displayed in (11.7) correspond as follows:

    p    12   13   14   23     24     34
    λp   ∅    (1)  (2)  (1,1)  (2,1)  (2,2)
(5) now follows because p = (n − k + 1, n − k + 2, . . . , n) is the unique maximal element of \binom{[n]}{k}, and an easy
calculation shows that |p| = k(n − k).
This theorem amounts to a description of Gr(k, Fn ) as a cell complex. (If you have not heard the term “cell
complex” before, now you know what it means: a topological space that is the disjoint union of cells —
that is, of homeomorphic copies of vector spaces — such that the closure of every cell is itself a union of
cells.) Furthermore, the poset isomorphism with Yk,n says that for every i, the number of cells of Gr(k, F^n)
of dimension i is precisely the number of Ferrers diagrams with i blocks that fit inside a k × (n − k) rectangle.
Combinatorially, we may write this equality as follows:

    Σ_i (# Schubert cells of dimension i) q^i = Σ_i #{λ ⊆ k × (n − k) : |λ| = i} q^i = \binom{n}{k}_q.
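For small n and k this equality can be verified directly. The Python sketch below (my own helper names) computes the left side from the pivot sets p and the right side from the q-analogue of Pascal's recurrence, \binom{n}{k}_q = \binom{n-1}{k-1}_q + q^k \binom{n-1}{k}_q (a standard identity, assumed here):

    from itertools import combinations

    def cell_dimension_polynomial(n, k):
        """Coefficient list of the polynomial counting Schubert cells
        of Gr(k, F^n) by dimension |p| = p_1 + ... + p_k - (k+1 choose 2)."""
        coeffs = [0] * (k * (n - k) + 1)
        for p in combinations(range(1, n + 1), k):
            coeffs[sum(p) - k * (k + 1) // 2] += 1
        return coeffs

    def q_binomial(n, k):
        """Coefficient list of the Gaussian binomial [n choose k]_q."""
        if k == 0 or k == n:
            return [1]
        a, b = q_binomial(n - 1, k - 1), q_binomial(n - 1, k)
        out = [0] * (k * (n - k) + 1)
        for i, c in enumerate(a):
            out[i] += c
        for i, c in enumerate(b):
            out[i + k] += c           # the q^k [n-1 choose k]_q term
        return out

    print(cell_dimension_polynomial(4, 2))   # [1, 1, 2, 1, 1]
    print(q_binomial(4, 2))                  # [1, 1, 2, 1, 1]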
Example 11.4.4. If k = 1, then Gr(1, Fn ) is the space of lines through the origin in Fn ; that is, projective
space FP n−1 . As a cell complex, this has one cell of every dimension. For instance, the projective plane is
the union of three cells of dimensions 2, 1, and 0, i.e., a plane, a line and a point. In the standard geometric
picture, the 1-cell and 0-cell together form the “line at infinity”. Meanwhile, the interval Yk,n is a chain of
rank n − 1. Its rank-generating function is 1 + q + q^2 + · · · + q^{n−1}. (For F = C, double the dimensions of all
the cells, and substitute q^2 for q.) J
Remark 11.4.5. If F = C, then Gr(k, Cn ) is a cell complex with no odd-dimensional cells (because, topo-
logically, the dimension of cells is measured over R). Therefore, readers who know some algebraic topology
(see, e.g., [Hat02, §2.2]) may observe that the cellular boundary maps are all zero (because each one has
either zero domain or zero range), so the cellular homology groups are exactly the chain groups. That is, the
Poincaré series of Gr(k, Cn ) is exactly the generating function for the dimensions of the cells. On the other
hand, if F = R, then the boundary maps need not be zero, and the homology can be more complicated.
Indeed, Gr(1, Rn ) = RP n−1 has torsion homology in odd dimensions.
Example 11.4.6. Let n = 4 and k = 2. Here is Yk,n: [Hasse diagram omitted: the six partitions ∅, (1), (2), (1,1), (2,1), (2,2), ordered by containment.]

These six partitions correspond to the six matrix-types in (11.7). The rank-generating function is

    \binom{4}{2}_q = (1 − q^4)(1 − q^3) / ((1 − q^2)(1 − q)) = 1 + q + 2q^2 + q^3 + q^4.
J
Remark 11.4.7. What does all this have to do with enumerative geometry questions such as Problem 11.4.1?
The answer (modulo technical details) is that the cohomology ring H ∗ (X) encodes intersections of sub-
varieties³ of X: for every subvariety Z ⊆ Gr(k, F^n) of codimension i, there is a corresponding element
³If you are more comfortable with differential geometry than algebraic geometry, feel free to think "submanifold" instead of
“subvariety”.
[Z] ∈ H^i(X) (the "cohomology class of Z") such that [Z ∪ Z′] = [Z] + [Z′] and [Z ∩ Z′] = [Z][Z′]. These
equalities hold only if Z and Z 0 are in general position with respect to each other (which has to be defined
precisely), but the consequence is that Problem 11.4.1 reduces to a computation in H ∗ (Gr(k, Fn )): find the
cohomology class [Z] of the subvariety

    Z = {W ∈ Gr(2, C^4) : W ∩ L ≠ 0},

where L is the 2-dimensional subspace corresponding to one of the given lines, and compare [Z]^4 to the cohomology class [•] of a point. In fact, [Z]^4 = 2[•]; this says that the answer
to Problem 11.4.1 is two, which is hardly obvious! To carry out this calculation, one needs to calculate
an explicit presentation of the ring H ∗ (Gr(k, Fn )) as a quotient of a polynomial ring (which requires the
machinery of line bundles and Chern classes, but that’s another story) and then figure out how to express
the cohomology classes of Schubert cells with respect to that presentation. This is the theory of Schubert
polynomials.
There is a corresponding theory for the flag variety, which is the set Fℓ(n) of nested chains of vector spaces

    F• = (0 = F0 ⊂ F1 ⊂ · · · ⊂ Fn = F^n),   dim Fi = i,

or equivalently saturated chains in the (infinite) lattice Ln(F). The flag variety is in fact a smooth manifold
over F of dimension \binom{n}{2}. Like the Grassmannian, it has a decomposition into Schubert cells Xw, which are
indexed by permutations w ∈ Sn rather than partitions, as we now explain.
For every flag F•, we can find a vector space basis {v1, . . . , vn} for F^n such that Fk = F⟨v1, . . . , vk⟩ for all k,
and represent F• by the invertible matrix M ∈ G = GL(n, F) whose columns are v1, . . . , vn. On the other hand, any
ordered basis of the form

    v′k = b1k v1 + b2k v2 + · · · + bkk vk   (k = 1, . . . , n),

where bkk ≠ 0 for all k, defines the same flag. That is, a flag is a coset of B in G, where B is the subgroup
of invertible upper-triangular matrices (the Borel subgroup). Thus the flag variety can be (and often is)
regarded as the quotient G/B. This immediately implies that it is an irreducible algebraic variety (as G
is irreducible, and any image of an irreducible variety is irreducible). Moreover, it is smooth (e.g., because
every point looks like every other point, and so either all points are smooth or all points are singular, and
the latter is impossible) and its dimension is (n − 1) + (n − 2) + · · · + 0 = \binom{n}{2}.
As in the case of the Grassmannian, there is a canonical representative for each coset of B, obtained by
Gaussian elimination, and reading off its pivot entries gives a decomposition

    Fℓ(n) = ⨆_{w∈Sn} Xw.
Each cell Xw is isomorphic to F^{inv(w)}, where inv(w) is the number of inversions of w; recall that inv is the rank function of the Bruhat and weak Bruhat orders on Sn. In fact, the (strong)
Bruhat order is the cell-closure partial order (analogous to (11.8)). It follows that the Poincaré polynomial
of Fℓ(n) is the rank-generating function of Bruhat order, namely

    (1 + q)(1 + q + q^2) · · · (1 + q + · · · + q^{n−1}).
More strongly, it can be shown that the cohomology ring H*(Fℓ(n); Z) is the quotient of Z[x1, . . . , xn] by
the ideal generated by the symmetric polynomials with zero constant term. The closures of the Schubert cells are the Schubert varieties

    X̄w = ⨆_{v≤w} Xv,

where ≤ means (strong) Bruhat order (see Ex. 1.2.13). These are much-studied objects in combinatorics;
for example, determining which Schubert varieties are singular turns out to be a combinatorial question
involving the theory of pattern avoidance. Even more generally, instead of Sn, start with any finite Coxeter
group G (roughly, a group generated by elements of order two — think of them as reflections). Then G has a
combinatorially well-defined partial order also called the Bruhat order, and one can construct a G-analogue
of the flag variety: that is, a smooth manifold whose structure as a cell complex is given by Bruhat order
on G.
11.5 Chern classes and the cohomology of the flag variety

We now describe the calculation of the cohomology ring of Fℓ(n) using Chern classes. This is not intended
to be self-contained, and many facts will be presented as black boxes. The reader who wants the full story
should see a source such as [BT82].
Definition 11.5.1. Let B and F be topological spaces. A bundle with base B and fiber F is a space
E together with a map π : E → B such that
1. If b ∈ B, then π^{-1}(b) ≅ F; and, more strongly,
2. Every b ∈ B has an open neighborhood U such that V := π^{-1}(U) ≅ U × F, and π|V is just
projection on the first coordinate.
Think of a bundle as a family of copies of F parameterized by B and varying continuously. The simplest
example of a bundle is a Cartesian product B × F with π(b, f ) = b; this is called a trivial bundle. Very
often the fiber is a vector space of dimension d, in which case we call the bundle a vector bundle of rank d; when
d = 1 the bundle is a line bundle.
Frequently we require all these spaces to lie in a more structured category than that of topological spaces,
and we require the projection map to be a morphism in that category (e.g., manifolds with diffeomorphisms,
or varieties with algebraic maps).
Example 11.5.2. An example of a nontrivial bundle is a Möbius strip M , where B = S 1 is the central circle
and F = [0, 1] is a line segment. Indeed, a Möbius strip looks like a bunch of line segments parameterized
by a circle, and if U is any small interval in S 1 then the part of the bundle lying over U is just U × [0, 1].
However, the global structure of M is not the same as the cylinder S 1 × I. J
Example 11.5.3. Another important example is the tautological bundle on projective space P^{d−1}F =
Gr(1, F^d). Recall that this is the space of lines ℓ through the origin in F^d. The tautological bundle⁴ T is the
line bundle defined by T_ℓ = ℓ. That is, the fiber over a line is just the set of points on that line. J
Let F be either R or C, and let us work in the category of closed compact manifolds over F. A vector
bundle of rank d is a bundle whose fiber is Fd . (For example, the tautological bundle is a vector bundle of
rank 1.) Standard operations on vector spaces (direct sum, tensor product, dual, etc.) carry over to vector
bundles, defined fiberwise.
Let E be a rank-d vector bundle over M. Its projectivization P(E) is the bundle with fiber P^{d−1}F defined by

    P(E)_m = P(E_m).

⁴The standard symbol for the tautological bundle is actually O(−1); let's not get into why.
That is, a point in P(E) is given by a point m ∈ M and a line ℓ through the origin in E_m ≅ F^d. In turn,
P(E) has a tautological line bundle L = L(E) whose fiber over (ℓ, m) is ℓ.
Associated with the bundle E are certain Chern classes ci (E) ∈ H 2i (M ) for every i, which measure “how
twisty E is.” (The 2 happens because we are talking about a complex manifold.) I will not define these
classes precisely (see [BT82]), but instead will treat them as a black box that lets us calculate cohomology.
The Chern classes have the following properties:
1. c0 (E) = 1 by convention.
2. ci (E) = 0 for i > rank E.
3. If E is trivial then ci (E) = 0 for i > 0.
4. If 0 → E′ → E → E″ → 0 is an exact sequence of M-bundles, then c(E) = c(E′)c(E″), where
c(E) = Σ_i c_i(E) (the "total Chern class").
Here is the main formula, which expresses the cohomology ring of a bundle as a module over the cohomology
of its base.
    H*(P(E); Z) = H*(M; Z)[x] / ⟨x^d + c1(E) x^{d−1} + · · · + c_{d−1}(E) x + c_d(E)⟩,   (11.9)

where x = c1(L).
Example 11.5.4 (Projective space). P^{d−1}C is the projectivization of the trivial rank-d bundle over
M = {•}. Of course H*(M; Z) = Z, so H*(P^{d−1}C; Z) = Z[x]/⟨x^d⟩. J
Example 11.5.5 (The flag variety Fℓ(3)). Let M = P^2 = Gr(1, C^3). Define a bundle E^2 by

    E^2_ℓ = C^3/ℓ.

Then E^2 has rank 2, and P(E^2) is just the flag variety Fℓ(3), because specifying a line in C^3/ℓ is the same
thing as specifying a plane in C^3 containing ℓ. Let L = L(E^2). For each ℓ ∈ M we have an exact sequence
0 → ℓ → C^3 → C^3/ℓ → 0, which gives rise to a short exact sequence of bundles

    0 → O → C^3 → E^2 → 0,

where O is the tautological bundle on M, with c1(O) = x (the generator of H*(M)). The rules for Chern
classes then tell us that

    (1 + x)(1 + c1(E^2) + c2(E^2)) = 1,

so that

    x + c1(E^2) = 0,   x c1(E^2) + c2(E^2) = 0,

i.e., c1(E^2) = −x and c2(E^2) = x^2. Plugging into (11.9), with y = c1(L(E^2)), then gives

    H*(Fℓ(3); Z) ≅ Z[x, y] / ⟨x^3, y^2 − xy + x^2⟩. J
More generally, Fℓ(n) can be built as an iterated projectivization: start with X0 = {•} and the trivial bundle E0 with fiber C^n, and let

X1 = P(E0). Let E1 be the rank-(n − 1) bundle whose fiber over a line E1 is C^n/E1.

X2 = P(E1). This is the partial flag variety of flags E• : 0 = E0 ⊆ E1 ⊆ E2. Let E2 be the rank-(n − 2)
bundle whose fiber over E• is C^n/E2.

Iterating, we end up with generators x1, . . . , xn, one for the tautological bundle of each Ei. The relations turn out to
be the elementary symmetric polynomials in them. That is,

    H*(Fℓ(n)) ≅ Q[x1, . . . , xn] / ⟨e1, e2, . . . , en⟩.
The Poincaré polynomial of the flag variety (i.e., the Hilbert series of its cohomology ring) can be worked
out explicitly. Modulo the elementary symmetric functions, every polynomial can be written as a sum of
monomials of the form

    x1^{a1} x2^{a2} · · · xn^{an}

where ai < i for all i. Therefore,

    Poin(Fℓ(n), q) = Σ_k q^k dim_Q H^{2k}(Fℓ(n)) = (1)(1 + q)(1 + q + q^2) · · · (1 + q + · · · + q^{n−1}) = [n]_q! = Σ_{w∈Sn} q^{inv(w)},

where Sn is the symmetric group on n letters and inv(w) = #{(i, j) : i < j, w(i) > w(j)} is the number of inversions of w.
In fact the flag variety has a natural cell decomposition into Schubert cells. Given any flag
E• : 0 = E0 ⊆ E1 ⊆ · · · ⊆ En = Cn
construct an n × n matrix [v1 | · · · |vn] in which the first k columns are a basis of Ek, for every k. We can
canonicalize the matrix as follows:
• Scale the first column so that its bottom nonzero entry is 1. Say this occurs in row w1 .
• Add an appropriate multiple of v1 to each of v2 , . . . , vn so as to kill off the entry in row w1 . Note that
this does not change the flag.
• Scale the second column so that its bottom nonzero entry is 1. Say this occurs in row w2 . Note that
w2 6= w1 .
• Add an appropriate multiple of v2 to each of v3, . . . , vn so as to kill off the entry in row w2.
• Repeat.
238
(Here we are really using the description
F `(n) = GLn /B
where B is the Borel subgroup of upper-triangular invertible matrices. The column operations that we have
done correspond to choosing a canonical element of each coset of B in GLn .)
We end up with a matrix that includes a "pivot" 1 in each row and column, with zeroes below and to the
right of every 1. The pivots define a permutation w ∈ Sn. For example, if w = 4132 then the matrix will
have the form

    [∗ 1 0 0]
    [∗ 0 ∗ 1]
    [∗ 0 1 0]
    [1 0 0 0].

The set X°4132 of all matrices of this type is a subspace of Fℓ(4) that is in fact isomorphic to C^4 — the stars
are affine coordinates. Thus we obtain a decomposition into Schubert cells

    Fℓ(n) = ⨆_{w∈Sn} X°w,
and moreover the stars correspond precisely to inversions of w. This gives the Poincaré polynomial.
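Here is a quick computational confirmation that the two descriptions of the Poincaré polynomial agree, i.e., that Σ_{w∈Sn} q^{inv(w)} = (1)(1 + q) · · · (1 + q + · · · + q^{n−1}) (a Python sketch, with my own function names):

    from itertools import permutations

    def inv(w):
        """Number of inversions of the permutation w (a tuple)."""
        n = len(w)
        return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])

    def poincare_by_cells(n):
        """Number of Schubert cells of Fl(n) of each dimension."""
        coeffs = [0] * (n * (n - 1) // 2 + 1)
        for w in permutations(range(1, n + 1)):
            coeffs[inv(w)] += 1
        return coeffs

    def poincare_by_product(n):
        """Coefficients of (1)(1+q)(1+q+q^2)...(1+q+...+q^(n-1))."""
        coeffs = [1]
        for m in range(1, n + 1):
            coeffs = [sum(coeffs[i - j] for j in range(m)
                          if 0 <= i - j < len(coeffs))
                      for i in range(len(coeffs) + m - 1)]
        return coeffs

    print(poincare_by_cells(4))      # [1, 3, 5, 6, 5, 3, 1]
    print(poincare_by_product(4))    # the same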
The closure of a Schubert cell is called a Schubert variety. The cohomology classes of Schubert varieties
are also a vector space basis for H*(Fℓ(n)), and there is a whole theory of how to translate between the
“algebraic” basis (coming from line bundles) and the “geometric” basis (Schubert varieties).
11.6 Exercises

Exercise 11.1. A matching in a graph G is a set of pairwise disjoint edges; let µ(G) denote the maximum size of a matching. A vertex cover is a set of vertices that includes at least one endpoint of every edge; let β(G) denote the minimum size of a vertex cover.

(a) (Warmup) Show that µ(G) ≤ β(G) for every graph G. Exhibit a graph for which the inequality is
strict.

(b) The König-Egerváry Theorem asserts that µ(G) = β(G) whenever G is bipartite, i.e., the vertices
of G can be partitioned as X ∪ Y so that every edge has one endpoint in each of X, Y. Derive the
König-Egerváry Theorem as a consequence of the Max-Flow/Min-Cut Theorem.

(c) Prove that the König-Egerváry Theorem and Dilworth's Theorem imply each other.
Pólya theory
Exercise 11.3. Let n ≥ 2 and for σ ∈ Sn, let f(σ) denote the number of fixed points of σ. Prove that for every
k ≥ 1, the number

    (1/n!) Σ_{σ∈Sn} f(σ)^k

is an integer.
Appendix: Catalan Numbers
The Catalan numbers are ubiquitous in combinatorics. A famous exercise in volume 2 of Stanley’s Enumer-
ative Combinatorics [Sta99, Exercise 6.19] lists 66 combinatorial interpretations of the Catalan numbers and
asks the reader to come up with \binom{66}{2} bijections between them. That was in 1999; more recently, Stanley
wrote an entire monograph [Sta15] with 214 interpretations. Here we’ll just review the basics.
A Dyck path of size n is a path from (0, 0) to (2n, 0) in R2 consisting of n up-steps and n down-steps
that stays (weakly) above the x-axis. [Figure omitted: an example path P of size 4.]

We can denote Dyck paths efficiently by a list of U's and D's; the path P shown above is UUDUUDDD. Each
up-step can be thought of as a left parenthesis, and each down-step as a right parenthesis, so we could also
write P = (()(())). The requirement of staying above the x-axis then says that each right parenthesis must
close a previous left parenthesis.
Proposition 11.6.1. The number of Dyck paths of size n is the Catalan number Cn .
Sketch of proof. The proof is an illustration of the Sheep Principle (“in order to count the sheep in a flock,
count the legs and divide by four”). Consider the family L of all lattice paths from (0, 0) to (2n + 1, −1)
consisting of n up-steps and n + 1 down-steps (with no restrictions); evidently |L| = \binom{2n+1}{n}.
Consider the action of the cyclic group Z2n+1 on L by cyclic rotation. First, the orbits all have size 2n + 1.
(There is no way that a nontrivial element of Z2n+1 can fix the locations of the up-steps, essentially because
gcd(2n + 1, n) = 1 — details left to the reader.) Second, each orbit contains exactly one augmented Dyck
240
path, i.e., a Dyck path followed by a down-step. (Of all the lowest points in a path, find the leftmost one
and call it z. Rotate so that the last step is the down-step that lands at z.)
Figure 11.6: Rotating the lattice path UDDUDD|UDU to obtain the augmented Dyck path UDU|UDDUDD.
Every (augmented) Dyck path arises in this way, so we have a bijection. The orbits are sheep and each sheep
has 2n + 1 legs, so the number of Dyck paths is

    (1/(2n+1)) \binom{2n+1}{n} = (2n+1)! / ((2n+1)(n+1)! n!) = (2n)! / ((n+1)! n!) = (1/(n+1)) \binom{2n}{n}.
To show that a class of combinatorial objects is enumerated by the Catalan numbers, one can now find a
bijection to Dyck paths. A few of the most commonly encountered interpretations of Cn are: triangulations of a convex (n + 2)-gon; binary trees with n internal vertices; and sequences of n 1's and n −1's with all partial sums nonnegative. Others will be encountered in the course of these notes. For details, see [Sta99] or [Sta15]. Another core
feature of the Catalan numbers is that they satisfy the following recurrence:
    Cn = C_{n−1} + Σ_{k=1}^{n−1} C_{k−1} C_{n−k}   for n ≥ 1.   (11.10)
This equation can be checked by a banal induction argument, but it is also worthwhile seeing the combina-
torial reason for it. Call a Dyck path of size n primitive if it stays strictly above the x-axis for 0 < x < 2n.
If a path P is primitive, then it is of the form UP′D for some Dyck path P′ of size n − 1 (not necessarily
primitive); this accounts for the Cn−1 term in the Catalan recurrence. Otherwise, let (2k, 0) be the smallest
positive x-intercept, so that 1 ≤ k ≤ n − 1. The part of the path from (0, 0) to (2k, 0) is a primitive Dyck
path of size k, and the part from (2k, 0) to (2n, 0) is a Dyck path of size n − k, not necessarily primitive.
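The recurrence is easy to cross-check against the closed formula from the proof of Proposition 11.6.1 (a short Python sketch):

    from math import comb

    def catalan_by_recurrence(N):
        """C_0, ..., C_N via the recurrence (11.10)."""
        C = [1]                       # C_0 = 1 (the empty Dyck path)
        for n in range(1, N + 1):
            C.append(C[n - 1] + sum(C[k - 1] * C[n - k] for k in range(1, n)))
        return C

    print(catalan_by_recurrence(8))
    print([comb(2 * n, n) // (n + 1) for n in range(9)])
    # both print [1, 1, 2, 5, 14, 42, 132, 429, 1430]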
Notational Index
Basics
J End of an example
[n] {1, . . . , n}
N nonnegative integers 0, 1, 2, . . . \mathbb{N}
N>0 positive integers 1, 2, . . . \mathbb{P}
2^S power set of a set S (or the associated poset)
∪· disjoint union \cupdot (requires header.tex)
△ symmetric difference A△B = (A ∪ B) \ (A ∩ B) \triangle
Sn symmetric group on n letters \mathfrak{S}_n
\binom{S}{k} set of k-element subsets of a set S \binom{S}{k}
Cn Catalan numbers
k⟨v1, . . . , vn⟩ k-vector space with basis {v1, . . . , vn}
Posets
Lattices
Poset Algebra
Hyperplane Arrangements
Representation Theory
Symmetric Functions
Combinatorial Algebraic Varieties

Hopf Algebras
µ product
∆ coproduct
u unit
ε counit
S antipode
Bibliography
[AA17] Marcelo Aguiar and Federico Ardila, Hopf monoids and generalized permutahedra, preprint,
arXiv:1709.07504, 2017. 132, 218
[ABS06] Marcelo Aguiar, Nantel Bergeron, and Frank Sottile, Combinatorial Hopf algebras and
generalized Dehn-Sommerville relations, Compos. Math. 142 (2006), no. 1, 1–30. MR 2196760
218
[AM10] Marcelo Aguiar and Swapneel Mahajan, Monoidal functors, species and Hopf algebras, CRM
Monograph Series, vol. 29, American Mathematical Society, Providence, RI, 2010, With
forewords by Kenneth Brown and Stephen Chase and André Joyal. MR 2724388 218
[Ath96] Christos A. Athanasiadis, Characteristic polynomials of subspace arrangements and finite fields,
Adv. Math. 122 (1996), no. 2, 193–233. MR 1409420 (97k:52012) 99
[BB05] Anders Björner and Francesco Brenti, Combinatorics of Coxeter Groups, Graduate Texts in
Mathematics, vol. 231, Springer, New York, 2005. MR 2133266 (2006d:05001) 16
[BF01] Thomas Britz and Sergey Fomin, Finite posets and Ferrers shapes, Adv. Math. 158 (2001),
no. 1, 86–127. MR 1814900 227
[BH93] Winfried Bruns and Jürgen Herzog, Cohen-Macaulay Rings, Cambridge Studies in Advanced
Mathematics, vol. 39, Cambridge University Press, Cambridge, 1993. MR 1251956 (95h:13020)
114, 119
[BLVS+ 99] Anders Björner, Michel Las Vergnas, Bernd Sturmfels, Neil White, and Günter M. Ziegler,
Oriented matroids, second ed., Encyclopedia of Mathematics and its Applications, vol. 46,
Cambridge University Press, Cambridge, 1999. MR 1744046 108
[BM71] H. Bruggesser and P. Mani, Shellable decompositions of cells and spheres, Math. Scand. 29
(1971), 197–205 (1972). MR 0328944 (48 #7286) 130
[BO92] Thomas Brylawski and James Oxley, The Tutte polynomial and its applications, Matroid
applications, Encyclopedia Math. Appl., vol. 40, Cambridge Univ. Press, Cambridge, 1992,
pp. 123–225. MR 1165543 (93k:05060) 86
[Bol98] Béla Bollobás, Modern Graph Theory, Graduate Texts in Mathematics, vol. 184,
Springer-Verlag, New York, 1998. MR 1633290 (99h:05001) 78
[BR07] Matthias Beck and Sinai Robins, Computing the Continuous Discretely, Undergraduate Texts
in Mathematics, Springer, New York, 2007, Integer-point enumeration in polyhedra. MR
2271992 (2007h:11119) 84
[Bri73] Egbert Brieskorn, Sur les groupes de tresses [d'après V. I. Arnol′d], Séminaire Bourbaki,
24ème année (1971/1972), Exp. No. 401, Springer, Berlin, 1973, pp. 21–44. Lecture Notes in
Math., Vol. 317. 104, 105
[BT82] Raoul Bott and Loring W. Tu, Differential Forms in Algebraic Topology, Graduate Texts in
Mathematics, vol. 82, Springer-Verlag, New York-Berlin, 1982. MR 658304 (83i:57016) 236, 237
[CR70] Henry H. Crapo and Gian-Carlo Rota, On the Foundations of Combinatorial Theory:
Combinatorial Geometries, preliminary ed., The M.I.T. Press, Cambridge, Mass.-London, 1970.
MR 0290980 99
[Dir61] G.A. Dirac, On rigid circuit graphs, Abh. Math. Sem. Univ. Hamburg 25 (1961), 71–76. MR
0130190 (24 #A57) 104
[FS05] Eva Maria Feichtner and Bernd Sturmfels, Matroid polytopes, nested sets and Bergman fans,
Port. Math. (N.S.) 62 (2005), no. 4, 437–468. MR 2191630 132
[Ful97] William Fulton, Young Tableaux, London Mathematical Society Student Texts, vol. 35,
Cambridge University Press, Cambridge, 1997. 188, 206, 231
[GK76] Curtis Greene and Daniel J. Kleitman, The structure of Sperner k-families, J. Combin. Theory
Ser. A 20 (1976), 41–68. 227
[Gre76] Curtis Greene, Some partitions associated with a partially ordered set, J. Combin. Theory Ser.
A 20 (1976), 69–79. 227
[GSS93] Jack Graver, Brigitte Servatius, and Herman Servatius, Combinatorial Rigidity, Graduate
Studies in Mathematics, vol. 2, American Mathematical Society, Providence, RI, 1993. MR
1251062 50
[Hat02] Allen Hatcher, Algebraic Topology, Cambridge University Press, Cambridge, 2002. MR 1867354
(2002k:55001) 114, 116, 117, 234
[HS15] Joshua Hallam and Bruce Sagan, Factoring the characteristic polynomial of a lattice, J.
Combin. Theory Ser. A 136 (2015), 39–63. MR 3383266 104
[Hum90] James E. Humphreys, Reflection Groups and Coxeter Groups, Cambridge Studies in Advanced
Mathematics, vol. 29, Cambridge University Press, Cambridge, 1990. MR 1066460 (92h:20002)
16
[Hum11] Brandon Humpert, Polynomials Associated with Graph Coloring and Orientations, Ph.D.
thesis, University of Kansas, 2011. 217
[Kli] Caroline J. Klivans, A quasisymmetric function for generalized permutahedra, Talk slides,
http://www.dam.brown.edu/people/cklivans/birs.pdf. 218
[LS00] Shu-Chung Liu and Bruce E. Sagan, Left-modular elements of lattices, J. Combin. Theory Ser.
A 91 (2000), no. 1-2, 369–385. MR 1780030 48
[MR05] Jeremy L. Martin and Victor Reiner, Cyclotomic and simplicial matroids, Israel J. Math. 150
(2005), 229–240. MR 2255809 71
[MS05] Ezra Miller and Bernd Sturmfels, Combinatorial Commutative Algebra, Graduate Texts in
Mathematics, vol. 227, Springer-Verlag, New York, 2005. MR 2110098 (2006d:13001) 114
[OS80] Peter Orlik and Louis Solomon, Combinatorics and topology of complements of hyperplanes,
Invent. Math. 56 (1980), no. 2, 167–189. 105
[OT92] Peter Orlik and Hiroaki Terao, Arrangements of Hyperplanes, Grundlehren der
Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 300,
Springer-Verlag, Berlin, 1992. 87
[Oxl92] James G. Oxley, Matroid Theory, Oxford Science Publications, The Clarendon Press Oxford
University Press, New York, 1992. 50, 62, 63
[Per16] Xander Perrott, Existence of projective planes, arXiv:1603.05333 (retrieved 8/26/18), 2016. 62
[Pos09] Alexander Postnikov, Permutohedra, associahedra, and beyond, Int. Math. Res. Not. (2009),
no. 6, 1026–1106. MR 2487491 132
[PRW08] Alex Postnikov, Victor Reiner, and Lauren Williams, Faces of generalized permutohedra, Doc.
Math. 13 (2008), 207–273. MR 2520477 132
[Rei] Victor Reiner, Lectures on matroids and oriented matroids, Lecture notes for Algebraic
Combinatorics in Europe (ACE) Summer School in Vienna, July 2005; available at
http://www-users.math.umn.edu/~reiner/Talks/Vienna05/Lectures.pdf. 108, 112
[RGZ97] Jürgen Richter-Gebert and Günter M. Ziegler, Oriented matroids, Handbook of discrete and
computational geometry, CRC Press Ser. Discrete Math. Appl., CRC, Boca Raton, FL, 1997,
pp. 111–132. MR 1730162 108, 109
[Ryb11] G.L. Rybnikov, On the fundamental group of the complement of a complex hyperplane
arrangement, Funktsional. Anal. i Prilozhen. 45 (2011), no. 2, 71–85. 105
[S+ 14] W. A. Stein et al., Sage Mathematics Software (Version 6.4.1), The Sage Development Team,
2014, http://www.sagemath.org. 88
[Sag01] Bruce E. Sagan, The Symmetric Group, second ed., Graduate Texts in Mathematics, vol. 203,
Springer-Verlag, New York, 2001. 177
[Sch13] Alexander Schrijver, A Course in Combinatorial Optimization, Available online at
http://homepages.cwi.nl/~lex/files/dict.pdf (retrieved 1/21/15), 2013. 126, 128
[Sta95] Richard P. Stanley, A symmetric function generalization of the chromatic polynomial of a
graph, Adv. Math. 111 (1995), no. 1, 166–194. MR 1317387 210
[Sta96] , Combinatorics and Commutative Algebra, second ed., Progress in Mathematics,
vol. 41, Birkhäuser Boston Inc., Boston, MA, 1996. MR 1453579 (98h:05001) 114
[Sta99] , Enumerative Combinatorics. Volume 2, Cambridge Studies in Advanced Mathematics,
vol. 62, Cambridge University Press, Cambridge, 1999. 240, 241
[Sta07] , An introduction to hyperplane arrangements, Geometric combinatorics, IAS/Park City
Math. Ser., vol. 13, Amer. Math. Soc., Providence, RI, 2007, Also available online:
http://www-math.mit.edu/~rstan/arrangements/arr.html, pp. 389–496. 87, 100, 103, 113
[Sta12] , Enumerative Combinatorics. Volume 1, second ed., Cambridge Studies in Advanced
Mathematics, vol. 49, Cambridge University Press, Cambridge, 2012. MR 2868112 38
[Sta15] , Catalan Numbers, Cambridge University Press, New York, 2015. MR 3467982 240, 241
[Tak71] Mitsuhiro Takeuchi, Free Hopf algebras generated by coalgebras, J. Math. Soc. Japan 23 (1971),
561–582. MR 0292876 216
[Tut54] W. T. Tutte, A contribution to the theory of chromatic polynomials, Canad. J. Math. 6 (1954),
80–91. MR 61366 80
[Wes96] Douglas B. West, Introduction to Graph Theory, Prentice Hall, Inc., Upper Saddle River, NJ,
1996. MR 1367739 (96i:05001) 104
[Zas75] Thomas Zaslavsky, Facing up to arrangements: face-count formulas for partitions of space by
hyperplanes, Mem. Amer. Math. Soc. 1 (1975), no. issue 1, 154, vii+102. MR 0357135 93
[Zie95] Günter M. Ziegler, Lectures on polytopes, Graduate Texts in Mathematics, vol. 152,
Springer-Verlag, New York, 1995. MR 1311028 126, 132
A “pointillist” picture of the essentialized braid arrangement ess(Br4 ), produced by a computer glitch.