
SPBVNT

This document provides a new proof of the Birkhoff-von Neumann Theorem, which states that every doubly stochastic matrix is a convex combination of permutation matrices. The proof shows that the extreme points of the polytope defined by the constraints of the doubly stochastic property correspond to permutation matrices. Therefore, by Straszewicz's theorem every doubly stochastic matrix is a convex combination of its extreme points, which are the permutation matrices. The proof avoids using an inductive argument or the permutation matrix lemma used in other proofs.

Uploaded by

Jayakumar

A SHORT PROOF OF THE BIRKHOFF-VON NEUMANN THEOREM

GLENN HURLBERT

Abstract. The Birkhoff-von Neumann Theorem has been proved many times in the literature
with a number of different methods, some inductive, some constructive, some existential. We offer a
new proof that is a little more direct than most, though nonconstructive.

Key words. Doubly stochastic matrices, Convex combinations, Permutation matrices.

AMS subject classifications. 15A51, 52A20.

1. Introduction. A vector is stochastic if it is nonnegative and its components
sum to 1. A matrix is doubly stochastic (DS) if each of its rows and columns is
stochastic. A permutation matrix is a square {0, 1}-matrix with exactly one 1 per row
and per column. The identity matrix is an example of a permutation matrix; indeed,
every permutation matrix is a rearrangement of the columns (or rows) of an identity
matrix. A permutation matrix is one type of doubly stochastic matrix; in fact, every
integral doubly stochastic matrix is a permutation matrix. It is elementary that every
convex combination of permutation matrices is DS. The converse is a 1936 theorem
of König [7] (Chapter XIV, Section 3, in the context of generalizing the factorization
of regular bipartite graphs), typically attributed instead to the 1946 and 1953 work
of Birkhoff [2] and von Neumann [8], respectively.
Theorem 1.1. Every DS matrix is a convex combination of permutation matrices.
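As a small sanity check of the elementary direction, and of the kind of decomposition that Theorem 1.1 asserts, the following Python sketch builds a convex combination of three 3 × 3 permutation matrices and verifies that the result is DS. The helper names (permutation_matrix, convex_combination, is_doubly_stochastic) and the chosen weights are illustrative assumptions, not part of the paper; exact rational arithmetic is used so that sums can be compared with 1 exactly.

```python
from fractions import Fraction


def permutation_matrix(perm):
    """Permutation matrix with a 1 in position (r, perm[r]) of each row r."""
    n = len(perm)
    return [[1 if perm[r] == s else 0 for s in range(n)] for r in range(n)]


def convex_combination(weights, matrices):
    """Entrywise sum of w * M over nonnegative weights that sum to 1."""
    assert all(w >= 0 for w in weights) and sum(weights) == 1
    n = len(matrices[0])
    return [[sum(w * m[r][s] for w, m in zip(weights, matrices))
             for s in range(n)] for r in range(n)]


def is_doubly_stochastic(x):
    """Square, nonnegative, and every row and column sums to 1."""
    n = len(x)
    return (all(len(row) == n for row in x)
            and all(v >= 0 for row in x for v in row)
            and all(sum(row) == 1 for row in x)
            and all(sum(x[r][s] for r in range(n)) == 1 for s in range(n)))


# Hypothetical example data: three cyclic shifts with weights 1/2, 1/3, 1/6.
weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
perms = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
X = convex_combination(weights, [permutation_matrix(p) for p in perms])
print(is_doubly_stochastic(X))
```

Theorem 1.1 is the converse statement: any DS matrix, however it is produced, admits some such list of weights and permutations.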
The traditional proof uses induction by removing an appropriate fraction of a
permutation matrix P from the given DS matrix, and various methods have been
devised to find such a P, including von Neumann's iterated scheme (similar to our
method below) as well as linear optimization (see [3, 4, 6]), essentially an application
of the integrality theorem for networks. Edmonds' proof (given on p. 331 of [3])
applies network theory more directly, instead of to the permutation matrix lemma.
Another interesting proof is found in [9], reminiscent of the Frobenius-König theorem
(see [1], p. 62) characterising 0-permanent matrices. The proof in [5] uses induction
directly to prove Theorem 1.1. Our proof is also direct, avoiding the permutation
matrix lemma; however, it is consequently nonconstructive. The motivation for this
approach comes from presenting the material to undergraduates in the context of [6].

Department of Mathematics and Statistics, Arizona State University, Tempe, Arizona 85287-1804, USA ([email protected]).

Proof. The key idea, as many have pointed out, is to think of an $n \times n$ matrix as
a vector in $\mathbb{R}^{n^2}$. The strategy is to use the DS property to impose linear constraints
on such vectors. If the extreme points of the polytope defined by the constraints
correspond to permutation matrices (the bulk of the work in the proof) then the
result follows by Straszewicz's theorem [10] that every polytope is the convex hull of
its extreme points.
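We let $X = (x_{r,s})$ be an $n \times n$ DS matrix. The constraints of the system on $\{x_{r,s}\}$
that define the DS property are as follows.

\[
\begin{aligned}
\sum_{r=1}^{n} x_{r,s} &= 1 && (1 \le s \le n) \\
\sum_{s=1}^{n} x_{r,s} &= 1 && (1 \le r \le n) \\
x_{r,s} &\ge 0 && (1 \le r \le n,\ 1 \le s \le n)
\end{aligned}
\]

The polyhedron $P$ defined above is a polytope since the linear constraints imply that
each $0 \le x_{r,s} \le 1$, and so $P$ is bounded.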
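To make the polytope concrete, it may help to work out the smallest nontrivial case, $n = 2$. The parameter $t$ below is introduced only for this illustration and does not appear in the paper.

```latex
% Worked instance for n = 2 (illustrative; t is a hypothetical parameter).
% Put t = x_{1,1}. The row and column constraints then force
%   x_{1,2} = 1 - t,   x_{2,1} = 1 - t,   x_{2,2} = 1 - x_{1,2} = t,
% and nonnegativity gives 0 <= t <= 1. Hence
\[
P \;=\; \left\{
\begin{pmatrix} t & 1-t \\ 1-t & t \end{pmatrix} : 0 \le t \le 1
\right\}
\;=\; \left\{
t \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
+ (1-t) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} : 0 \le t \le 1
\right\}.
\]
```

In this case $P$ is a line segment in $\mathbb{R}^4$ whose two endpoints ($t = 1$ and $t = 0$) are exactly the two $2 \times 2$ permutation matrices, and every point with $0 < t < 1$ is nonintegral and lies strictly inside the segment.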
We now proceed to show that every extreme point of $P$ is integral, by contrapositive. We will show that any nonintegral point of $P$ is the center of some line segment
residing inside $P$.
Suppose that $x \in P$ is not integral, and let $0 < x_{r_1,s_1} < 1$. Because of the row
constraint $\sum_{s=1}^{n} x_{r_1,s} = 1$, there must be some $s_2 \ne s_1$ such that $0 < x_{r_1,s_2} < 1$. Likewise,
because of the column constraint $\sum_{r=1}^{n} x_{r,s_2} = 1$, there must be some $r_2 \ne r_1$ such that
$0 < x_{r_2,s_2} < 1$. This process can be iterated, and we will stop when some index
$(r, s)$ is repeated. Moreover, we will assume that we chose the iterated process having
the shortest such sequence of indices. Then we know that the final index is the first
repeated index, namely $(r_1, s_1)$.
We claim that there is some $k$ satisfying $(r_k, s_k) = (r_1, s_1)$; that is, the length of
the sequence is even, since otherwise a shorter sequence can be found. Suppose not, say
$(r_k, s_{k+1}) = (r_1, s_1)$. Then, because $(r_k, s_{k+1})$, $(r_1, s_1)$ and $(r_1, s_2)$ are all in the same
row, by deleting $(r_1, s_2)$ and starting instead at $(r_2, s_2)$ we obtain a valid sequence
that is shorter, a contradiction.
Now let $\varepsilon_0 = \min\{x_{r_j,s_j},\ 1 - x_{r_j,s_j},\ x_{r_j,s_{j+1}},\ 1 - x_{r_j,s_{j+1}}\}_{j=1}^{k}$. Then for any $0 < \varepsilon < \varepsilon_0$ define
$x^+(\varepsilon)$ (resp. $x^-(\varepsilon)$) by increasing (resp. decreasing) the value of each $x_{r_j,s_j}$ by $\varepsilon$,
while decreasing (resp. increasing) the value of each $x_{r_j,s_{j+1}}$ by $\varepsilon$. Note that $x^+(\varepsilon)$
(resp. $x^-(\varepsilon)$) $\in P$. Indeed, increasing $x_{r_j,s_j}$ and decreasing $x_{r_j,s_{j+1}}$ by the same amount
maintains the sum of 1 in row $r_j$, while preventing both $x_{r_j,s_j} > 1$ and $x_{r_j,s_{j+1}} < 0$
because $\varepsilon < \varepsilon_0$. The same argument applies to column sum preservation. This shows
that $x^+(\varepsilon) \in P$. The analogous argument shows that $x^-(\varepsilon) \in P$.
Thus we have shown that the line segment $x^-(\varepsilon)x^+(\varepsilon)$ lies entirely in $P$ and has
$x$ as its center. Therefore, $x$ is not extreme. Hence, every extreme point of $P$ is
integral, and so corresponds to a permutation matrix. Thus every DS matrix is a
convex combination of permutation matrices.
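The perturbation step can be tried out numerically. The Python sketch below is an illustration under stated assumptions rather than the paper's exact procedure: instead of selecting a shortest index sequence, it walks the bipartite graph whose vertices are the rows and columns and whose edges are the fractional entries, stopping at the first repeated vertex. A cycle found this way has even length automatically, so alternating $+\varepsilon$ and $-\varepsilon$ along it preserves every row and column sum. The function names and the choice $\varepsilon = \varepsilon_0 / 2$ are hypothetical.

```python
from fractions import Fraction


def is_doubly_stochastic(x):
    """Square, nonnegative, and every row and column sums to 1."""
    n = len(x)
    return (all(len(row) == n for row in x)
            and all(v >= 0 for row in x for v in row)
            and all(sum(row) == 1 for row in x)
            and all(sum(x[r][s] for r in range(n)) == 1 for s in range(n)))


def fractional_cycle(x):
    """Even cycle of fractional entries alternating between sharing a
    column and sharing a row, or None if x is integral. Assumes x is DS."""
    n = len(x)
    frac = lambda r, s: 0 < x[r][s] < 1
    start = next(((r, s) for r in range(n) for s in range(n) if frac(r, s)), None)
    if start is None:
        return None
    vertices = [("row", start[0]), ("col", start[1])]
    edges = [start]
    while True:
        kind, idx = vertices[-1]
        last = edges[-1]
        # A row or column of a DS matrix with one fractional entry has another.
        if kind == "col":
            edge = next((q, idx) for q in range(n) if (q, idx) != last and frac(q, idx))
            vertex = ("row", edge[0])
        else:
            edge = next((idx, t) for t in range(n) if (idx, t) != last and frac(idx, t))
            vertex = ("col", edge[1])
        edges.append(edge)
        if vertex in vertices:
            # Close the cycle at the first occurrence of the repeated vertex.
            return edges[vertices.index(vertex):]
        vertices.append(vertex)


def perturb(x):
    """Return (x_plus, x_minus) with midpoint x, or None if x is integral."""
    cycle = fractional_cycle(x)
    if cycle is None:
        return None
    eps0 = min(min(x[r][s], 1 - x[r][s]) for r, s in cycle)
    eps = eps0 / 2  # any 0 < eps < eps0 would do
    plus = [row[:] for row in x]
    minus = [row[:] for row in x]
    for k, (r, s) in enumerate(cycle):
        sign = 1 if k % 2 == 0 else -1
        plus[r][s] += sign * eps
        minus[r][s] -= sign * eps
    return plus, minus


# Hypothetical nonintegral DS matrix used only for this illustration.
h = Fraction(1, 2)
x = [[h, h, 0], [h, 0, h], [0, h, h]]
x_plus, x_minus = perturb(x)
print(is_doubly_stochastic(x_plus), is_doubly_stochastic(x_minus))
```

Both perturbed matrices remain in the polytope and differ from each other, so the example matrix is the midpoint of a nondegenerate segment and cannot be extreme. For a permutation matrix no fractional entry exists and perturb returns None.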
REFERENCES

[1] R.B. Bapat and T.E.S. Raghavan, Nonnegative Matrices and Applications, Encyclopedia of
Mathematics and its Applications 64, Cambridge University Press, Cambridge, 1997.
[2] G. Birkhoff, Three observations on linear algebra, Univ. Nac. Tucumán Rev. Ser. A 5:147–151,
1946.
[3] V. Chvátal, Linear Programming, W. H. Freeman, New York, 1983.
[4] G.B. Dantzig, Applications of the simplex method to a transportation problem, Activity Analysis
of Production and Allocation, Cowles Commission Monograph 13:359–373, John Wiley &
Sons, New York, 1951.
[5] A.J. Hoffman and H.W. Wielandt, The variation of the spectrum of a normal matrix, Duke
Math. J. 20:37–39, 1953.
[6] G. Hurlbert, Linear Optimization: The Simplex Workbook, Springer-Verlag, New York, in press.
[7] D. König, Theorie der endlichen und unendlichen Graphen, Akademische Verlagsgesellschaft,
Leipzig, 1936.
[8] J. von Neumann, A certain zero-sum two-person game equivalent to an optimal assignment
problem, Ann. Math. Studies 28:5–12, 1953.
[9] J.V. Romanovsky, A simple proof of the Birkhoff-von Neumann theorem on bistochastic matrices,
A tribute to Ilya Bakelman, Discourses Math. Appl. 3:51–53, Texas A & M Univ., College
Station, TX, 1994.
[10] S. Straszewicz, Über exponierte Punkte abgeschlossener Punktmengen, Fundam. Math. 24:139–143,
1935.
