
Lazebnik 1996

The document discusses solving systems of linear diophantine equations. It introduces the Smith normal form, which transforms a matrix of coefficients into a diagonal matrix that can be used to determine if an integer solution exists. An algorithm for computing the Smith normal form is also presented. Applications to solving systems of linear diophantine equations and proving van der Waerden's theorem are provided.


VOL. 69, NO. 4, OCTOBER 1996 261

On Systems of Linear Diophantine Equations
FELIX LAZEBNIK
University of Delaware
Newark, DE 19716

Introduction. Something happened to me recently I would wager has happened to many who read this note. Teaching a new topic, you cannot understand one of the proofs. Your first attempt to fill the gap fails. You look through your books for an answer. Next, you ask colleagues, go to the library, maybe even use the interlibrary loan. All in vain. Then it strikes you that, in fact, you cannot answer an even more basic and seemingly more interesting question. You peruse the books again. They seem to have answers to thousands of strange questions, but not to yours (the most natural one!). At the same time you cannot believe that your question could have been overlooked by generations of mathematicians. Days pass; the agony continues.

Then one day, some way or other, you find the answer. In my case the answer was in a book I already owned. It followed from a theorem I had known for a long time, but I had never thought of this particular application. I must admit, indeed, that this theorem appeared in almost every book I had checked, but never with a pointer to this particular application, even as an exercise. Were the authors unaware of the application? Or did it seem too obvious to mention? In any case, here is the story.
In my graduate combinatorics course, a proof of the existence of a design was based on the following question: Given a system of linear equations $Ax = b$, where $A = (a_{ij})$ is an $m \times n$ matrix with integer entries, and $b$ is an $m \times 1$ column vector with integer components, does the system have an integer solution, i.e. an $n \times 1$ solution vector $x$ with integer components? The suggested method ([7], Th. 15.6.5) makes use of "a well-known theorem of van der Waerden":

THEOREM (van der Waerden). An integer solution of the system exists if and only if, for every row vector $v$ with rational components such that $vA$ has integer components, $vb$ is an integer.

I had never seen this theorem, and I was surprised that such a criterion could be useful (which it was!). In trying to prove the theorem, I realized that I did not know any good method for resolving a more basic question:

How can one tell whether a system of linear diophantine equations has a solution? If solutions exist, how can one find any or all of them? (*)

I could not find this question in any of at least 30 modern texts on abstract algebra or number theory. The place I found it at last was the classical text of van der Waerden [14, Exercise 12.3]. Not for the first time this book contained an answer that I could not find in more recent sources. Why hadn't I started with it? (Interestingly, the book contains very few exercises, but this one was there.)
The theory behind the solution is closely related to the famous structure theorem for finitely generated abelian groups, or, more generally, for finitely generated modules over principal ideal domains. Various proofs can be found in many books on abstract algebra, e.g., see [8]. We present a matrix version of the theorem. Let $\mathbb{Z}$ denote the ring of integers, $M_{m,n}(\mathbb{Z})$, $1 \le m \le n$, the set of all integer $m \times n$

This content downloaded from 143.207.2.50 on Mon, 29 Jul 2013 02:22:10 AM


All use subject to JSTOR Terms and Conditions
262 MATHEMATICS MAGAZINE

matrices, $SL_k(\mathbb{Z})$ the set of all square $k \times k$ matrices with integer entries and determinant $1$ or $-1$ (unimodular matrices). By $D = \mathrm{diag}(d_1, d_2, \ldots, d_m) \in M_{m,n}(\mathbb{Z})$ we denote the diagonal matrix that has an integer $d_i$ in the $(i, i)$ entry, $i = 1, \ldots, m$, and zeros elsewhere. Then we have:

THEOREM 1. Let $A \in M_{m,n}(\mathbb{Z})$. There exist $L \in SL_m(\mathbb{Z})$ and $R \in SL_n(\mathbb{Z})$ such that

$$LAR = D = \mathrm{diag}(d_1, d_2, \ldots, d_s, 0, \ldots, 0),$$

where $d_i > 0$, $i = 1, \ldots, s$, and $d_i \mid d_{i+1}$, $i = 1, \ldots, s - 1$.
A proof can be found, e.g., in [14] or [8]. The idea is to use elementary operations of rows and columns of $A$. Matrices $L$ and $R$ correspond to compositions of these operations. Though matrices $L$ and $R$ in Theorem 1 may vary, the matrix $D$ is uniquely defined by $A$ and it is called the Smith normal form of $A$.
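The idea of reaching $D$ by elementary row and column operations can be sketched in code. The following is a minimal illustration, assuming matrices are plain Python lists of integer rows; the names `identity`, `matmul` and `smith_normal_form` are hypothetical helpers, not taken from the paper. It applies the operations directly and makes no attempt to control the size of intermediate integers, so it is not one of the polynomial-time algorithms of [11] or [3].

```python
def identity(k):
    """k x k identity matrix as a list of rows."""
    return [[int(i == j) for j in range(k)] for i in range(k)]


def matmul(X, Y):
    """Product of two integer matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def smith_normal_form(A):
    """Return (L, D, R) with L*A*R == D, D diagonal, d_i > 0 and d_i | d_(i+1)."""
    m, n = len(A), len(A[0])
    D = [list(row) for row in A]
    L, R = identity(m), identity(n)

    def swap_rows(i, j):
        D[i], D[j] = D[j], D[i]
        L[i], L[j] = L[j], L[i]

    def add_row(i, j, q):          # row_i += q * row_j (recorded in L)
        D[i] = [a + q * b for a, b in zip(D[i], D[j])]
        L[i] = [a + q * b for a, b in zip(L[i], L[j])]

    def swap_cols(i, j):
        for M in (D, R):
            for row in M:
                row[i], row[j] = row[j], row[i]

    def add_col(i, j, q):          # col_i += q * col_j (recorded in R)
        for M in (D, R):
            for row in M:
                row[i] += q * row[j]

    for t in range(min(m, n)):
        while True:
            # Nonzero entry of least absolute value in the trailing block.
            entries = [(abs(D[i][j]), i, j)
                       for i in range(t, m) for j in range(t, n) if D[i][j]]
            if not entries:
                return L, D, R     # the rest of D is already zero
            _, i, j = min(entries)
            swap_rows(t, i)
            swap_cols(t, j)
            p = D[t][t]
            for i in range(t + 1, m):
                add_row(i, t, -(D[i][t] // p))
            for j in range(t + 1, n):
                add_col(j, t, -(D[t][j] // p))
            if any(D[i][t] for i in range(t + 1, m)) or any(D[t][t + 1:]):
                continue           # a smaller remainder appeared; pick a new pivot
            bad = [i for i in range(t + 1, m)
                   if any(x % p for x in D[i][t + 1:])]
            if not bad:
                break              # pivot divides every remaining entry
            add_row(t, bad[0], 1)  # bring a non-multiple into the pivot row
        if D[t][t] < 0:            # make the diagonal entry positive
            D[t] = [-x for x in D[t]]
            L[t] = [-x for x in L[t]]
    return L, D, R


A = [[2, 0], [0, 3]]
L, D, R = smith_normal_form(A)
print(D)                               # [[1, 0], [0, 6]]
print(matmul(matmul(L, A), R) == D)    # True
```

The pivot is always chosen as a nonzero entry of least absolute value, so each pass either finishes the current diagonal entry or strictly decreases the pivot, which is why the loop terminates. The matrices $L$ and $R$ returned are one valid choice among many; only $D$ is determined by $A$.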
Let us note immediately that Theorem 1 can be used to answer question (*). Given $Ax = b$, rewrite it as $Dy = c$ with $Ry = x$, $LAR = D$ and $c = Lb$. But the solution to the diagonal system $Dy = c$ is easy. More details and a numerical example are given in the Applications section of this paper.

The question of finding an efficient algorithm for computing the Smith normal form of an integer matrix is not trivial. It is not clear that the direct application of elementary operations of rows and columns leads to a polynomial-time algorithm: it is conceivable that the integers get too large. For more details, see [11] and [3].

Some history. Theorem 1 has an interesting history: Question (*) seems not to have been asked, in full generality, until the mid-19th century. Its particular cases appeared in 1849-1850 in some number-theoretical studies of Hermite [10, p. 164, p. 265]. In 1858, Heger [9] formulated conditions for the solvability of $Ax = b$ in the case where $A$ has full rank (i.e., $m$) over $\mathbb{Z}$. In 1861, the problem was solved in full generality by H. J. S. Smith [12]. Theorem 1 appeared in a form close to the one above in an 1878 treatise by Frobenius [5] who generalized Heger's theorem [5, pp. 171-173], and emphasized the unimodularity of the transformations [5, pp. 194-196].

By then many important results on abelian groups had been discovered. Introduced by Gauss, the concept of an abelian group was developed both in number-theoretical studies of Gauss, Schering, Kronecker, and Dirichlet, and in the studies of elliptic functions and abelian integrals of Gauss, Abel, and Jacobi. Not until 1879 did Frobenius and Stickelberger [6] discover and use explicitly the connection between the theory of finitely generated abelian groups and Smith's theorem. In the same year, Frobenius showed that Smith's theory (extended to matrices over polynomial rings) could be used to classify square matrices over fields, up to similarity. (For further history, see [4] and the Historical Notes in [2].) The story reminds us, in particular, that many basic notions and facts of linear algebra (including module theory) were developed within the context of number theory.

Applications. Our first application is related to question (*). It also contains a proof of the aforementioned theorem of van der Waerden. Let $\mathbb{Q}$ denote the field of rational numbers.
PROPOSITION 2. Let $A$, $L$, $R$, $D$ be as in Theorem 1, $b \in \mathbb{Z}^m$ and $c = Lb$. Then the following four statements are equivalent:
(1) The system of linear equations $Ax = b$ has an integer solution.
(2) The system of linear equations $Dy = c$ has an integer solution.
(3) For every rational vector $u$ such that $uA$ is an integer vector, the number $ub$ is an integer.
(4) For every rational vector $v$ such that $vD$ is an integer vector, the number $vc$ is an integer.
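Proof. (1) $\Leftrightarrow$ (2): Indeed, $Ax = b \Leftrightarrow (L^{-1}DR^{-1})x = b \Leftrightarrow D(R^{-1}x) = c \Leftrightarrow Dy = c$, where $y = R^{-1}x$. Since $R \in SL_n(\mathbb{Z})$, then $R^{-1} \in SL_n(\mathbb{Z})$. Therefore $x \in \mathbb{Z}^n \Leftrightarrow y = R^{-1}x \in \mathbb{Z}^n$.

(3) $\Leftrightarrow$ (4): Indeed, $vD \in \mathbb{Z}^n \Leftrightarrow v(LAR) \in \mathbb{Z}^n \Leftrightarrow (vL)AR \in \mathbb{Z}^n \Leftrightarrow (vL)A \in \mathbb{Z}^n R^{-1} = \mathbb{Z}^n \Leftrightarrow uA \in \mathbb{Z}^n$, where $u = vL$. Since $L \in SL_m(\mathbb{Z})$, then $u \in \mathbb{Q}^m \Leftrightarrow v \in \mathbb{Q}^m$, and, by (3), $ub \in \mathbb{Z}$. But $ub \in \mathbb{Z} \Leftrightarrow (vL)(L^{-1}c) \in \mathbb{Z} \Leftrightarrow vc \in \mathbb{Z}$. Therefore (3) implies (4). Reversing the order of the argument, we get $uA \in \mathbb{Z}^n \Rightarrow vD \in \mathbb{Z}^n$ and $vc \in \mathbb{Z} \Rightarrow ub \in \mathbb{Z}$. Therefore (4) implies (3).

(2) $\Leftrightarrow$ (4): $Dy = c$ implies $v(Dy) = vc$ for every $v \in \mathbb{Q}^m$, hence $(vD)y = vc$. If $vD \in \mathbb{Z}^n$, then $vc \in \mathbb{Z}$. Thus (2) implies (4). In order to prove that (4) implies (2), first we observe that $c = (c_1, \ldots, c_s, 0, \ldots, 0)$. For suppose $c_j \ne 0$, $j > s$. Consider $v = (0, \ldots, 0, 1/(2c_j), 0, \ldots, 0)$ where $1/(2c_j)$ appears in the $j$-th position. Since $vD = 0 \in \mathbb{Z}^n$, then by (4) $vc = 1/2 \in \mathbb{Z}$, and we arrive at a contradiction. Thus $c_j = 0$ for $j > s$. Next, for $i = 1, \ldots, s$, we consider vectors $v_i = (0, \ldots, 0, 1/d_i, 0, \ldots, 0)$, where $1/d_i$ appears in the $i$-th position. Since $v_iD \in \mathbb{Z}^n$, then by (4), $v_ic \in \mathbb{Z}$ and hence $c_i/d_i \in \mathbb{Z}$. Let $y = (y_1, \ldots, y_s, 0, \ldots, 0)$, where $y_i = c_i/d_i$, $i = 1, \ldots, s$. Then $y \in \mathbb{Z}^n$, and $Dy = c$. ∎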

With notations as in Proposition 2, one can reduce the solution of the system $Ax = b$ to a solution of $Dy = c$ by performing elementary transformations (over $\mathbb{Z}$) of rows and columns of matrix $A$ augmented by vector $b$. Matrices $L$ and $R$ can be constructed by multiplying matrices corresponding to these transformations. System $Dy = c$ has a solution if and only if $c_{s+1} = \cdots = c_m = 0$, and $d_i \mid c_i$ for $i = 1, \ldots, s$. A general solution of $Dy = c$ can be given in the form $y = (y_1, \ldots, y_s, t_1, \ldots, t_{n-s})$, where $y_i = c_i/d_i$, $i = 1, \ldots, s$, and $t_1, \ldots, t_{n-s}$ are free integer parameters. Then the general solution of $Ax = b$ is just $Ry$. Clearly, we may assume that each equation is reduced by the greatest common divisor of the coefficients of the variables.
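The solvability test and the general solution just described can be sketched as follows. This is only an illustration under stated assumptions: the diagonal entries $d_1, \ldots, d_s$, the vector $c$ and the matrix $R$ are taken as already computed, and the function names `solve_diagonal` and `general_solution`, as well as the small numerical data, are hypothetical.

```python
def solve_diagonal(d, c):
    """d: positive diagonal entries d_1..d_s; c: right-hand side of length m >= s.

    Returns [c_1/d_1, ..., c_s/d_s] if Dy = c has an integer solution,
    otherwise None.
    """
    s = len(d)
    if any(c[j] != 0 for j in range(s, len(c))):
        return None                     # some c_j with j > s is nonzero
    if any(c[i] % d[i] != 0 for i in range(s)):
        return None                     # some d_i does not divide c_i
    return [c[i] // d[i] for i in range(s)]


def general_solution(R, y_fixed, t):
    """x = R y for y = (y_fixed, t), where t lists the n - s free parameters."""
    y = list(y_fixed) + list(t)
    return [sum(r * v for r, v in zip(row, y)) for row in R]


print(solve_diagonal([2, 6], [4, 18, 0]))   # [2, 3]
print(solve_diagonal([2, 6], [4, 7, 0]))    # None, since 6 does not divide 7
print(solve_diagonal([2, 6], [4, 18, 1]))   # None, since c_3 != 0

R = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]       # an illustrative unimodular matrix
print(general_solution(R, [2, 3], [5]))     # [5, 3, 5]
```

Letting the free parameters in `t` range over all integers produces every integer solution, because $R$ is invertible over $\mathbb{Z}$.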
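EXAMPLE. Solve the system of diophantine equations $Ax = b$, where

$$A = \begin{pmatrix} 2 & 1 & 4 \\ -5 & 2 & 6 \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \quad \text{and} \quad b = \begin{pmatrix} 17 \\ -13 \end{pmatrix}.$$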

Solution. Consider a sequence of elementary transformations of rows and columns of $A$. It is well known that they can be achieved by multiplying $A$ by unimodular matrices. Let us represent the transformation of rows by $2 \times 2$ matrices $L_i$ and the ones of columns by $3 \times 3$ matrices $R_j$, where the lower indices reflect the order of multiplications. We consider the following transformations (matrices):

[Display of the elementary matrices $R_1$, $R_2$, $R_3$, $L_4$, $R_5$, $R_6$; the individual entries are not recoverable from this copy.]


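Let $L = L_4$ and $R = R_1R_2R_3R_5R_6$. Then

$$D = LAR = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 & 4 \\ -5 & 2 & 6 \end{pmatrix} \begin{pmatrix} 0 & 1 & 2 \\ 1 & 18 & 32 \\ 0 & -5 & -9 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$$

and $c = Lb = \begin{pmatrix} 17 \\ -47 \end{pmatrix}$.

Solving $Dy = c$, and taking $x = Ry$, we get

$$x = \begin{pmatrix} 0 & 1 & 2 \\ 1 & 18 & 32 \\ 0 & -5 & -9 \end{pmatrix} \begin{pmatrix} 17 \\ -47 \\ t_1 \end{pmatrix} = \begin{pmatrix} -47 + 2t_1 \\ -829 + 32t_1 \\ 235 - 9t_1 \end{pmatrix}, \quad t_1 \in \mathbb{Z},$$

and the problem is solved.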
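The example can be checked mechanically. The sketch below assumes the values $A = \begin{pmatrix} 2 & 1 & 4 \\ -5 & 2 & 6 \end{pmatrix}$, $b = (17, -13)^T$ and $L = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}$, which are the readings consistent with the matrix $R$ and the final answer of the example; the helper names `matmul` and `solution` are hypothetical.

```python
def matmul(X, Y):
    """Product of two integer matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


A = [[2, 1, 4], [-5, 2, 6]]
b = [[17], [-13]]
L = [[1, 0], [-2, 1]]
R = [[0, 1, 2], [1, 18, 32], [0, -5, -9]]

D = matmul(matmul(L, A), R)   # should be the Smith normal form of A
c = matmul(L, b)              # right-hand side of the diagonal system


def solution(t1):
    """x = R y with y = (17, -47, t1), written as a column vector."""
    return matmul(R, [[17], [-47], [t1]])


print(D)             # [[1, 0, 0], [0, 1, 0]]
print(c)             # [[17], [-47]]
print(solution(0))   # [[-47], [-829], [235]]

# Every value of the free parameter gives a solution of Ax = b.
print(all(matmul(A, solution(t1)) == b for t1 in range(-5, 6)))   # True
```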


Another application is concerned with a special instance of the following fundamental question in number theory. Let $\mathbb{Z}[x_1, \ldots, x_t]$ denote the ring of polynomials in $t$ variables with integral coefficients, and let $F(x) \in \mathbb{Z}[x_1, \ldots, x_t]$. It is clear that if the equation $F(x) = 0$ has an integer solution, then for any integer $n \ge 1$, the congruence $F(x) \equiv 0 \pmod{n}$ has a solution. The converse, in general, is false, even for the case of one variable. A simple counterexample is provided by $F(x) = (2x + 1)(3x + 1)$. To show that $(2x + 1)(3x + 1) \equiv 0 \pmod{n}$ has a solution, write $n$ in the form $n = 2^a 3^b m$ where $\gcd(m, 2) = \gcd(m, 3) = 1$, and $a$ and $b$ are nonnegative integers. Then use the Chinese Remainder Theorem. For more on the relation between congruences and equations see, e.g., [1]. Nevertheless the following is valid.
PROPOSITION 3. Let $A \in M_{m,n}(\mathbb{Z})$, and $b \in \mathbb{Z}^m$. Then the system of linear equations $Ax = b$ has an integer solution if and only if the corresponding system of congruences $Ax \equiv b \pmod{n}$ has a solution for every positive integer $n$.

Proof. Obviously, the first statement implies the second. Suppose the system of congruences has a solution for every positive integer $n$. Let $L$, $R$, $D$, $y$ and $c$ be as in Proposition 2, and let $N \in \mathbb{Z}$ be such that the transition from $Ax = b$ to $Dy = c$ uses integers with absolute values smaller than $N$. Then for every $n \ge N$, $Ax \equiv b \pmod{n} \Leftrightarrow Dy \equiv c \pmod{n} \Leftrightarrow d_iy_i \equiv c_i \pmod{n}$, $i = 1, \ldots, s$. The latter system of congruences is solvable in particular when $n$ is a multiple of $d_s$. Since $d_i \mid d_s$ for every $i$, $1 \le i \le s$, this implies $d_i \mid (d_iy_i - c_i)$, hence $d_i \mid c_i$ for all $i = 1, \ldots, s$. Therefore $Dy = c$ has an integer solution, and so does $Ax = b$. ∎
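For contrast with Proposition 3, the nonlinear counterexample $F(x) = (2x + 1)(3x + 1)$ mentioned before it can be checked numerically. The sketch below is a brute-force check over a finite range only (the bounds 300 and 1000 are arbitrary choices, and the function names are hypothetical); it illustrates the claim but does not replace the Chinese Remainder Theorem argument.

```python
def F(x):
    return (2 * x + 1) * (3 * x + 1)


def solvable_mod(n):
    """True if F(x) = 0 (mod n) has a solution; residues 0..n-1 suffice."""
    return any(F(x) % n == 0 for x in range(n))


# Moduli up to 300 for which the congruence has no solution.
unsolvable = [n for n in range(1, 301) if not solvable_mod(n)]
print(unsolvable)   # []

# The rational roots are -1/2 and -1/3, so no integer is a root.
print(any(F(x) == 0 for x in range(-1000, 1001)))   # False
```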
The following statement allows one easily to compute the index of a subgroup of the additive group $\mathbb{Z}^n$, when the index is finite.

PROPOSITION 4. Let $f: \mathbb{Z}^n \to \mathbb{Z}^n$ be a $\mathbb{Z}$-linear map and $A \in M_{n,n}(\mathbb{Z})$ be its matrix with respect to some choice of bases. Suppose $A$ has rank $n$. Then the index of $f(\mathbb{Z}^n)$ in $\mathbb{Z}^n$ is equal to $|\det A|$.

Proof. By Theorem 1 we can find two unimodular matrices $L$ and $R$ such that $LAR = D = \mathrm{diag}(d_1, d_2, \ldots, d_n)$. Since $A$ is of rank $n$, all $d_i > 0$. Therefore the abelian group $f(\mathbb{Z}^n) \cong d_1\mathbb{Z} \oplus d_2\mathbb{Z} \oplus \cdots \oplus d_n\mathbb{Z}$, and the order of $\mathbb{Z}^n/f(\mathbb{Z}^n)$ is $d_1d_2 \cdots d_n = |\det D|$. Since $L$ and $R$ are unimodular, $|\det D| = |\det A|$. ∎
EXAMPLE. Let $f: \mathbb{Z}^2 \to \mathbb{Z}^2$ be defined by $f((x, y)) = (28x + 38y, 12x + 16y)$. Choosing both bases to be the standard basis of $\mathbb{Z}^2$, we get $A = \begin{pmatrix} 28 & 12 \\ 38 & 16 \end{pmatrix}$. Therefore the index $[\mathbb{Z}^2 : f(\mathbb{Z}^2)]$ is equal to $|\det A| = 8$. The Smith normal form of $A$ is $D = \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}$, hence $f(\mathbb{Z}^2) \cong 2\mathbb{Z} \oplus 4\mathbb{Z}$.
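The numbers in this example can be reproduced with a few lines of arithmetic. The sketch relies on a standard fact that the text does not state: for a nonsingular $2 \times 2$ integer matrix, $d_1$ is the greatest common divisor of the entries and $d_1d_2 = |\det A|$. The variable names are illustrative.

```python
from math import gcd

A = [[28, 12], [38, 16]]

det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
index = abs(det)                 # index of f(Z^2) in Z^2, by Proposition 4

# Assumed fact for 2 x 2 matrices: d1 = gcd of all entries, d1 * d2 = |det A|.
d1 = gcd(gcd(A[0][0], A[0][1]), gcd(A[1][0], A[1][1]))
d2 = index // d1

print(index)      # 8
print(d1, d2)     # 2 4
```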

Our next application is related to Proposition 4. It deals with some basic notions of the geometric number theory. Let $\mathbb{R}$ denote the field of real numbers, and $S = \{s_1, \ldots, s_m\}$ be a linearly independent set of vectors in $\mathbb{R}^n$. The additive subgroup $L = \langle S \rangle$ of $\mathbb{R}^n$ generated by $S$ is called the lattice generated by $S$. A fundamental domain $T = T(S)$ of the lattice $L$ is defined as

$$T = \left\{ \sum_{i=1}^{m} x_i s_i : 0 \le x_i < 1,\ x_i \in \mathbb{R} \right\}.$$

The volume $v(T)$ of $T$ is defined in the usual way, as the square root of the absolute value of the determinant of the $m \times m$ matrix $MM^{T}$, where $M$ is the $m \times n$ matrix whose $i$-th row is the coordinate vector of $s_i$ in the standard basis. Though $T$ itself depends on a particular set of generators of $L$, the volume of $T$ does not!
PROPOSITION 5. Let $S = \{s_1, \ldots, s_m\}$ and $U = \{u_1, \ldots, u_t\}$ be two sets of linearly independent vectors which generate the same lattice $L$. Then $m = t$ and $v(T(S)) = v(T(U))$.

Proof. We leave it to the reader. In case of difficulties, look through [13, pp. 30-33 and pp. 168-169]. ∎
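A small numerical illustration of Proposition 5 follows. It assumes the Gram-determinant form of the volume, $v(T)^2 = \det(MM^{T})$ with the generators as rows of $M$ (for $m = n$ this equals $|\det M|^2$), and it handles only the case of two generators; the vectors and function names are made up for the illustration.

```python
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def volume(s1, s2):
    """Volume of the fundamental domain of the lattice generated by s1, s2.

    Uses the 2 x 2 Gram determinant (s1.s1)(s2.s2) - (s1.s2)^2.
    """
    gram_det = dot(s1, s1) * dot(s2, s2) - dot(s1, s2) ** 2
    return sqrt(gram_det)


S = [(1, 0, 1), (0, 2, 1)]
# U generates the same lattice: u1 = s1 + s2, u2 = s2 (a unimodular change).
U = [(1, 2, 2), (0, 2, 1)]

print(volume(*S))   # 3.0
print(volume(*U))   # 3.0
```

The two generating sets differ by the unimodular matrix $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, which is the situation the proposition describes.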

If one considers $A$ with entries from a field, then by elementary operations of rows and columns, $A$ can be brought to a diagonal form. It is a trivial exercise to check that an elementary row (column) operation preserves the dimensions of both row and column spaces of $A$. Therefore matrices $LAR$ and $A$ have equal dimensions of their row spaces and equal dimensions of their column spaces. Since the dimensions of row space and column space for a diagonal matrix are equal, we have a proof of the following fundamental result.

PROPOSITION 6. The dimension of the row space of a matrix with entries from a field is equal to the dimension of its column space. ∎

Acknowledgement. References [3], [11], and remarks concerning the algorithmic aspects of finding the Smith normal form of an integer matrix were kindly suggested to the author by an anonymous referee. I am also very grateful to Gary Ebert, Todd Powers, David Saunders, Andrew Woldar, the editor, and referees, whose numerous comments substantially improved the original version of this paper.

REFERENCES
1. Z. I. Borevich and I. R. Shafarevich, Number Theory, Academic Press, 1966.
2. N. Bourbaki, Elements of Mathematics: Algebra I, Chapters 1-3, Hermann, Paris, 1974; N. Bourbaki, Elements of Mathematics: Algebra II, Chapters 4-7, Springer-Verlag, Berlin and New York, 1989.
3. T.-W. J. Chou and G. E. Collins, Algorithms for the solution of systems of linear Diophantine equations, SIAM J. Computing 11 (1982), 687-708.
4. L. E. Dickson, History of the Theory of Numbers, Volume 2, G. E. Stechert & Co., New York, 1934.
5. G. Frobenius, Theorie der linearen Formen mit ganzen Coefficienten, Jour. für Math., 86 (1878), 146-208.
6. G. Frobenius und L. Stickelberger, Über Gruppen von vertauschbaren Elementen, J. de Crelle LXXXVI, (1879), 217.
7. M. Hall, Jr., Combinatorial Theory, Second Ed., John Wiley & Sons, New York, 1986.
8. N. Jacobson, Basic Algebra I, W. H. Freeman and Co., San Francisco, 1974.
9. I. Heger, Denkschriften Acad. Wiss. Wien (Math. Nat.), 14 II (1858), 1-122.
10. Ch. Hermite, Œuvres, t. I, Gauthier-Villars, Paris, 1905.
11. R. Kannan and A. Bachem, Polynomial time algorithms to compute Hermite and Smith normal forms of an integer matrix, SIAM J. Computing, 8 (1979), 499-507.
12. H. J. S. Smith, On systems of linear indeterminate equations and congruences, p. 367, in Collected Mathematical Papers, vol. I, 367-409, Oxford, 1894. (= Phil. Trans. London, 151 (1861), 293-326).
13. I. N. Stewart and D. O. Tall, Algebraic Number Theory, Second Edition, Chapman and Hall, New York, 1987.
14. B. L. van der Waerden, Algebra, Volume 2, Frederick Ungar Publishing Co., New York, 1970.

The Golden Ratio is Less Than $\pi^2/6$
JAMES D. HARPER
Central Washington University
Ellensburg, WA 98926

As a mathematics teacher, I am pleased when an example turns out particularly neat and tidy. Occasionally, as a bonus, the example reveals an unexpected relationship. The relationship in the title of this note is not as unexpected or striking as, say, the implications of the Riemann hypothesis. Indeed, a hand-held calculator will convince anyone the statement is true. What is unexpected, beyond the serendipity of discovering this inequality as I worked out an example for a graduate analysis class, is that the proof centers on the Cauchy-Schwarz inequality.

Algebraically, the golden ratio, $\phi$, is the larger root of the equation $x^2 - x = 1$; numerically, $\phi = (1 + \sqrt{5})/2 \approx 1.6180$. The other number in the title is, as Euler discovered, the sum of the squares of the harmonic sequence: $1/1^2 + 1/2^2 + 1/3^2 + \cdots$.
Recall that the space of all square-summable real sequences is an inner product space with the usual "dot product":

$$(x_1, x_2, x_3, \ldots) \cdot (y_1, y_2, y_3, \ldots) = x_1y_1 + x_2y_2 + x_3y_3 + \cdots.$$

The Cauchy-Schwarz inequality guarantees that this inner product exists: For all vectors $X$ and $Y$, $(X \cdot Y)^2 \le \|X\|^2\|Y\|^2$; equality occurs if and only if one vector is a scalar multiple of the other.
My example begins with the harmonic sequence $X = (1, 1/2, 1/3, \ldots)$ and its cousin $Y = (1/2, 1/3, 1/4, \ldots)$. Both sequences are square-summable, with respective sums $S = \pi^2/6$ and $S - 1$. Now, by the Cauchy-Schwarz inequality,

$$\left( \sum_{n=1}^{\infty} \frac{1}{n(n+1)} \right)^{2} < \|X\|^2\|Y\|^2 = S(S - 1).$$

The series on the left is the classic telescoping example, with sum 1. Therefore, $1^2 < S(S - 1)$, and completing the square gives: $5/4 < (S - 1/2)^2$. The desired inequality: $(1 + \sqrt{5})/2 < \pi^2/6$ now follows immediately.

Another surprise is how close these two numbers are to each other; to four decimals, $\phi = 1.6180 < 1.6449 = \pi^2/6$. Although our sequence vectors are not equal, they are "almost" equal in that the limit of the ratio of their terms is 1.
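The steps of the argument can be checked in floating point. This is only a numerical sanity check, assuming double-precision arithmetic is accurate enough at this scale; the cutoff `N` and the variable names are arbitrary choices.

```python
from math import pi, sqrt

phi = (1 + sqrt(5)) / 2          # golden ratio
S = pi ** 2 / 6                  # sum of 1/n^2

# Partial sum of the telescoping series; it equals N / (N + 1) exactly.
N = 1000
partial = sum(1 / (n * (n + 1)) for n in range(1, N + 1))

print(abs(partial - N / (N + 1)) < 1e-9)   # True
print(1 < S * (S - 1))                     # True: Cauchy-Schwarz step
print(5 / 4 < (S - 1 / 2) ** 2)            # True: after completing the square
print(phi < S)                             # True: the inequality in the title
```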
