How To Write Mathematics
2. It enables you to understand your ideas, six months later, when you can’t remember
what you were thinking when you wrote them.
Consequences: ‘It follows that...’, ‘Thus...’, ‘Therefore....’, ‘But then....’, ‘...so...’, etc.
Equalities
Many novices have a bad habit of ‘proving’ an equality by writing the equality first, and
then ‘working backwards’. While this is sometimes an effective problem-solving strategy,
it is not a correct proof. It gives the reader the impression that you are assuming what
you are trying to prove. For example, suppose we want to prove the trigonometric identity
$\sec^2(\theta) = 1 + \tan^2(\theta)$.
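Bad:
\begin{align*}
\sec^2(\theta) &= 1 + \tan^2(\theta) \\
\frac{1}{\cos^2(\theta)} &= 1 + \frac{\sin^2(\theta)}{\cos^2(\theta)} \\
\frac{1}{\cos^2(\theta)} &= \frac{\cos^2(\theta)}{\cos^2(\theta)} + \frac{\sin^2(\theta)}{\cos^2(\theta)} \\
\frac{1}{\cos^2(\theta)} &= \frac{\cos^2(\theta) + \sin^2(\theta)}{\cos^2(\theta)} \\
1 &= \cos^2(\theta) + \sin^2(\theta) \\
1 &= 1
\end{align*}
Good:
\begin{align*}
\sec^2(\theta) &= \frac{1}{\cos^2(\theta)} \\
&= \frac{\cos^2(\theta) + \sin^2(\theta)}{\cos^2(\theta)} \\
&= \frac{\cos^2(\theta)}{\cos^2(\theta)} + \frac{\sin^2(\theta)}{\cos^2(\theta)} \\
&= 1 + \frac{\sin^2(\theta)}{\cos^2(\theta)} \\
&= 1 + \tan^2(\theta)
\end{align*}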
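To see why arriving at a true statement proves nothing, consider the following small illustration (it is not part of the original text, and it assumes the amsmath `align*` environment). The same 'working backwards' layout appears to prove the false statement $-1 = 1$, because one of its steps cannot be reversed.

```latex
% Hypothetical illustration: a 'backwards' argument that starts from a
% false claim and still ends at a true statement.
\begin{align*}
  -1     &= 1     && \text{(the claim, which is false)} \\
  (-1)^2 &= 1^2   && \text{(square both sides)} \\
  1      &= 1     && \text{(true, but this proves nothing)}
\end{align*}
% Each line implies the next, but not conversely: squaring cannot be
% undone, so the true last line does not imply the first line.
```

A backwards chain is only sound if every step is reversible, and in that case it can simply be written forwards, from known facts to the desired equality.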
Modules
A mathematical proof should be broken down into distinct modules, each of which solves
a particular problem or accomplishes a particular goal. These modules are comparable to
subroutines within a computer program.
Each module should begin by clearly stating its goal. Avoid the ‘abracadabra’ writing
style, where you first present a confusing mass of technicalities, and then at the end, explain
what it was all about. For example:
Bad:
bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah
bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah...
Thus we see that all frobnitzes have the type II Siegel property.
Good:
I claim that all frobnitzes have the type II Siegel property. To see this, observe that bleah
bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah
bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah.
A module should have a logical development like a proper English essay: distinct paragraphs,
each developing a distinct idea, and each logically flowing into the next.
Often, a proof has a hierarchical structure, with submodules nested within modules. This
hierarchical structure should be explicitly visible in the page layout. For example:
Bad:
I claim that all frobnitzes have the type II Siegel property. To see this, first we must show
that frobnitzes are spurling. Bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah
bleah bleah bleah bleah bleah bleah bleah. Next I claim that spurling implies the bi-infinite receding
foobaz condition. bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah
bleah bleah bleah bleah bleah bleah bleah. Finally, note that the foobaz condition implies the Siegel
property: bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah
bleah bleah bleah bleah bleah.
Good:
Claim 1: All frobnitzes have the type II Siegel property.
Proof:
    Claim 1.1: Frobnitzes are spurling.
    Proof: Bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah
    bleah bleah bleah bleah bleah bleah bleah bleah. □ [Claim 1.1]
    Claim 1.2: Spurling implies the bi-infinite receding foobaz condition.
    Proof: bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah
    bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah. □ [Claim 1.2]
    Claim 1.3: The foobaz condition implies the Siegel property.
    Proof: bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah bleah
    bleah bleah bleah bleah bleah bleah. □ [Claim 1.3]
From Claims 1.1, 1.2, and 1.3, we conclude that frobnitzes have the Siegel property. □ [Claim 1]
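A hierarchical layout of this kind could be typeset in LaTeX roughly as follows. This is only a sketch, assuming the amsthm package. The environment names `claim` and `subclaim` and the labels are hypothetical choices, and the `quote` environment is just one simple way to indent the nested level.

```latex
\documentclass{article}
\usepackage{amsthm}

% Hypothetical environment names; any names would do.
\newtheorem{claim}{Claim}
\newtheorem{subclaim}{Claim}[claim] % numbered 1.1, 1.2, ... inside Claim 1

\begin{document}

\begin{claim}\label{c:siegel}
All frobnitzes have the type II Siegel property.
\end{claim}

\begin{proof}
We prove three subclaims.
\begin{quote} % indents the nested level so the hierarchy is visible
  \begin{subclaim}\label{c:spurling}
  Frobnitzes are spurling.
  \end{subclaim}
  \begin{proof}[Proof of Claim \ref{c:spurling}]
  Bleah bleah bleah.
  \end{proof}

  \begin{subclaim}\label{c:foobaz}
  Spurling implies the bi-infinite receding foobaz condition.
  \end{subclaim}
  \begin{proof}[Proof of Claim \ref{c:foobaz}]
  Bleah bleah bleah.
  \end{proof}

  \begin{subclaim}\label{c:implies}
  The foobaz condition implies the Siegel property.
  \end{subclaim}
  \begin{proof}[Proof of Claim \ref{c:implies}]
  Bleah bleah bleah.
  \end{proof}
\end{quote}
Claims \ref{c:spurling}, \ref{c:foobaz}, and \ref{c:implies} together
show that frobnitzes have the Siegel property.
\end{proof}

\end{document}
```

Each nested `proof` ends with its own end-of-proof symbol, so the reader can see where every submodule closes. The file needs two LaTeX passes for the `\ref` numbers to resolve.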
Intuition pumps
At the beginning, sketch out the strategy of your proof in broad, intuitive terms. This sketch
need not be rigorous, precise, or even mathematically correct (though you must clearly
acknowledge where you bend the truth). The sketch should involve as little notation as
possible. It should provide the reader with a rough ‘mental framework’, upon which to
‘attach’ the technicalities of the proof. Generally, there are only two reasons why you
would be unwilling to provide such a sketch:
1. You don’t want to reveal your ‘secret insight’, which enabled you to solve the problem.
2. You don’t really understand what you’ve done... you just cobbled together a bunch of
machinery, and it all works, somehow.
Notation
Good notation is crucial for effective communication, and depends upon three
principles: Simplicity, Mimesis, and Consistency.
(A) Notation which is technically correct, but horribly complex and confusing.
(B) Notation which is technically incorrect, but much simpler to read, and whose meaning
is obvious to anyone with half a brain.
In such a situation, you should always choose (B), but you must explicitly point out that
you are using a ‘technically incorrect’ notation. This is sometimes called ‘abusing notation’.
Some common examples:
• If X and Y are two sets (groups, topological spaces, etc.) and f : X → Y is an injection
that embeds X as a subset (subgroup, subspace, etc.) of Y, then we sometimes identify
X with its image f(X), and identify each point x ∈ X with its image f(x) ∈ f(X).
1. Use a specific font to denote objects of a particular type. For example, in linear algebra,
you might denote vectors by bold-faced lower-case letters ($\mathbf{u}, \mathbf{v}, \mathbf{w}, \ldots$), matrices
by bold-faced upper-case letters ($\mathbf{A}, \mathbf{B}, \mathbf{C}, \ldots$), scalars by lower-case roman letters
($r, s, t, \ldots$), and vector (sub)spaces by upper-case ‘blackboard’ font ($\mathbb{U}, \mathbb{V}, \mathbb{W}, \ldots$).
2. Use letters which stand for descriptive words. Hence, ‘f’ stands for ‘function’; ‘n’
means ‘number’; ‘p’ means ‘prime’; ‘S’ stands for ‘set’; ‘G’ stands for ‘group’, etc.
3. Use alphabetically consecutive letters to denote similar objects. For example, if one
function is called ‘f’, then the next two functions could be ‘g’ and ‘h’.
Problem: One alphabetical sequence may run into another. For example, if functions
are f, g, h, . . ., and indexes are i, j, k, . . ., then one can’t represent more than three
different functions.
4. Use letters from the same lexicographical family to denote objects which ‘belong’ together.
For example:
Bad:
For all $r \in [1..n]$ and $q \in [1..m]$, let
\[ A(r, q) = \sum_{i=1}^{k} \sum_{j=1}^{\ell} a_{ij}(r, q). \]
Good:
For all $n \in [1..N]$ and $m \in [1..M]$, let
\[ A(n, m) = \sum_{j=1}^{J} \sum_{k=1}^{K} a_{jk}(n, m). \]
Letters from the same lexicographical family should not be used to excess. For example,
a proof involving vectors $v$, $v^{(1)}$, $v^{(2)}$, $v^{(3)}$, $v'$, $\tilde{v}$, $\tilde{v}^{(1)}$, $\tilde{v}^{(2)}$, $\tilde{v}'$, and $\hat{v}$
can become very confusing; it would be better to use distinct letters $u, v, w, \ldots$.
III. Notational Consistency: The same notational conventions should be used through-
out the text. Decide at the outset what conventions you will use, and stick to them. For
example, if, on page 1, you use ‘f’, ‘g’, and ‘h’ for functions, and ‘U’, ‘V’, and ‘W’ for open
sets, then on page 10, do not suddenly switch to ‘φ’, ‘ψ’, and ‘ξ’ for your functions and ‘X’,
‘Y’, and ‘Z’ for sets.
Whenever possible, conform to the notational conventions established in prior literature.
If everyone else uses ‘φ’ to mean a morphism and ‘K’ to mean its kernel, do not insist on
using ‘f’ and ‘X’ instead, unless you have a very good reason. On the other hand, do not
slavishly adhere to stupid conventions that could clearly be improved.
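In LaTeX, one way to hold yourself to a notational convention is to define a macro for each type of object once, in the preamble, and then use only those macros in the text. The sketch below assumes the amsmath and amssymb packages. The macro names `\vect`, `\mat`, and `\spc` are hypothetical, and the fonts follow the linear-algebra example given earlier.

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}

% Hypothetical macro names: one macro per type of object.
\newcommand{\vect}[1]{\mathbf{#1}} % vectors: bold lower-case
\newcommand{\mat}[1]{\mathbf{#1}}  % matrices: bold upper-case
\newcommand{\spc}[1]{\mathbb{#1}}  % vector (sub)spaces: blackboard font

\begin{document}

Let $\vect{v} \in \spc{V}$, let $r$ be a scalar, and let $\mat{A}$ be a
matrix. Set $\vect{w} = r\,\mat{A}\vect{v}$.

\end{document}
```

Because the font choice lives in a single definition, changing a convention later (say, to match the prior literature) means editing one line rather than hunting through every formula.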
Pictures
A common misconception is that pictures are ‘unmathematical’ or ‘nonrigorous’, and should
be replaced by symbolism whenever possible. In fact, the opposite is true. Most mathe-
matics is motivated by visual intuitions, and most mathematicians think visually. A page
of symbolism is an extremely inadequate substitute for a few good pictures. Math books
should be filled with illustrations, at least one per page. They aren’t, because publishers
are cheap.
[Figure 1: four panels, (A) to (D), described below.]
Pictures can never replace a clear written exposition; a proof still needs words. But
pictures can always enhance the clarity of the exposition. Some areas of mathematics (e.g.,
calculus, linear algebra, geometry, topology) are explicitly geometric in nature, and the value
of pictures is obvious. But pictures are also useful in ‘abstract’ mathematics...
• ...to depict relationships between sets. For example, Figure 1A shows how x ∈ A ∩ B,
C ⊂ B \ A, and A ∪ B ⊂ X. This is called a Venn diagram.
• ...to depict functions. For example, Figure 1B shows that f is a function from A into
B, with image C. Also, f(a₁) = b₁, and f(a₂) = b₂.
• ...to depict algebraic structure, by treating groups, rings, etc. as metaphorical ‘vector
spaces’. For example, Figure 1C shows that φ : G → H is a homomorphism with
kernel K. Here, φ(g) = h, and the preimage of h is the coset gK.
• ...to depict networks of functions between spaces. For example, Figure 1D depicts
functions φ : A₁ → A₂, ψ₁ : A₁ → B₁, φ̃ : B₁ → B₂, ψ₂ : A₂ → B₂, and χ :
A₁ → B₂, such that ψ₂ ◦ φ = χ = φ̃ ◦ ψ₁. This is called a commuting diagram.
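A commuting diagram like the one described for Figure 1D could be drawn in LaTeX along the following lines. This is a sketch only, assuming the tikz-cd package. The square layout with χ on the diagonal is a guess at the arrangement, since the original figure is not reproduced here.

```latex
\documentclass{article}
\usepackage{tikz-cd}

\begin{document}

% The five functions listed for Figure 1D, arranged as a square
% with chi on the diagonal (assumed layout).
\[
\begin{tikzcd}
A_1 \arrow[r, "\varphi"] \arrow[d, "\psi_1"'] \arrow[dr, "\chi"]
  & A_2 \arrow[d, "\psi_2"] \\
B_1 \arrow[r, "\tilde{\varphi}"']
  & B_2
\end{tikzcd}
\]

\end{document}
```

Reading the two paths from $A_1$ to $B_2$ off the picture gives $\psi_2 \circ \varphi = \chi = \tilde{\varphi} \circ \psi_1$ at a glance, which is the advantage the text claims for pictures over symbolism.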