Discrete Mathematics - Balakrishnan and Viswanathan
2 Combinatorics 47
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.2 Elementary Counting Ideas . . . . . . . . . . . . . . . . . . . . 48
2.2.1 Sum Rule . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.2.2 Product Rule . . . . . . . . . . . . . . . . . . . . . . . 49
2.3 Combinations and Permutations . . . . . . . . . . . . . . . . . 51
2.4 Stirling’s Formula . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.5 Examples in simple combinatorial reasoning . . . . . . . . . . 54
2.6 The Pigeon-Hole Principle . . . . . . . . . . . . . . . . . . . . 59
2.7 More Enumerations . . . . . . . . . . . . . . . . . . . . . . . . 62
2.7.1 Enumerating permutations with constrained repetitions 64
2.8 Ordered and Unordered Partitions . . . . . . . . . . . . . . . . 65
2.8.1 Enumerating the ordered partitions of a set . . . . . . 65
2.9 Combinatorial Identities . . . . . . . . . . . . . . . . . . . . . 68
2.10 The Binomial and the Multinomial Theorems . . . . . . . . . 71
8 Cryptography 392
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
8.2 Some Classical Cryptosystem . . . . . . . . . . . . . . . . . . 393
8.2.1 Caesar Cryptosystem . . . . . . . . . . . . . . . . . . . 393
8.2.2 Affine Cryptosystem . . . . . . . . . . . . . . . . . . . 394
8.2.3 Private Key Cryptosystems . . . . . . . . . . . . . . . 396
8.2.4 Hacking an affine cryptosystem . . . . . . . . . . . . . 396
8.3 Encryption Using Matrices . . . . . . . . . . . . . . . . . . . . 399
8.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
8.5 Other Private Key Cryptosystems . . . . . . . . . . . . . . . . 402
8.5.1 Vigenere Cipher . . . . . . . . . . . . . . . . . . . . . . 402
8.5.2 The One-Time Pad . . . . . . . . . . . . . . . . . . . . 403
8.6 Public Key Cryptography . . . . . . . . . . . . . . . . . . . . 404
8.6.1 Working of Public Key Cryptosystems . . . . . . . . . 405
8.6.2 RSA Public Key Cryptosystem . . . . . . . . . . . . . 406
8.6.3 The ElGamal Public Key Cryptosystem . . . . . . . . 409
8.6.4 Description of ElGamal System . . . . . . . . . . . . . 410
8.7 Primality Testing . . . . . . . . . . . . . . . . . . . . . . . . . 411
8.7.1 Nontrivial Square Roots (mod n) . . . . . . . . . . . . 411
8.7.2 Prime Number Theorem . . . . . . . . . . . . . . . . . 412
8.7.3 Pseudoprimality Testing . . . . . . . . . . . . . . . . . 413
8.7.4 The Miller-Rabin Primality Testing Algorithm . . . . . 414
1.1 Introduction
In this chapter, we recall some of the basic facts about sets, functions, relations and lattices. We are sure that the reader is already familiar with most of these topics, which are usually taught in high school algebra, with the exception of lattices. We also assume that the reader is familiar with the basics of real and complex numbers.
If A and B are sets and A ⊆ B (that is A is a subset of B, and A may
be equal to B), then the complement of A in B is the set B \ A consisting of
all elements of B not belonging to A. The sets A and B are equal if A ⊆ B
and B ⊆ A.
Definition 1.1.1:
By a family of sets we mean an indexed collection of sets.
Chapter 1 Introduction: Sets, Functions and Relations 2
For instance, F = {Aα}α∈I is a family of sets. Here for each α ∈ I, there exists a set Aα of F. Assume that each Aα is a subset of a set X. Such a set X certainly exists since we can take X = ∪α∈I Aα. For each α ∈ I, denote by A′α the complement X \ Aα of Aα in X. We then have the celebrated laws of de Morgan:
(i) (∪α∈I Aα)′ = ∩α∈I A′α;
(ii) (∩α∈I Aα)′ = ∪α∈I A′α.
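De Morgan's laws are easy to verify on a small concrete family using Python's built-in sets; a minimal sketch (the ambient set X and the family below are made up purely for illustration):

```python
# A concrete check of de Morgan's laws for an indexed family of subsets of X.
X = set(range(1, 13))
family = [{1, 2, 3, 4}, {3, 4, 5, 6}, {4, 6, 8, 10}]

def complement(s):
    """Complement of s in the ambient set X."""
    return X - s

union = set().union(*family)
intersection = set.intersection(*family)

# (i)  (union of the A's)' = intersection of their complements
assert complement(union) == set.intersection(*[complement(a) for a in family])
# (ii) (intersection of the A's)' = union of their complements
assert complement(intersection) == set().union(*[complement(a) for a in family])
```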
Definition 1.1.3:
A family {Aα}α∈I is called a disjoint-family of sets if whenever α ∈ I, β ∈ I and α ≠ β, we have Aα ∩ Aβ = φ.
For instance, if A1 = {1, 2}, A2 = {3, 4} and A3 = {5, 6, 7} then {Aα }α∈I ,
I = {1, 2, 3} is a disjoint-family of sets.
1.2 Functions
Definition 1.2.1:
A function (also called a map or mapping or a single-valued function) f :
A → B from a set A to a set B is a rule by which to each a ∈ A, there is
assigned a unique element f (a) ∈ B. f (a) is called the image of a under f .
Definition 1.2.2:
Two functions f : A → B and g : A → B are called equal if f (a) = g(a) for
each a ∈ A.
Definition 1.2.3:
If E is a subset of A, then the image of E under f : A → B is ∪a∈E {f (a)} = {f (a) : a ∈ E}. It is denoted by f (E).
Definition 1.2.4:
A function f : A → B is one-to-one (or 1–1 or injective) if for a1 and a2 in
A, f (a1 ) = f (a2 ) implies that a1 = a2 .
On the other hand, if A is the set of students of a school, B a set of positive integers, and f (a) denotes the age of the student a, then f is not 1–1 (two students may well have the same age).
Definition 1.2.5:
A function f : A → B is called onto (or surjective) if for each b ∈ B, there
exists at least one a ∈ A with f (a) = b (that is, the image f (A) = B).
For example, let A denote the set of integers Z and B, the set of even integers. If f : A → B is defined by setting f (a) = 2a, then f : A → B is onto. Again, if f : R → R+ (the set of non-negative reals) is defined by f (x) = x², then f is onto but not 1–1.
Definition 1.2.6:
A function f : A → B is bijective (or is a bijection) if it is both 1–1 and
onto.
Definition 1.2.7:
Let f : A → B and g : B → C be functions. The composition of g with f , denoted by g ◦ f , is the function
g ◦ f : A → C defined by (g ◦ f )(a) = g(f (a)) for each a ∈ A.
Definition 1.2.8:
Let f : A → B be a function. For F ⊆ B, the inverse image of F under f ,
denoted by f −1 (F ) is the set of all a ∈ A with f (a) ∈ F . In symbols:
f −1 (F ) = {a ∈ A : f (a) ∈ F }.
For example, let A be the set of students of a school, B the set of standards (grades) of the school, and f (a) the standard in which the student a studies. If F = {1, 2}, that is, if F consists of the 1st and 2nd standards of the school, then f −1 (F ) is the set of students who are either in the 1st standard or in the 2nd standard.
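Computationally, the inverse image just filters the domain; a tiny sketch (the student names and the function f below are invented for illustration):

```python
# f assigns to each student the standard (grade) in which the student studies.
f = {"Asha": 1, "Ravi": 2, "Mala": 2, "John": 5}

def inverse_image(F):
    """f^{-1}(F) = {a : f(a) in F}."""
    return {a for a, b in f.items() if b in F}

assert inverse_image({1, 2}) == {"Asha", "Ravi", "Mala"}
assert inverse_image({7}) == set()   # nothing maps into F
```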
Theorem 1.2.9:
Let f : A → B, and X1 , X2 ⊆ A and Y1 , Y2 ⊆ B. Then the following statements are true:
(i) f (X1 ∪ X2 ) = f (X1 ) ∪ f (X2 );
(ii) f (X1 ∩ X2 ) ⊆ f (X1 ) ∩ f (X2 );
(iii) f −1 (Y1 ∪ Y2 ) = f −1 (Y1 ) ∪ f −1 (Y2 );
(iv) f −1 (Y1 ∩ Y2 ) = f −1 (Y1 ) ∩ f −1 (Y2 ).
Proof. We prove (iv). The proofs of the other statements are similar.
So assume that a ∈ f −1 (Y1 ∩ Y2 ), where a ∈ A. Then f (a) ∈ Y1 ∩ Y2 , and
therefore, f (a) ∈ Y1 and f (a) ∈ Y2 . Hence a ∈ f −1 (Y1 ) and a ∈ f −1 (Y2 ), and
therefore, a ∈ f −1 (Y1 ) ∩ f −1 (Y2 ). The converse is proved just by retracing
the steps.
Note that, in general, we may not have equality in (ii). Here is an example
where equality does not hold good. Let A = {1, 2, 3, 4, 5} and B = {6, 7, 8}.
Let f (1) = f (2) = 6, f (3) = f (4) = 7, and f (5) = 8. Let X1 = {1, 2, 4} and
X2 = {2, 3, 5}. Then X1 ∩ X2 = {2}, and so, f (X1 ∩ X2 ) = {f (2)} = {6}.
However, f (X1 ) = {6, 7}, and f (X2 ) = {6, 7, 8}. Therefore f (X1 ) ∩ f (X2 ) =
{6, 7} ≠ f (X1 ∩ X2 ).
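The counterexample can be replayed directly with Python sets:

```python
# The function f of the counterexample, stored as a dict a -> f(a).
f = {1: 6, 2: 6, 3: 7, 4: 7, 5: 8}

def image(E):
    """f(E) = {f(a) : a in E}."""
    return {f[a] for a in E}

X1, X2 = {1, 2, 4}, {2, 3, 5}
assert image(X1 & X2) == {6}              # f(X1 ∩ X2)
assert image(X1) & image(X2) == {6, 7}    # strictly larger than f(X1 ∩ X2)
```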
We next define a family of elements and a sequence of elements in a set
X.
Definition 1.2.10:
A family {xi }i∈I of elements xi in a set X is a map x : I → X, where for
i ∈ I, x(i) = xi ∈ X. I is the indexing set of the family (In other words, for
each i ∈ I, there is an element xi ∈ X of the family).
Definition 1.2.11:
A sequence {xn }n∈N of elements of X is a map x : N → X. In other words, a
sequence in X is a family in X where the indexing set is the set N of natural
numbers. For example, {2, 4, 6, . . .} is the sequence of even positive integers.
Definition 1.3.1:
The Cartesian product X × Y of two (not necessarily distinct) sets X and
Y is the set of all ordered pairs (x, y), where x ∈ X and y ∈ Y . In symbols:
X × Y = {(x, y) : x ∈ X, y ∈ Y }.
In the ordered pair (x, y), the order of x and y is important, whereas the unordered pairs (x, y) and (y, x) are equal. As ordered pairs, (x, y) and (y, x) are equal if and only if x = y. For instance, the pairs (1, 2) and (2, 1) are not equal as ordered pairs, while they are equal as unordered pairs.
Definition 1.3.2:
A relation R on a set X is a subset of the Cartesian product X × X.
Definition 1.3.3:
A relation R on a set X is an equivalence relation on X if R is (i) reflexive ((a, a) ∈ R for each a ∈ X), (ii) symmetric ((a, b) ∈ R implies (b, a) ∈ R) and (iii) transitive ((a, b) ∈ R and (b, c) ∈ R imply (a, c) ∈ R).
Example 1.3.4: (1) On the set N of positive integers, let aRb mean that a | b (a is a divisor of b). Then R is reflexive and transitive but not symmetric.
Definition 1.3.6:
A partition P of a set X is a collection P of nonvoid subsets of X whose
union is X such that the intersection of any two distinct members of P is
empty.
Theorem 1.3.7:
Any equivalence relation R on a set X induces a partition on X in a natural
way.
Proof. As above, let [x] denote the class defined by x. We show that the classes [x], x ∈ X, define a partition on X. First of all, each x of X belongs to the class [x] since (x, x) ∈ R. Hence
X = ∪x∈X [x].
We now show that if (x, y) ∉ R, then [x] ∩ [y] = φ. Suppose on the contrary that [x] ∩ [y] ≠ φ. Let z ∈ [x] ∩ [y]. This means that z ∈ [x] and z ∈ [y]; hence (z, x) ∈ R and (z, y) ∈ R. By symmetry this means that (x, z) ∈ R and (z, y) ∈ R, and hence by transitivity (x, y) ∈ R, a contradiction. Thus {[x] : x ∈ X} forms a partition of X.
Example 1.3.8:
On the set Z of integers, let aRb mean that a ≡ b (mod 5), that is, 5 divides a − b. Then R is an equivalence relation, and the class [r] consists of all integers leaving the remainder r on division by 5. Note that [5] = [0] and so on. Then the collection {[0], [1], [2], [3], [4]} of equivalence classes forms a partition of Z.
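A finite window into this partition can be checked mechanically; a small sketch:

```python
# Equivalence classes of congruence mod 5, restricted to a finite window of Z.
window = set(range(-10, 11))
classes = {r: {n for n in window if n % 5 == r} for r in range(5)}

# The five classes cover the window ...
assert set().union(*classes.values()) == window
# ... and are pairwise disjoint.
assert all(classes[i].isdisjoint(classes[j])
           for i in range(5) for j in range(5) if i != j)
```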
Definition 1.4.1:
Two sets are called equipotent if there exists a bijection between them.
Equivalently, if A and B are two sets, then A is equipotent to B if there exists a bijection φ : A → B from A onto B.
Let Nn denote the set {1, 2, . . . , n}. Nn is called the initial segment
defined with respect to n.
Definition 1.4.2:
A set S is finite if S is equipotent to Nn for some positive integer n; otherwise
S is called an infinite set.
Theorem 1.4.3:
Let S be a finite set and f : S → S. Then f is 1–1 iff f is onto.
Example 1.4.4:
We show by means of examples that the conclusion of Theorem 1.4.3 may not be true if S is an infinite set.
First, take S = Z, the set of integers, and f : Z → Z defined by f (a) = 2a. Clearly f is 1–1 but not onto (the image of f being the set of even integers).
Next, let R be the set of real numbers, and let f : R → R be defined by
f (x) = x − 1 if x > 0, f (x) = 0 if x = 0, and f (x) = x + 1 if x < 0.
Then f is onto but not 1–1 (for instance, f (1) = f (0) = f (−1) = 0).
Theorem 1.4.5:
The union of any two finite sets is finite.
Proof. First we show that the union of any two disjoint finite sets is finite. Let S and T be two disjoint finite sets of cardinalities n and m respectively. Then S is equipotent to Nn and T equipotent to Nm = {1, 2, . . . , m}. Clearly T is also equipotent to the set {n + 1, n + 2, . . . , n + m}. Hence S ∪ T is equipotent to {1, . . . , n} ∪ {n + 1, . . . , n + m} = Nn+m , and so S ∪ T is also a finite set. In the general case, S ∪ T = S ∪ (T \ S), a union of two disjoint finite sets, and is therefore finite.
Corollary 1.4.6:
The union of any finite number of finite sets is finite.
In this section, we briefly discuss the cardinal numbers of sets. Recall Sec-
tion 1.4.
Definition 1.5.1:
A set A is equipotent to a set B if there exists a bijection f from A onto
B, and that equipotence between members of a collection of sets S is an
equivalence relation on S .
As mentioned before, the sets in the same equivalence class are said to
have the same cardinality or the cardinal number. Intuitively it must be
clear that equipotent sets have the same “number” of elements. The cardinal
number of any finite set is a positive integer, while the cardinal numbers of
infinite sets are denoted by certain symbols. The cardinal number of the
infinite set N (the set of positive integers) is denoted by ℵ0 (aleph nought). ℵ is the first letter of the Hebrew alphabet.
Definition 1.5.2:
A set is called denumerable if it is equipotent to N (equivalently, if it has
cardinal number ℵ0 ). A set is countable if it is finite or denumerable. It
is uncountable if it is not countable (clearly, any uncountable set must be
infinite).
Lemma 1.5.3:
Every infinite set contains a denumerable subset.
Theorem 1.5.4:
A set is infinite iff it is equipotent to a proper subset of itself.
Proof (sketch of the “only if” part). Let X be an infinite set. By Lemma 1.5.3, X contains a denumerable subset D = {x1 , x2 , . . .}. Let x = x1 and Y = X \ {x}. The map φ : X → Y with φ(xi ) = xi+1 for xi ∈ D and φ(y) = y for y ∈ X \ D (so that φ(x1 ) = x2 , φ(x2 ) = x3 and so on) is a 1–1 map of X onto Y , and therefore an equipotence (that is, a bijection). Thus X is equipotent to the proper subset Y = X \ {x} of X.
Notation
Definition 1.5.5:
Let X and Y be any two sets. Then |X| ≤ |Y | iff there exists a 1–1 mapping
from X to Y .
Suppose we have |X| ≤ |Y | and |Y | ≤ |X|. If X and Y are finite sets, it is clear that X and Y have the same number of elements, that is, |X| = |Y |. The same result holds good even if X and Y are infinite sets. This result is known as the Schröder–Bernstein theorem.
Lemma 1.5.7:
Let A be a set and A1 and A2 be subsets of A such that A ⊇ A1 ⊇ A2 . If
|A| = |A2 |, then |A| = |A1 |.
Proof. Since |A| = |A2 |, there is a bijection φ : A → A2 . Put A3 = φ(A1 ) and A4 = φ(A2 ); note that the bijection from A2 to A4 is given by the same map φ. In this way, we get a sequence of sets
A ⊇ A1 ⊇ A2 ⊇ A3 ⊇ . . . (1.3)
A \ A1 = A2 \ A3 ,
A1 \ A2 = A3 \ A4 ,
A2 \ A3 = A4 \ A5 ,
and so on (see Figure 1.1); once again, the bijections are given by the same map φ. Let P = A ∩ A1 ∩ A2 ∩ · · · .
Figure 1.1: The nested sets A ⊇ A1 ⊇ A2 ⊇ A3 , with φ carrying A \ A1 onto A2 \ A3 .
A = (A \ A1 ) ∪ (A1 \ A2 ) ∪ (A2 \ A3 ) ∪ · · · ∪ P, and
A1 = (A1 \ A2 ) ∪ (A2 \ A3 ) ∪ · · · ∪ P.
In the first decomposition, replace each piece A2k \ A2k+1 by the equipotent piece A2k+2 \ A2k+3 (under φ), and leave the remaining pieces fixed; this gives a bijection from A to A1 . Hence |A| = |A1 |.
We recall the definition of the power set of a given set from Section 1.4.
Definition 1.6.1:
The power set P(X) of a set X is the set of all subsets of X.
For example, if X = {1, 2, 3}, then
P(X) = {φ, {1}, {2}, {3}, {1, 2}, {2, 3}, {3, 1}, {1, 2, 3}} (the last subset being X itself).
The empty set φ and the whole set X, being subsets of X, are elements of P(X). Now each subset S of X is uniquely determined by its characteristic function χS : X → {0, 1} defined by
χS (s) = 1 if s ∈ S, and χS (s) = 0 if s ∉ S.
Conversely, given any function f : X → {0, 1}, if we set
S = {x ∈ X : f (x) = 1},
then f = χS .
Definition 1.6.2:
For sets X and Y , denote by Y^X the set of all functions f : X → Y .
Theorem 1.6.3:
|X| < |P(X)| for each nonvoid set X.
Proof. The assertion means that there exists a 1–1 function from X to P(X) but no bijection between X and P(X).
First of all, the mapping f : X → P(X) defined by f (x) = {x} ∈ P(X) is clearly 1–1. Hence |X| ≤ |P(X)|. Next, suppose there exists a 1–1 map from P(X) to X. Then by the Schröder–Bernstein theorem, there exists a bijection g : P(X) → X. This means that for each element S of P(X), the element g(S) of X is defined. Consider the set
T = {g(S) : S ∈ P(X) and g(S) ∉ S} ∈ P(X),
and let t = g(T ). If t ∈ T , then t = g(S) for some S with g(S) ∉ S; since g is 1–1, S = T , and so t ∉ T . If t ∉ T , then g(T ) ∉ T , and so t ∈ T by the definition of T . Either way we have a contradiction. Hence no such bijection exists, and |X| < |P(X)|.
1.7 Exercises
(i) ∪n∈N Mn ; (ii) Mn ∩ Mm ; (iii) ∩n∈N Mn ; (iv) ∪p a prime Mp .
Prove:
9. Does there exist a relation which is not reflexive but both symmetric and
transitive?
10. Let X be the set of all ordered pairs (a, b) of integers with b ≠ 0. Set (a, b) ∼ (c, d) in X iff ad = bc. Prove that ∼ is an equivalence relation on X. What is the class to which (1, 2) belongs?
Definition 1.8.1:
A relation R on a set X is called antisymmetric if, for a, b ∈ X, (a, b) ∈ R
and (b, a) ∈ R together imply that a = b.
For instance, the relation R defined on N, the set of natural numbers, by setting that “(a, b) ∈ R iff a | b (a divides b)” is an antisymmetric relation. However, the same relation defined on Z⋆ = Z \ {0}, the set of nonzero integers, is not antisymmetric. For instance, 5 | (−5) and (−5) | 5 but 5 ≠ −5.
Definition 1.8.2:
A relation R on a set X is called a partial order on X if it is (i) Reflexive,
(ii) Antisymmetric and (iii) Transitive.
A partially ordered set (or poset) is a set with a partial order defined on it.
Examples
For a set S, (P(S), ⊆) is a poset. For S = {1, 2, 3}, its Hasse diagram has φ at the bottom; the singletons {1}, {2}, {3} above it; the doubletons {1, 2}, {1, 3}, {2, 3} above them; and {1, 2, 3} at the top.
Definition 1.8.3:
A partial order “≤” on X is a total order (or linear order) if for any two
elements a and b of X, either a ≤ b or b ≤ a holds.
For instance, if X = {1, 2, 3, 4} and “≤” is the usual “less than or equal to”, then (X, ≤) is a totally ordered set, since any two elements of X are comparable.
(Its Hasse diagram is a chain, with 1 at the bottom and 4 at the top.)
The Hasse diagrams of all lattices with five elements are given in Fig-
ure 1.4.
(Figure 1.4: the Hasse diagrams of the five-element lattices V11 , V12 , V13 , V14 , V24 , V15 , V25 , V35 , V45 , V55 , each with least element 0 and greatest element 1.)
If S has at least two elements, then (P(S), ⊆) is not a totally ordered set. Indeed, if a and b are distinct elements of S, then {a} and {b} are incomparable (under ⊆) elements of P(S).
Definition 1.8.5:
Let (X, ≤) be a poset. An element a ∈ X is a maximal element of X if there is no element b ≠ a in X with a ≤ b; a is the greatest element of X if b ≤ a for every b ∈ X. Minimal elements and the least element are defined dually.
Clearly, the greatest element of a poset is a maximal element and the least element a minimal element.
Example 1.8.6:
Let (X, ⊆) be the poset where X = {{1}, {2}, {1, 2}, {2, 3}, {1, 2, 3}}. In
X, {1}, {2} are minimal elements, {1, 2, 3} is the greatest element (and the
only maximal element) but there is no smallest element.
Definition 1.8.7:
Let (X, ≤) be a poset and Y ⊆ X. An element x ∈ X is an upper bound of Y if y ≤ x for every y ∈ Y ; the least such element, if it exists, is the supremum of Y . Lower bounds and the infimum of Y are defined dually.
Example 1.8.8:
If X = [0, 1] and ≤ stands for the usual ordering in the reals, then 1 is the
supremum of X and 0 is the infimum of X. Instead, if we take X = (0, 1), X
has neither an infimum nor a supremum in X. Here we have taken Y = X.
However, if X = R and Y = (0, 1), then 1 and 0 are the supremum and
infimum of Y respectively. Note that the supremum and infimum of Y ,
namely, 1 and 0, do not belong to Y .
1.9 Lattices
Definition 1.9.1:
A lattice L = (L, ∧, ∨) is a nonempty set L together with two binary oper-
ations ∧ (called meet or intersection or product) and ∨ (called join or union
or sum) that satisfy the following axioms:
For all a, b, c ∈ L,
(L1 ) a ∧ b = b ∧ a; a ∨ b = b ∨ a, (Commutative law)
(L2 ) a ∧ (b ∧ c) = (a ∧ b) ∧ c; a ∨ (b ∨ c) = (a ∨ b) ∨ c, (Associative law)
(L3 ) a ∧ (a ∨ b) = a; a ∨ (a ∧ b) = a. (Absorption law)
The idempotent laws a ∧ a = a and a ∨ a = a follow from the axioms. Now, by (L3 ), a ∨ (a ∧ a) = a, and hence a ∧ a = a ∧ (a ∨ (a ∧ a)) = a, again by (L3 ). Dually, a ∨ a = a.
Theorem 1.9.2:
The relation “a ≤ b iff a ∧ b = a” in a lattice (L, ∧, ∨), defines a partial order
on L.
Proof. Since a ∧ a = a, the relation is reflexive. If a ≤ b and b ≤ a, then a = a ∧ b = b ∧ a = b, so the relation is antisymmetric. For transitivity, let a ≤ b and b ≤ c, that is, a ∧ b = a and b ∧ c = b. Now
a ∧ c = (a ∧ b) ∧ c = a ∧ (b ∧ c) (by (L2 ))
= a ∧ b = a, and hence a ≤ c.
Theorem 1.9.3:
Any partially ordered set (L, ≤) in which any two elements a and b have an infimum and a supremum in L is a lattice under the operations a ∧ b = inf{a, b} and a ∨ b = sup{a, b}.
Examples of Lattices
For a nonvoid set S, (P(S), ∩, ∪) is a lattice. Again, for a positive integer n, define Dn to be the set of divisors of n, and let a ≤ b in Dn mean that a | b, that is, a is a divisor of b. Then a ∧ b = (a, b), the gcd of a and b, and a ∨ b = [a, b], the lcm of a and b, and (Dn , ∧, ∨) is a lattice (see Chapter 2 for the definitions of gcd and lcm). For example, if n = 20, Fig. 1.5 gives the Hasse diagram of the lattice D20 = {1, 2, 4, 5, 10, 20}. It has the least element 1 and the greatest element 20.
(Figure 1.5: the Hasse diagram of D20 = {1, 2, 4, 5, 10, 20}, with least element 1 and greatest element 20.)
Duality Principle
In any lattice (L, ∧, ∨), any formula or statement involving the operations ∧
and ∨ remains valid if we replace ∧ by ∨ and ∨ by ∧.
The statement got by the replacement is called “the dual statement” of
the original statement.
The validity of the duality principle lies in the fact that in the set of
axioms for a lattice, any axiom obtained by such a replacement is also an
axiom. Consequently, whenever we want to establish a statement and its
dual, it is enough to establish one of them. Note that the dual of the dual
statement is the original statement. For instance, the statement
a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c)
has as its dual the statement
a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c).
Definition 1.9.4:
A subset L′ of a lattice L = (L, ∧, ∨) is a sublattice of L if L′ is closed under the operations ∧ and ∨ of L, that is, if (L′ , ∧, ∨) is itself a lattice with the same meet and join (and hence the same partial order a ≤ b iff a ∧ b = a).
For example, let V be a vector space, and consider the lattice (P(V ), ∩, ∪) of all subsets of V . The collection S of all subspaces of V is, in general, not a sublattice, since the union of two subspaces of V need not be a subspace of V .
Lemma 1.9.5:
In any lattice L = (L, ∧, ∨), the operations ∧ and ∨ are isotone, that is, for
a, b, c in L,
if b ≤ c, then a ∧ b ≤ a ∧ c and a ∨ b ≤ a ∨ c.
Proof. Let b ≤ c, so that b ∧ c = b. Then
a ∧ b = a ∧ (b ∧ c) = (a ∧ b) ∧ c (by (L2 ))
≤ a ∧ c (as a ∧ b ≤ a).
The statement for ∨ follows by duality.
Lemma 1.9.6:
Any lattice satisfies the two distributive inequalities: for all x, y, z,
(i) x ∧ (y ∨ z) ≥ (x ∧ y) ∨ (x ∧ z);
(ii) x ∨ (y ∧ z) ≤ (x ∨ y) ∧ (x ∨ z).
Proof. We have x ∧ y ≤ x, and x ∧ y ≤ y ≤ y ∨ z. Hence x ∧ y ≤ inf(x, y ∨ z) = x ∧ (y ∨ z). Also x ∧ z ≤ x, and x ∧ z ≤ z ≤ y ∨ z. Thus x ∧ z ≤ x ∧ (y ∨ z).
Therefore, x ∧ (y ∨ z) is an upper bound for both x ∧ y and x ∧ z and hence
greater than or equal to their least upper bound, namely, (x ∧ y) ∨ (x ∧ z).
The second statement follows by duality.
Lemma 1.9.7:
The elements of a lattice satisfy the modular inequality:
x≤z implies x ∨ (y ∧ z) ≤ (x ∨ y) ∧ z.
Aliter.
By Lemma 1.9.6 x ∨ (y ∧ z) ≤ (x ∨ y) ∧ (x ∨ z)
= (x ∨ y) ∧ z, as x ≤ z.
Two important classes of lattices are the distributive lattices and modular
lattices. We now define them.
Definition 1.9.8:
A lattice L is distributive if for all a, b, c ∈ L,
a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c),
and a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c).
Note that in view of the duality that is valid for lattices, if one of the
two distributive laws holds in L then the other would automatically remain
valid.
Example 1.9.9 (Examples of Distributive Lattices): (i) (P(S), ∩, ∪);
(ii) (N, gcd, lcm). (Here a ∧ b = (a, b), the gcd of a and b, and a ∨ b = [a, b], the lcm of a and b.)
(Figure 1.6: (a) the diamond lattice, with 0 at the bottom, the three incomparable elements a, b, c in the middle, and 1 at the top; (b) the pentagonal lattice, with the chain 0 < a < c < 1 and the element b incomparable to both a and c.)
Neither of these lattices is distributive. For instance, in the pentagonal lattice,
a ∨ (b ∧ c) = a ∨ 0 = a, while
(a ∨ b) ∧ (a ∨ c) = 1 ∧ c = c (≠ a).
Complemented Lattice
Definition 1.9.11:
A lattice L with 0 and 1 is complemented if for each element a ∈ L, there
exists at least one element b ∈ L such that
a ∧ b = 0 and a ∨ b = 1.
(3) Not every lattice with 0 and 1 is complemented. In the lattice of Fig-
ure 1.7 (b), a has no complement.
That the diamond lattice and the pentagonal lattice (of Figure 1.6) are
crucial in the study of distributive lattices is the content of Theorem 1.9.13.
(Figure 1.7: two lattices with 0 and 1: (a) the pentagonal lattice; (b) a lattice in which the element a has no complement.)
Theorem 1.9.13:
A lattice is distributive iff it does not contain a sublattice isomorphic to the
diamond lattice or the pentagonal lattice.
The necessity of the condition in Theorem 1.9.13 is trivial but the proof
of sufficiency is more involved. (For a proof, see ????)
However, a much simpler result is the following:
Theorem 1.9.14:
If a lattice L is distributive then for a, b, c ∈ L, the equations a ∧ b = a ∧ c
and a ∨ b = a ∨ c together imply that b = c.
Proof. b = b ∧ (a ∨ b) (by absorption)
= b ∧ (a ∨ c) (since a ∨ b = a ∨ c)
= (b ∧ a) ∨ (b ∧ c) (by distributivity)
= (a ∧ b) ∨ (b ∧ c) = (a ∧ c) ∨ (b ∧ c) (since a ∧ b = a ∧ c)
= (a ∨ (b ∧ c)) ∧ (c ∨ (b ∧ c)) (again by distributivity)
= (a ∨ c) ∧ c (since a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c) = a ∨ c, and c ∨ (b ∧ c) = c)
= c.
Modular Lattices
Definition 1.9.15:
A lattice is modular if it satisfies the following modular identity:
x ≤ z ⇒ x ∨ (y ∧ z) = (x ∨ y) ∧ z.
Hence the modular lattices are those lattices for which equality holds in
Lemma 1.9.7. We prove in Chapter 5 that the normal subgroups of any
group form a modular lattice.
The pentagonal lattice of Figure 1.7 is nonmodular, since in it a ≤ c, a ∨ (b ∧ c) = a ∨ 0 = a, while (a ∨ b) ∧ c = 1 ∧ c = c (≠ a). In fact, the following result is true.
Theorem 1.9.16:
Any nonmodular lattice L contains the pentagonal lattice as a sublattice.
Proof. Since L is nonmodular, there exist elements a, b, c of L with
a < c and a ∨ (b ∧ c) ≠ (a ∨ b) ∧ c.
But the modular inequality (Lemma 1.9.7) holds in any lattice. Hence a ∨ (b ∧ c) < (a ∨ b) ∧ c.
(Figure 1.8: the pentagonal sublattice constructed in the proof.)
Theorem 1.9.17:
In a distributive lattice L, an element can have at most one complement.
Proof. Let y1 and y2 be complements of an element x of L, so that
x ∧ y1 = 0 = x ∧ y2 , and x ∨ y1 = 1 = x ∨ y2 .
By Theorem 1.9.14 (taking a = x, b = y1 and c = y2 ), y1 = y2 .
1.10 Boolean Algebras
1.10.1 Introduction
Definition 1.10.1:
A complemented distributive lattice is a Boolean algebra. Hence a Boolean algebra B has the universal elements 0 and 1, and every element x of B has a complement x′ ; since B is a distributive lattice, x′ is unique by Theorem 1.9.17. The Boolean algebra B is symbolically represented as (B, ∧, ∨, 0, 1, ′ ).
2. Let B^n denote the set of all binary sequences of length n. For (a1 , . . . , an ) and (b1 , . . . , bn ) ∈ B^n , set
(a1 , . . . , an ) ∧ (b1 , . . . , bn ) = (min(a1 , b1 ), . . . , min(an , bn )),
(a1 , . . . , an ) ∨ (b1 , . . . , bn ) = (max(a1 , b1 ), . . . , max(an , bn )),
and (a1 , . . . , an )′ = (a′1 , . . . , a′n ), where 0′ = 1 and 1′ = 0.
Note that the zero element is the n-vector (0, 0, . . . , 0), and, the unit
element is (1, 1, . . . , 1). For instance, if n = 3, x = (1, 1, 0) and
y = (0, 1, 0), then x ∧ y = (0, 1, 0), x ∨ y = (1, 1, 0), and x′ = (0, 0, 1).
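These componentwise operations are easy to model; a minimal sketch (the helper names are ours):

```python
# Meet, join and complement in the Boolean algebra B^n of binary sequences.
def meet(x, y):
    return tuple(min(a, b) for a, b in zip(x, y))

def join(x, y):
    return tuple(max(a, b) for a, b in zip(x, y))

def comp(x):
    return tuple(1 - a for a in x)

x, y = (1, 1, 0), (0, 1, 0)
assert meet(x, y) == (0, 1, 0)
assert join(x, y) == (1, 1, 0)
assert comp(x) == (0, 0, 1)
# De Morgan's law (x ∨ y)' = x' ∧ y' holds componentwise as well.
assert comp(join(x, y)) == meet(comp(x), comp(y))
```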
Theorem 1.10.2 (the de Morgan laws):
In a Boolean algebra B, for all a, b ∈ B, (a ∧ b)′ = a′ ∨ b′ and (a ∨ b)′ = a′ ∧ b′ .
Proof. To prove the first law, we verify that a′ ∨ b′ is the complement of a ∧ b. Now
(a ∧ b) ∧ (a′ ∨ b′ ) = (a ∧ (a′ ∨ b′ )) ∧ (b ∧ (a′ ∨ b′ ))
= ((a ∧ a′ ) ∨ (a ∧ b′ )) ∧ ((b ∧ a′ ) ∨ (b ∧ b′ ))
= (0 ∨ (a ∧ b′ )) ∧ ((b ∧ a′ ) ∨ 0)
= (a ∧ b′ ) ∧ (b ∧ a′ )
= (a ∧ a′ ) ∧ (b ∧ b′ ) = 0 ∧ 0 = 0 (since a ∧ a′ = 0 = b ∧ b′ ).
Similarly,
(a ∧ b) ∨ (a′ ∨ b′ ) = (a ∨ a′ ∨ b′ ) ∧ (b ∨ a′ ∨ b′ ) = (1 ∨ b′ ) ∧ (1 ∨ a′ ) = 1 ∧ 1 = 1.
The second law follows by duality.
Corollary 1.10.3:
In a Boolean algebra B, for a, b ∈ B, a ≤ b iff a′ ≥ b′ .
Proof. a ≤ b ⇔ a ∨ b = b ⇔ b′ = (a ∨ b)′ = a′ ∧ b′ ⇔ b′ ≤ a′ ⇔ a′ ≥ b′ .
Theorem 1.10.4:
In a Boolean algebra B, we have for all a, b ∈ B,
a ≤ b iff a ∧ b′ = 0 iff a′ ∨ b = 1.
Proof. Suppose a ∧ b′ = 0. Then
a = a ∧ 1 = a ∧ (b ∨ b′ ) = (a ∧ b) ∨ (a ∧ b′ ) = (a ∧ b) ∨ 0
= a ∧ b ⇒ a ≤ b.
Conversely, if a ≤ b, then a ∧ b′ ≤ b ∧ b′ = 0. Finally, by the de Morgan laws, a ∧ b′ = 0 iff (a ∧ b′ )′ = a′ ∨ b = 1.
Boolean Subalgebras
Definition 1.10.5:
A Boolean subalgebra of a Boolean algebra B = (B, ∧, ∨, 0, 1, ′ ) is a subset
B1 of B such that (B1 , ∧, ∨, 0, 1, ′ ) is itself a Boolean algebra with the same
elements 0 and 1 of B.
Boolean Isomorphisms
Definition 1.10.6:
A Boolean homomorphism from a Boolean algebra B1 to a Boolean algebra
B2 is a map f : B1 → B2 such that for all a, b in B1 ,
Theorem 1.10.7:
Let f : B1 → B2 be a Boolean homomorphism. Then
(i) f (0) = 0 and f (1) = 1; (ii) f is isotone.
Proof. Straightforward.
Example 1.10.8:
Let S = {1, 2, . . . , n}, let A be the Boolean algebra (P(S), ∩, ∪, ′ ), and let B be the Boolean algebra defined by the set of all functions from S to the set {0, 1}. Any such function is a sequence (x1 , . . . , xn ) where each xi = 0 or 1. Let ∧, ∨ and ′ be as in Example 2 of Section 1.10.1. Now consider the map f : A = P(S) → B = {0, 1}^S defined as follows: for X ⊆ S (that is, X ∈ P(S) = A), f (X) = (x1 , x2 , . . . , xn ), where xi = 1 or 0 according as i ∈ X or not. For X, Y ∈ P(S), f (X ∩ Y ) is the binary sequence having 1 only in the places common to X and Y , which is f (X) ∧ f (Y ) as per the definitions in Example 2 of Section 1.10.1. Similarly, f (X ∪ Y ) is the binary sequence having 1 in all the places corresponding to the 1’s in the set X ∪ Y , which is f (X) ∨ f (Y ).
Further, f (X ′ ) = f (S \ X) is the binary sequence having 1’s in the places where f (X) has zeros, and zeros in the places where f (X) has 1’s, which is (f (X))′ . f is 1–1 since distinct subsets of S give rise to distinct binary sequences in B. Finally, f is onto, since any binary sequence in B is the image of the corresponding subset of S (namely, the subset of the places of the sequence carrying a 1). Thus f is a Boolean isomorphism.
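The isomorphism of Example 1.10.8 can be sketched for a small S (the function name f follows the text):

```python
# The map f : P(S) -> {0,1}^n sending a subset to its characteristic sequence.
S = [1, 2, 3]

def f(X):
    return tuple(1 if i in X else 0 for i in S)

X, Y = {1, 2}, {2, 3}
# f turns ∩, ∪ and complement into componentwise min, max and bit flip.
assert f(X & Y) == tuple(min(a, b) for a, b in zip(f(X), f(Y)))
assert f(X | Y) == tuple(max(a, b) for a, b in zip(f(X), f(Y)))
assert f(set(S) - X) == tuple(1 - a for a in f(X))
```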
Example 1.10.9:
Let A be a proper Boolean subalgebra of B = P(S). Then if f : A → B is
Definition 1.11.1:
An element a of a lattice L with zero is called an atom of L if a ≠ 0 and, for all b ∈ L, 0 < b ≤ a ⇒ b = a. That is to say, a is an atom if there is no nonzero b strictly less than a.
Definition 1.11.2:
An element a of a lattice L is called join-irreducible if a = b ∨ c implies a = b or a = c; otherwise, a is join-reducible.
Lemma 1.11.3:
Every atom of a lattice with zero is join-irreducible.
Lemma 1.11.4:
Let L be a distributive lattice and c ∈ L be join-irreducible. If c ≤ a ∨ b, then c ≤ a or c ≤ b.
Definition 1.11.5:
(i) Given a ≤ b in a lattice L, the interval [a, b] is the set
[a, b] = {x ∈ L : a ≤ x ≤ b}.
(ii) Let x ∈ [a, b]. x is said to be relatively complemented in [a, b], if x has
a complement y in [a, b], that is, x ∧ y = a and x ∨ y = b. If all intervals
[a, b] of L are complemented, then the lattice L is said to be relatively
complemented.
(iii) If L has a zero element and all elements in [0, b] have complements in L
for every nonzero b in L, then L is said to be sectionally complemented.
Our next theorem is crucial for the proof of the representation theorem
for finite Boolean algebras.
Theorem 1.11.6:
The following statements are true:
(i) Any Boolean algebra is relatively complemented.
Proof. (i) Let [a, b] be an interval in a Boolean algebra B, and x ∈ [a, b]. We have to prove that [a, b] is complemented. Now, as B is a Boolean algebra, it is a complemented lattice, and hence there exists x′ in B such that x ∧ x′ = 0 and x ∨ x′ = 1. Set y = b ∧ (a ∨ x′ ). Then y ∈ [a, b]. Also, y is a complement of x in [a, b], since
x ∧ y = x ∧ (b ∧ (a ∨ x′ )) = x ∧ (a ∨ x′ ) (as x ≤ b)
= (x ∧ a) ∨ (x ∧ x′ ) (as B is distributive)
= (x ∧ a) ∨ 0 = a (as a ≤ x),
and x ∨ y = x ∨ (b ∧ (a ∨ x′ )) = (x ∨ b) ∧ (x ∨ (a ∨ x′ )) (again by distributivity)
= b ∧ ((x ∨ x′ ) ∨ a) = b ∧ (1 ∨ a) = b ∧ 1 = b.
Corollary 1.11.7:
In any finite Boolean algebra, every nonzero element is a join of atoms.
We end this section with the representation theorem for finite Boolean
algebras which says that any finite Boolean algebra may be thought of as the
Boolean algebra P(S) defined on a finite set S.
a ∈ A(b′ ) ⇔ a ≤ b′ ⇔ a ≰ b (since a is an atom) ⇔ a ∉ A(b) ⇔ a ∈ A \ A(b) = (A(b))′ . Thus A(b′ ) = (A(b))′ .
Given b ∈ B, write b = c1 ∨ · · · ∨ ck as a join of atoms (Corollary 1.11.7), and let C = {c1 , . . . , ck }. We show that φ(b) = C, and this would prove that φ is onto. Now ci ≤ b for each i, and so by the definition of φ, φ(b) = {set of atoms c ∈ A with c ≤ b} ⊇ C. Conversely, if a ∈ φ(b), then a is an atom with a ≤ b = c1 ∨ · · · ∨ ck . Therefore a ≤ ci for some i by Lemma 1.11.4. As ci is an atom and a ≠ 0, this means that a = ci ∈ C. Thus φ(b) = C.
1.12 Exercises
1. Draw the Hasse diagrams of all the 15 essentially distinct lattices with six elements.
2. Show that the closed interval [a, b] of reals is a sublattice of the lattice (R, inf, sup).
7. Show that the three lattices of Fig. 1.9 are not distributive.
(Figure 1.9: the three lattices referred to in Exercise 7, each with least element 0 and greatest element 1.)
9. Show that the lattice of all subspaces of a vector space is not distribu-
tive.
10. Which of the following lattices are (i) distributive (ii) modular (iii) mod-
ular, but not distributive? (a) D160 (b) D20 (c) D36 (d) D40 .
Combinatorics
2.1 Introduction
Combinatorics is the science (and to some extent, the art) of counting and enumeration of configurations (it is understood that a configuration arises every time objects are distributed according to certain predetermined constraints). Just as arithmetic deals with integers (with the standard operations), algebra deals with operations in general, analysis deals with functions, geometry deals with rigid shapes and topology deals with continuity, so does combinatorics deal with configurations. The word combinatorial was first used in the modern mathematical sense by Gottfried Wilhelm Leibniz (1646–1716) in his Dissertatio de Arte Combinatoria (Dissertation Concerning the Combinatorial Arts). Reference to “combinatorial analysis” is found in English in 1818 in the title Essays on the Combinatorial Analysis by P. Nicholson (see Jeff Miller’s Earliest known uses of the words of Mathematics, Society for Industrial and Applied Mathematics, U.S.A.). In his book [4], C. Berge points out the following interesting aspects of combinatorics:
2.2 Elementary Counting Ideas
We begin with some simple ideas of counting using the Sum Rule and the Product Rule, and by obtaining permutations and combinations of finite sets of objects.
Example 2.2.1:
Assume that a car registration system allows a registration plate to consist of one, two or three English letters followed by a number (not zero and not starting with zero) having as many digits as there are letters. How many possible registrations are there?
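Under this reading, the Sum and Product Rules give the count directly; a one-line check (assuming a 26-letter alphabet and 9 · 10^(k−1) valid k-digit numbers):

```python
# k letters (26 choices each) followed by a k-digit number with a
# nonzero leading digit (9 * 10**(k-1) choices), summed over k = 1, 2, 3.
total = sum(26 ** k * 9 * 10 ** (k - 1) for k in (1, 2, 3))
print(total)   # 15,879,474 possible registrations
```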
Example 2.2.2:
Consider tossing 100 distinguishable dice. By the Product Rule, it follows that there are 6^100 ways of their falling.
Example 2.2.3:
A self-dual 2-valued Boolean function is one whose definition remains unchanged if we change all the 0’s to 1’s and all the 1’s to 0’s simultaneously. How many such functions in n variables exist?
For instance, the following function of the three variables a, b, c is self-dual:
a b c | f (a, b, c)
0 0 0 |     1
0 0 1 |     1
0 1 0 |     0
0 1 1 |     0
1 0 0 |     1
1 0 1 |     1
1 1 0 |     0
1 1 1 |     0
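The count of self-dual functions is 2^(2^(n−1)), since a self-dual function may be chosen freely on one input of each complementary pair; this can be confirmed by brute force for small n:

```python
from itertools import product

def count_self_dual(n):
    """Count Boolean functions f of n variables with f(x') = f(x)'
    (complementing all inputs complements the output)."""
    inputs = list(product([0, 1], repeat=n))
    count = 0
    # A function is a tuple of output bits, one per input vector.
    for outputs in product([0, 1], repeat=len(inputs)):
        f = dict(zip(inputs, outputs))
        if all(f[tuple(1 - b for b in x)] == 1 - f[x] for x in inputs):
            count += 1
    return count

for n in (1, 2, 3):
    assert count_self_dual(n) == 2 ** (2 ** (n - 1))
```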
2.3 Combinations and Permutations
For example, the 3-combinations of the multiset {3·a, 1·b, 1·c} are a a a, a a b, a a c and a b c. This ignores the ordering of objects. On the other hand, the 3-permutations are a a a, a a b, a b a, b a a, a a c, a c a, c a a, a b c, a c b, b a c, b c a, c a b and c b a.
When we allow unlimited repetitions of objects, we denote the repetition number by α. Consider the 3-combinations possible from {α·a, α·b, α·c, α·d}. There are 20 of them. On the other hand, there are 4³ or 64 3-permutations.
We use the following notations:
P (n, r) = the number of r-permutations of n distinct elements without repetitions;
C(n, r) = the number of r-combinations of n distinct elements without repetitions.
The following are basic results: P (n, r) = n!/(n − r)! and C(n, r) = n!/(r!(n − r)!).
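Both formulas can be cross-checked against direct enumeration; a sketch:

```python
from itertools import combinations, permutations
from math import factorial

def P(n, r):
    """P(n, r) = n!/(n - r)!"""
    return factorial(n) // factorial(n - r)

def C(n, r):
    """C(n, r) = n!/(r!(n - r)!)"""
    return factorial(n) // (factorial(r) * factorial(n - r))

# Compare the formulas with brute-force counts of the actual arrangements.
items = range(7)
assert P(7, 3) == len(list(permutations(items, 3))) == 210
assert C(7, 3) == len(list(combinations(items, 3))) == 35
```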
Example 2.4.1:
The table below compares n! with its Stirling approximation S_n = √(2πn)(n/e)^n:

n      n!              S_n             Percentage error
8      40 320          39 902          1.0357
9      362 880         359 537         0.9213
10     3 628 800       3 598 696       0.8296
11     39 916 800      39 615 625      0.7545
12     479 001 600     475 687 486     0.6919
13     6 227 020 800   6 187 239 475   0.6389
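The table can be reproduced with a few lines of Python (a sketch, assuming the approximation S_n = √(2πn)(n/e)^n used above):

```python
from math import e, pi, sqrt, factorial

def stirling(n):
    # Stirling's approximation S_n = sqrt(2*pi*n) * (n/e)**n
    return sqrt(2 * pi * n) * (n / e) ** n

for n in range(8, 14):
    exact = factorial(n)
    pct = 100 * (exact - stirling(n)) / exact
    print(n, exact, round(stirling(n)), f"{pct:.4f}%")
```

The relative error decreases slowly with n, as the table suggests.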
Example 2.4.2:
Enumerating r-permutations of n objects with unlimited repetitions is easy. We consider r boxes, each to be filled by one of n possibilities. Thus the answer is U(n, r) = n^r.
It is easy to see that C(n, r) can also be expressed as n(n − 1)(n − 2) · · · (n − r + 1)/r!.
The numerator is often denoted by [n]_r, which is a polynomial in n of degree r. Thus we can write, [n]_r = s_r^0 + s_r^1 n + s_r^2 n^2 + · · · + s_r^r n^r.
By definition, the coefficients s_r^k are the Stirling Numbers of the first kind.
s_r^0 = 0, s_r^r = 1, and s_{r+1}^k = s_r^{k−1} − r s_r^k.
Proof. By definition, [x]_{r+1} = [x]_r (x − r). Again by definition, we have from the above equality,
· · · + s_{r+1}^k x^k + · · · = (· · · + s_r^{k−1} x^{k−1} + s_r^k x^k + · · ·)(x − r).
Equating the coefficients of x^k on both the sides gives the required recurrence. From the above relations we can build the following table:

s_r^k    k = 0    1     2     3    4
r = 1      0      1     0     0    0
2          0     −1     1     0    0
3          0      2    −3     1    0
4          0     −6    11    −6    1
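The table can be generated mechanically, using the recurrence obtained by equating coefficients in [x]_{r+1} = [x]_r (x − r) (a small sketch; the function name stirling_first is ours):

```python
def stirling_first(rmax):
    # s[r][k]: coefficient of n**k in [n]_r = n(n-1)...(n-r+1),
    # built from the recurrence s_{r+1}^k = s_r^{k-1} - r * s_r^k
    s = {1: {0: 0, 1: 1}}
    for r in range(1, rmax):
        s[r + 1] = {}
        for k in range(0, r + 2):
            s[r + 1][k] = s[r].get(k - 1, 0) - r * s[r].get(k, 0)
    return s

s = stirling_first(4)
# Row r = 4 of the table above
assert [s[4].get(k, 0) for k in range(5)] == [0, -6, 11, -6, 1]
```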
2.5 Examples in simple combinatorial reasoning
Example 2.5.1:
Show that C(n, r) = C(n, n − r).
The left hand side of the equality denotes the number of ways of choosing
r objects from n objects. Each such choice leaves out (n − r) objects. This
is exactly equivalent to choosing (n − r) objects, leaving out r objects, which
is the right hand side.
Example 2.5.2:
Show that C(n, r) = C(n − 1, r − 1) + C(n − 1, r).
The left hand side is the number of ways of selecting r objects from out
of n objects. To do this, we proceed in a different manner. We mark one of
the n objects as X. In the selected r objects, (a) either X is included or (b)
X is excluded. The two cases (a) and (b) are mutually exclusive and totally
exhaustive. Case (a) is equivalent to selecting (r − 1) objects from (n − 1)
objects while case (b) is equivalent to selecting r objects from (n−1) objects.
Example 2.5.3:
There are a roads from city A to city B, b roads from city B to city C, c
roads from city C to city D, e roads from city A to city C, d roads from
city B to city D and f roads from city A to city D. In how many ways can one travel from city A to city D and come back to city A while visiting at least one of city B and city C at least once? Starting from city A, the different routes leading to city D are shown in the following "tree diagram".
It follows that the total number of ways of going to city D from city A is
(abc + ad + ec + f ). The tree diagram also suggests (from the leaves to the
root) the number of ways of going from city D to city A is (abc + ad + ec + f ).
Therefore, the total number of ways of going from city A to city D and back
is (abc + ad + ec + f )2 . The number of ways of directly going from city A to
city D and back to city A directly is f 2 . Hence the number of ways of going
from city A to city D and back while visiting city B and/or city C at least
once is (abc + ad + ec + f )2 − f 2 .
Figure 2.1: Tree diagram of the routes from city A to city D, with edges labeled a, b, c, d, e and f.
Example 2.5.4:
Show that P (n, r) = r · P (n − 1, r − 1) + P (n − 1, r).
The left hand side is the number of ways of arranging r objects chosen from n objects. This can be counted in the following way. Among the n objects, we mark one object as X. A given arrangement either includes X or does not include X. In the former case, we first arrange (r − 1) objects from among the other (n − 1) objects and then introduce X in any of the r positions. This gives the first term on the right hand side. In the latter case, we simply arrange r objects from the (n − 1) objects excluding X. This gives the second term on the right hand side.
Example 2.5.5:
Count the number of simple undirected graphs with a given set V of n ver-
tices.
Obviously, V contains C(n, 2) = n(n − 1)/2 unordered pairs of vertices.
We may include or exclude each pair as an edge in forming a graph with
vertex set V . Therefore, there are 2C(n,2) simple graphs with vertex set V .
Example 2.5.6:
Let S be a set of 2n distinct objects. A pairing of S is a partition of S into
2-element subsets; that is, a collection of pairwise disjoint 2-element subsets
whose union is S. How many different pairings of S are there?
Method 1: pair off the elements one at a time. The first element can be paired in (2n − 1) ways; the smallest remaining unpaired element in (2n − 3) ways, and so on, giving
(2n − 1) · (2n − 3) · · · 5 · 3 · 1.
Method 2: choose the pairs as unordered 2-subsets and divide by the number of orderings of the n pairs:
(1/n!) [C(2n, 2) · C(2n − 2, 2) · C(2n − 4, 2) · · · C(4, 2) · C(2, 2)].
This expression is the same as the one obtained in Method 1 above.
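For small n the two counts can be confirmed by exhaustive enumeration (a sketch; the helper names are ours):

```python
from math import comb, factorial, prod

def count_pairings(elems):
    # Pair the first remaining element with each possible partner and recurse
    if not elems:
        return 1
    first, rest = elems[0], elems[1:]
    total = 0
    for p in rest:
        remaining = [x for x in rest if x != p]
        total += count_pairings(remaining)
    return total

def double_factorial_odd(n):
    # (2n-1)(2n-3)...5*3*1
    out = 1
    for k in range(1, 2 * n, 2):
        out *= k
    return out

for n in range(1, 5):
    method2 = prod(comb(2 * n - 2 * i, 2) for i in range(n)) // factorial(n)
    assert count_pairings(list(range(2 * n))) == double_factorial_odd(n) == method2
```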
Example 2.5.7:
Let S = {1, 2, . . . , (n + 1)}, where n ≥ 2, and let T = {(x, y, z) ∈ S^3 | x < z and y < z}. Show by counting |T| in two different ways that,
Σ_{1≤k≤n} k^2 = C(n + 1, 2) + 2C(n + 1, 3).      (2.2)
Example 2.5.8:
A sequence of (mn + 1) distinct integers u1 , u2 , . . . , umn+1 is given. Show
that the sequence contains either a decreasing subsequence of length greater
than m or an increasing subsequence of length greater than n (this result is
due to P. Erdös and G. Szekeres (1935)).
We present the proof as in [4]. Let l_i(−) be the length of the longest decreasing subsequence with first term u_i, and let l_i(+) be the length of the longest increasing subsequence with first term u_i.
Assume that the result is false. Then u_i → (l_i(−), l_i(+)) defines a mapping of {u_1, u_2, . . . , u_{mn+1}} into the Cartesian product {1, 2, . . . , m} × {1, 2, . . . , n}. This mapping is injective since if i < j,
u_i > u_j ⇒ l_i(−) > l_j(−) ⇒ (l_i(−), l_i(+)) ≠ (l_j(−), l_j(+)),
u_i < u_j ⇒ l_i(+) > l_j(+) ⇒ (l_i(−), l_i(+)) ≠ (l_j(−), l_j(+)).
But an injective mapping of a set of mn + 1 elements into a set of mn elements is impossible, and this contradiction proves the result.
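The theorem is easy to confirm exhaustively for small m and n (a sketch; longest_monotone is our helper, a simple O(n²) dynamic program over distinct integers):

```python
from itertools import permutations

def longest_monotone(seq, increasing=True):
    # Length of the longest strictly monotone subsequence of distinct integers
    best = [1] * len(seq)
    for j in range(len(seq)):
        for i in range(j):
            if (seq[i] < seq[j]) == increasing:
                best[j] = max(best[j], best[i] + 1)
    return max(best) if seq else 0

# Every sequence of mn + 1 distinct integers (here m = n = 2, length 5)
# has an increasing subsequence of length > n or a decreasing one of length > m.
m, n = 2, 2
for p in permutations(range(m * n + 1)):
    assert longest_monotone(p, True) > n or longest_monotone(p, False) > m
```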
2.6 The Pigeon-Hole Principle
The principle is attributed to Dirichlet in the year 1834, although he apparently used the German term Schubfachprinzip. The French term is le principe des tiroirs de Dirichlet, which can be translated as "the principle of the drawers of Dirichlet".
The Pigeon-Hole Principle: If n objects are put into m boxes and n > m (m and n are positive integers), then at least one box contains two or more objects.
A stronger form: If n objects are put into m boxes and n > m, then some box must contain at least ⌈n/m⌉ objects.
Another form: Let k and n be two positive integers. If at least kn + 1 objects are distributed among n boxes, then one of the boxes must contain at least k + 1 objects.
We now illustrate this principle with some examples.
Example 2.6.1:
Show that, among a group of 7 people there must be at least four of the same
sex.
Example 2.6.2:
Given any five points chosen within a square of side length 2 units, prove that there must be two points which are at most √2 units apart.
Subdivide the square into four small squares, each with side of length 1 unit. By the pigeon-hole principle, at least two of the chosen points must be in (or on the boundary of) one small square. But then the distance between these two points cannot exceed the diagonal length √2 of the small square.
Example 2.6.3:
Let A = {a_1, a_2, . . . , a_m} be a set of m positive integers. Show that there exists a nonempty subset B of A such that the sum Σ_{x∈B} x is divisible by m.
Consider the m partial sums a_1, a_1 + a_2, . . . , a_1 + a_2 + · · · + a_m. If any of these sums is exactly divisible by m, then the corresponding set is the required subset B. Therefore, we will assume that none of the above sums is divisible by m. We thus have,
a_1 ≡ r_1 (mod m)
a_1 + a_2 ≡ r_2 (mod m)
a_1 + a_2 + a_3 ≡ r_3 (mod m)
. . .
a_1 + a_2 + · · · + a_m ≡ r_m (mod m),
where each remainder r_i lies in {1, 2, . . . , m − 1}. Since there are m sums but only m − 1 possible remainders, by the pigeon-hole principle two of the sums leave the same remainder r; say, for i < j,
a_1 + a_2 + · · · + a_i ≡ r (mod m) and a_1 + a_2 + · · · + a_j ≡ r (mod m).
Subtracting, a_{i+1} + a_{i+2} + · · · + a_j is divisible by m, and B = {a_{i+1}, . . . , a_j} is the required subset.
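The pigeon-hole argument is constructive, and can be turned into a short prefix-sum procedure (a sketch; the function name divisible_subset is ours):

```python
def divisible_subset(a):
    # Among the prefix sums a1, a1+a2, ..., a1+...+am (plus the empty prefix),
    # two must share a remainder mod m; the elements between them form B.
    m = len(a)
    seen = {0: 0}          # remainder -> index of the prefix achieving it
    total = 0
    for i, x in enumerate(a, start=1):
        total += x
        r = total % m
        if r in seen:
            return a[seen[r]:i]   # consecutive block whose sum is divisible by m
        seen[r] = i

b = divisible_subset([3, 7, 5, 2])
assert b and sum(b) % 4 == 0
```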
Example 2.7.1:
The number of integral solutions of x_1 + x_2 + · · · + x_n = r, x_i > 0, for all admissible values of i, is equal to the number of ways of distributing r similar balls into n numbered bins with at least one ball in each bin. This is equal to C(n − 1 + (r − n), r − n) = C(r − 1, r − n) = C(r − 1, n − 1).
P(n; q_1, q_2, . . . , q_t) = n!/(q_1! · q_2! · · · q_t!) = C(n, q_1) · C(n − q_1, q_2) · · · C(n − q_1 − · · · − q_{t−1}, q_t).
By substituting the formula for each term in the product, the last expression can be simplified to the previous expression.
S = A_1 ∪ A_2 ∪ · · · ∪ A_t, and A_i ∩ A_j = ∅ for i ≠ j.
({a}, {b}, {c, d}) ({b}, {a}, {c, d}) ({a}, {c}, {b, d}) ({c}, {a}, {b, d})
({a}, {d}, {b, c}) ({d}, {a}, {b, c}) ({b}, {c}, {a, d}) ({c}, {b}, {a, d})
({b}, {d}, {a, c}) ({d}, {b}, {a, c}) ({c}, {d}, {a, b}) ({d}, {c}, {a, b})
Here, our concern is in the number of such partitions rather than the actual
list itself.
We see this by choosing the q_1 elements to occupy the first subset in C(n, q_1) ways; the q_2 elements for the second subset in C(n − q_1, q_2) ways, etc. Thus, the number of ordered partitions of type (q_1, q_2, . . . , q_t) is n!/(q_1! q_2! · · · q_t!).
Example 2.8.1:
In the game of bridge, the four players N, E, S and W are seated in a specified order and each is dealt a hand of 13 cards. In how many ways can the 52 cards be dealt to the four players?
We see that the order counts. Therefore, the number of ways is 52!/(13!)4 .
Example 2.8.2:
To show that (n^2)!/(n!)^n is an integer.
Consider a set of n^2 elements, and partition it into ordered partitions with n parts, each of size n. The number of such ordered partitions is (n^2)!/(n!)^n, which therefore has to be an integer.
({a}, {b, c, d}) , ({b}, {a, c, d}) , ({c}, {a, b, d}) , ({d}, {a, b, c}) ,
({a, b}, {c, d}) , ({a, c}, {b, d}) , ({a, d}, {b, c}) .
S_n^1 = S_n^n = 1.
Also, S_{n+1}^k = S_n^{k−1} + kS_n^k, for 1 < k ≤ n.
(i) The (n+1)th object is the sole member of a class: In this case, we simply
form the partitions of the remaining n objects into k − 1 classes and
attach the class containing the sole member. The number of partitions
thus formed is Snk−1 .
(ii) The (n+1)th object is not the sole member of any class: In this case, we
first form the partitions of the remaining n objects into k classes. This
gives Snk partitions. In each such partition we then add the (n + 1)th
object to one of the k classes. We thus get kSnk partitions of the required
type.
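The two cases translate directly into a table-building procedure for the Stirling numbers of the second kind (a sketch; the function name is ours):

```python
def stirling_second(nmax):
    # S[n][k] = number of partitions of an n-set into k nonempty classes,
    # via S_{n+1}^k = S_n^{k-1} + k * S_n^k with S_n^1 = S_n^n = 1
    S = {1: {1: 1}}
    for n in range(1, nmax):
        S[n + 1] = {}
        for k in range(1, n + 2):
            S[n + 1][k] = S[n].get(k - 1, 0) + k * S[n].get(k, 0)
    return S

S = stirling_second(4)
assert S[4][2] == 7          # the seven 2-part partitions of {a, b, c, d} listed earlier
assert S[4][1] == S[4][4] == 1
```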
Newton’s Identity
C(0, 0) = 1
C(1, 0) = 1    C(1, 1) = 1
. . .
Figure 2.2: The first rows of Pascal's triangle.
Diagonal Summation:
Row Summation
2.10 The Binomial and the Multinomial Theorems
Theorem 2.10.1:
Let n be a positive integer. Then for all elements x and y belonging to a commutative ring with unit element (with the usual operations + and ·),
(x + y)^n = C(n, 0)x^n + C(n, 1)x^{n−1}y + C(n, 2)x^{n−2}y^2 + · · · + C(n, r)x^{n−r}y^r + · · · + C(n, n)y^n.
Note: For the definition of a commutative ring, see Chapter 5. For the present, it is enough to think of x and y as real numbers.
The proof follows by expanding the n-fold product (x + y)(x + y) · · · (x + y): a term x^{n−r}y^r arises once for each choice of r factors contributing y, that is, C(n, r) times.
The binomial coefficients (of the type C(n, r)) appearing above occur in Pascal's triangle. For a fixed n, we can obtain the ratio of the (k + 1)-st binomial coefficient of order n to the k-th: C(n, k + 1)/C(n, k) = (n − k)/(k + 1).
This ratio is larger than 1 if k < (n − 1)/2 and is less than 1 if k > (n − 1)/2. Therefore, we can infer that the biggest binomial coefficient must occur in the "middle". We use Stirling's approximation to estimate how big the central binomial coefficient is (for even n):
C(n, n/2) = n!/[(n/2)!]^2 ≈ (n/e)^n √(2nπ) / [(n/2e)^{n/2} √(nπ)]^2 = 2^n √(2/(nπ)).
Corollary 2.10.2:
Using the Binomial Theorem we can get expansions for (1 + x)^n and (1 − x)^n. Setting x = 1 in the expansion of (1 − x)^n shows that the sum of the binomial coefficients with even index equals the sum of those with odd index; let S be the common sum. Then, by the previous identity (see row summation), adding the two series, we get 2S = 2^n, or S = 2^{n−1}. The combinatorial interpretation is easy. If S′ is a set with n elements, then the number of subsets of S′ with an even number of elements is equal to the number of subsets of S′ with an odd number of elements, and each of these counts is equal to 2^{n−1}.
Example 2.10.3:
To show that: 1 · C(n, 1) + 2 · C(n, 2) + 3 · C(n, 3) + · · · + n · C(n, n) = n · 2^{n−1},
for each positive integer n.
Here, the role of the binomial coefficients gets replaced by the “multinomial
coefficients”
P(n; q_1, q_2, . . . , q_t) = n!/(q_1! q_2! · · · q_t!),
where the q_i's are non-negative integers and Σ q_i = n. (Recall that the multinomial coefficients enumerate the ordered partitions of a set of n elements of the type (q_1, q_2, . . . , q_t).)
Example 2.10.4:
By long multiplication we can get, (x_1 + x_2 + x_3)^3 = x_1^3 + x_2^3 + x_3^3 + 3x_1^2 x_2 + 3x_1^2 x_3 + 3x_1 x_2^2 + 3x_1 x_3^2 + 3x_2 x_3^2 + 3x_2^2 x_3 + 6x_1 x_2 x_3.
To get the coefficient of, say, x_2 x_3^2, we choose x_2 from one of the three factors and x_3 from the remaining two. This can be done in C(3, 1) · C(2, 2) = 3 ways; therefore the required coefficient should be 3.
Example 2.10.5:
Find the coefficient of x_1^4 x_2^5 x_3^6 x_4^3 in (x_1 + x_2 + x_3 + x_4)^18.
The product will occur as often as x_1 can be chosen from 4 out of the 18 factors, x_2 from 5 out of the remaining 14 factors, x_3 from 6 out of the remaining 9 factors, and x_4 from the last 3 factors. Therefore the required coefficient is C(18, 4) · C(14, 5) · C(9, 6) · C(3, 3) = 18!/(4! 5! 6! 3!).
To count the number of terms in the above expansion, we note that each term of the form x_1^{q_1} x_2^{q_2} · · · x_t^{q_t} corresponds to a selection of n objects with repetitions from t distinct types. There are C(n + t − 1, n) ways of doing this. This then is the number of terms in the above expansion.
Example 2.10.7:
In (x_1 + x_2 + x_3 + x_4 + x_5)^10, the coefficient of x_1^2 x_3 x_4^3 x_5^4 is 10!/(2! 1! 3! 4!) = 12 600.
There are C(10 + 5 − 1, 10) = C(14, 10) = 1001 terms in the above multinomial expansion.
Corollary 2.10.8:
In the multinomial theorem if we let x_1 = x_2 = · · · = x_t = 1, then for any positive integer t, we have t^n = Σ P(n; q_1, q_2, . . . , q_t), where the summation extends over all sets of non-negative integers q_1, q_2, . . . , q_t with Σ q_i = n.
2.11 The Principle of Inclusion and Exclusion
The Sum Rule stated earlier (see Section 2.2) applies only to disjoint sets. A generalization is the Inclusion-Exclusion Principle, which applies to non-disjoint sets as well.
We first consider the case of two sets. If A and B are finite subsets of
some universe U , then
|A ∪ B| = |A| + |B| − |A ∩ B|
|A ∪ B| = |A ∩ B ′ | + |A ∩ B| + |A′ ∩ B| (2.3)
Also, we have
|A| = |A ∩ B′| + |A ∩ B|.      (2.5)
Example 2.11.1:
From a group of ten professors, in how many ways can a committee of five members be formed so that at least one of professor A or professor B is included?
Theorem 2.11.2:
If A_1, A_2, . . . , A_n are finite subsets of a universal set, then
|A_1 ∪ A_2 ∪ · · · ∪ A_n| = Σ|A_i| − Σ|A_i ∩ A_j| + Σ|A_i ∩ A_j ∩ A_k| − · · · + (−1)^{n+1}|A_1 ∩ A_2 ∩ · · · ∩ A_n|      (2.6)
— the second summation on the right-hand side is taken over all the 2-combinations (i, j) of the integers {1, 2, . . . , n}; the third summation is taken over all the 3-combinations (i, j, k) of the integers {1, 2, . . . , n}, and so on. Thus, for n = 4, there are 4 + C(4, 2) + C(4, 3) + 1 = 2^4 − 1 = 15 terms on the right-hand side. In general there are 2^n − 1 terms.
Proof. The proof by induction is boring! Here we give the proof based on
combinatorial arguments.
We must show that every element of A1 ∪ A2 ∪ · · · ∪ An is counted exactly
once in the right hand side of (2.6). Suppose that an element x ∈ A_1 ∪ A_2 ∪ · · · ∪ A_n is in exactly m (m ≥ 1) of the sets A_1, A_2, . . . , A_n; for definiteness, say
x ∈ A_1, x ∈ A_2, . . . , x ∈ A_m and x ∉ A_{m+1}, . . . , x ∉ A_n.
Then x is counted C(m, 1) times in the first summation of (2.6), C(m, 2) times in the second, C(m, 3) times in the third, and so on; that is, x is counted
C(m, 1) − C(m, 2) + C(m, 3) − · · · + (−1)^{m+1}C(m, m)
number of times. Now, we must show that this last expression is 1. Expanding (1 − 1)^m by the Binomial Theorem we get,
0 = C(m, 0) − C(m, 1) + C(m, 2) − · · · + (−1)^m C(m, m).
Using the fact that C(m, 0) = 1 and transposing all other terms to the left-
hand side of the above equation, we get the required relation.
The Sieve of Eratosthenes
Let A_1, A_2, A_3 and A_4 denote the sets of positive integers not exceeding 1000 that are divisible by 2, 3, 5 and 7 respectively. Then, for instance,
|A_1 ∩ A_2 ∩ A_3 ∩ A_4| = ⌊1000/210⌋ = 4.
Then, |A_1 ∪ A_2 ∪ A_3 ∪ A_4| =
(500 + 333 + 200 + 142) − (166 + 100 + 71 + 66 + 47 + 28) + (33 + 23 + 14 + 9) − 4 = 772.
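A brute-force count confirms the inclusion-exclusion computation (a short sketch):

```python
# Count the integers 1..1000 divisible by at least one of 2, 3, 5, 7
primes = [2, 3, 5, 7]
count = sum(1 for x in range(1, 1001) if any(x % p == 0 for p in primes))
assert count == 772
```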
2.14 Derangements
(1, 2, . . . , k, b_{k+1}, . . . , b_n)
D_n = (n − 1)(D_{n−1} + D_{n−2})
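The recurrence can be checked for small n against a direct enumeration of derangements (a sketch; the function names are ours, and we take D_0 = 1, D_1 = 0):

```python
from itertools import permutations

def derangements_rec(n):
    # D_n = (n - 1)(D_{n-1} + D_{n-2}), with D_0 = 1, D_1 = 0
    D = [1, 0]
    for k in range(2, n + 1):
        D.append((k - 1) * (D[k - 1] + D[k - 2]))
    return D[n]

def derangements_brute(n):
    # Count permutations of (0, ..., n-1) with no fixed point
    return sum(1 for p in permutations(range(n))
               if all(p[i] != i for i in range(n)))

for n in range(1, 7):
    assert derangements_rec(n) == derangements_brute(n)
```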
n = α1 + α2 + · · · + αm and specify α1 ≥ α2 ≥ · · · ≥ αm ≥1
φ(α1 , α2 , . . . , αm ) = (α1 + 1, α2 + 1, . . . , αm + 1, 1, 1, . . . , 1)
The equations (2.7) and (2.8) allow us to compute p(n, m)’s recursively.
For example, the values of p(n, m) for n ≤ 6 and m ≤ 6 are given by the following array:

p(n, m)   m = 1   2   3   4   5   6
n = 1       1     0   0   0   0   0
2           1     1   0   0   0   0
3           1     1   1   0   0   0
4           1     2   1   1   0   0
5           1     2   2   1   1   0
6           1     3   3   2   1   1
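The array can be generated from a standard recurrence for p(n, m), the number of partitions of n into exactly m parts (a sketch; this recurrence may differ in form from equations (2.7)-(2.8) referred to above):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def p(n, m):
    # Partitions of n into exactly m parts: either the smallest part is 1
    # (remove it: p(n-1, m-1)), or every part exceeds 1 (subtract 1 from
    # each part: p(n-m, m)).
    if m <= 0 or n < m:
        return 0
    if m == 1 or m == n:
        return 1
    return p(n - 1, m - 1) + p(n - m, m)

table = [[p(n, m) for m in range(1, 7)] for n in range(1, 7)]
assert table[3] == [1, 2, 1, 1, 0, 0]   # row n = 4 of the array above
assert table[5] == [1, 3, 3, 2, 1, 1]   # row n = 6
```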
5 + 3 + 3 + 1 + 1  →  9 + 3 + 1
(Ferrers-diagram illustration: the partition 5 + 3 + 3 + 1 + 1 of 13 is mapped to the partition 9 + 3 + 1.)
2.16.1 Proposition
2.16.2 Proposition
2.16.3 Proposition
Example 2.17.1:
The number D_n of derangements of the integers (1, 2, . . . , n), as we have seen in section ***, satisfies the recurrence relation D_n − nD_{n−1} = (−1)^n, for n ≥ 2, with D_1 = 0. The easiest way to solve this recurrence relation is to rewrite it as,
D_n/n! − D_{n−1}/(n − 1)! = (−1)^n/n!,
which is easy to solve.
Example 2.17.2:
The sorting problem asks for an algorithm that takes as input a list or an array of n integers and sorts them, that is, arranges them in nondecreasing (or nonincreasing) order. One algorithm, call it procedure Mergesort(n), does this by splitting the given list of n integers into two sublists of ⌊n/2⌋ and ⌈n/2⌉ integers, applying the procedure recursively to sort the sublists, and merging the sorted sublists (note the recursive formulation). If the "time taken" by
procedure Mergesort(n) in terms of the number of comparisons is denoted
by T (n) then T (n) is known to satisfy the following recurrence relation:
T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + n − 1, with T(2) = 1.
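The recurrence is easy to evaluate directly; for powers of two it is known to reduce to the closed form n log₂ n − n + 1 (a sketch, taking T(1) = 0):

```python
from math import log2

def T(n):
    # T(n) = T(floor(n/2)) + T(ceil(n/2)) + n - 1, with T(2) = 1, T(1) = 0
    if n <= 1:
        return 0
    if n == 2:
        return 1
    return T(n // 2) + T(n - n // 2) + n - 1

# For powers of two, T(n) = n*log2(n) - n + 1
for n in [2, 4, 8, 16]:
    assert T(n) == n * int(log2(n)) - n + 1
```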
Example 2.17.3:
Consider the problem of finding the number of binary sequences of length n
which do not contain two consecutive 1’s.
Let wn be the number of such sequences. Let un be the number of such
sequences whose last digit is a 1. Also, let vn be the number of such sequences
whose last digit is a 0. Obviously, wn = un + vn .
Any valid sequence ending in 0 arises by appending 0 to a valid sequence of length n − 1, so v_n = w_{n−1}; a valid sequence ending in 1 must end in 01, so u_n = w_{n−2}. Hence
w_n = w_{n−1} + w_{n−2}.
This equation is the same as that for the Fibonacci numbers, and it can be solved with the initial conditions w_1 = 2 and w_2 = 3.
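A direct enumeration over all binary strings confirms the recurrence and initial conditions (a sketch; helper names are ours):

```python
from itertools import product

def w_brute(n):
    # Count binary sequences of length n with no two consecutive 1's
    return sum(1 for bits in product('01', repeat=n)
               if '11' not in ''.join(bits))

def w_rec(n):
    # w_n = w_{n-1} + w_{n-2}, with w_1 = 2, w_2 = 3
    a, b = 2, 3
    if n == 1:
        return a
    for _ in range(n - 2):
        a, b = b, a + b
    return b

for n in range(1, 10):
    assert w_brute(n) == w_rec(n)
```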
Example 2.17.4:
A circular disk is divided into n sectors. There are p different colors (of
paints) to color the sectors so that no two adjacent sectors get the same
color. We are interested in the number of ways of coloring the sectors.
(Figure: a circular disk divided into n sectors, numbered 1, 2, 3, . . . , n.)
Let un be the number of ways to color the disk in the required manner.
This number clearly depends upon both n and p. We form a recurrence
relation in n using the following reasoning as given in [56]. We construct two
mutually exclusive and exhaustive cases:
(i) The sectors 1 and 3 are colored differently. In this case, removing sector 2 gives a disk of n − 1 sectors, which can be colored in u_{n−1} ways; sector 2 can then be colored in p − 2 ways (any color except those of sectors 1 and 3).
(ii) The sectors 1 and 3 are of the same color. In this case, removing sector 2 gives a disk of n − 2 sectors, as sectors 1 and 3, being of the same color, can be fused into one. Sector 2 can be colored using any of the p − 1 colors (i.e., excluding the common color of sectors 1 and 3). For each coloring of sector 2, we can color the disk of n − 2 sectors in u_{n−2} ways. Thus, we have the following recurrence relation:
un = (p − 2)un−1 + (p − 1)un−2
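The recurrence can be verified against a brute-force count of the circular colorings (a sketch; we assume the natural initial conditions u_2 = p(p − 1) and u_3 = p(p − 1)(p − 2)):

```python
from itertools import product

def colorings_brute(n, p):
    # Count colorings of n sectors in a circle: adjacent sectors (i, i+1 mod n) differ
    return sum(1 for c in product(range(p), repeat=n)
               if all(c[i] != c[(i + 1) % n] for i in range(n)))

def colorings_rec(n, p):
    # u_n = (p-2) u_{n-1} + (p-1) u_{n-2}
    u = {2: p * (p - 1), 3: p * (p - 1) * (p - 2)}
    for k in range(4, n + 1):
        u[k] = (p - 2) * u[k - 1] + (p - 1) * u[k - 2]
    return u[n]

for n in range(2, 8):
    for p in range(3, 6):
        assert colorings_brute(n, p) == colorings_rec(n, p)
```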
The recurrence relations in the above examples can be solved using the tech-
niques described in this section. In general, we can consider equations of the
form,
a_n = f(a_{n−1}, a_{n−2}, . . . , a_{n−i}), where n ≥ i.
In particular, consider a recurrence of the form
c_0 a_n + c_1 a_{n−1} + · · · + c_k a_{n−k} = 0.      (2.9)
The equation (2.9) is linear as it does not contain terms like a_{n−i} · a_{n−j}, a_{n−i}^2 and so on; it is homogeneous as the linear combination of the terms a_{n−i} is equated to zero; it is with constant coefficients because the c_i's are constants.
well-known Fibonacci Sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, . . . is defined by
the linear homogeneous recurrence, fn = fn−1 + fn−2 , when n ≥ 2 with the
initial conditions, f0 = 0 and f1 = 1.
It is easy to see that if f_n and g_n are solutions to (2.9), then so is any linear combination pf_n + qg_n, where p and q are constants.
To solve (2.9), we try a_n = x^n, where x is an unknown constant. If a_n = x^n is substituted in (2.9) we should have,
c_0 x^n + c_1 x^{n−1} + · · · + c_k x^{n−k} = 0.
Dividing by x^{n−k}, the nonzero roots are given by the characteristic equation
p(x) ≡ c_0 x^k + c_1 x^{k−1} + · · · + c_k = 0.
Example 2.18.1:
Consider the Fibonacci Sequence as above. We can write the recurrence
relation as
fn − fn−1 − fn−2 = 0. (2.10)
Example 2.18.2:
Consider the relation,
a_n − 6a_{n−1} + 11a_{n−2} − 6a_{n−3} = 0, for n ≥ 3,
with a_0 = 1, a_1 = 3 and a_2 = 5.
From the given equation we directly write its characteristic equation as,
x3 − 6x2 + 11x − 6 = 0,
Thus the roots of the characteristic equation are 1, 2 and 3, and the solution should be of the form, a_n = A · 1^n + B · 2^n + C · 3^n.
Applying the initial conditions, we get,
a0 = 1 = A + B + C; a1 = 3 = A + 2B + 3C; and a2 = 5 = A + 4B + 9C
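Solving this 3 × 3 system by elimination gives A = −2, B = 4 and C = −1; a quick numerical check (a sketch) confirms both the initial conditions and the recurrence:

```python
def a_closed(n, A=-2, B=4, C=-1):
    # a_n = A*1^n + B*2^n + C*3^n, with constants from the initial conditions
    return A + B * 2**n + C * 3**n

# The initial conditions a_0 = 1, a_1 = 3, a_2 = 5
assert [a_closed(n) for n in range(3)] == [1, 3, 5]
# The recurrence a_n = 6 a_{n-1} - 11 a_{n-2} + 6 a_{n-3}
for n in range(3, 12):
    assert a_closed(n) == 6 * a_closed(n - 1) - 11 * a_closed(n - 2) + 6 * a_closed(n - 3)
```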
p(x) = c_0 x^k + c_1 x^{k−1} + · · · + c_k.
Let r be a multiple root occurring two times; that is, (x − r)2 is a factor of
p(x). We can write,
Example 2.18.3:
Consider the recurrence, a_n − 11a_{n−1} + 39a_{n−2} − 45a_{n−3} = 0, for n ≥ 3, with a_0 = 0, a_1 = 1 and a_2 = 2.
We can write the characteristic equation as
x3 − 11x2 + 39x − 45 = 0,
the roots of which are 3, 3 and 5. Hence, the general solution can be written
as
an = (A + Bn)3n + C5n .
Applying the initial conditions a_0 = 0, a_1 = 1 and a_2 = 2 gives A = 1, B = 1 and C = −1, so that
a_n = (1 + n)3^n − 5^n.
Example 2.19.1:
a_n − 3a_{n−1} = 5^n      (2.11)
We can observe that 3^n and 5^n are not solutions to (2.11)! The reason is that (2.11) implies (2.13), but (2.13) does not imply (2.11), and hence they are not equivalent.
From the original equation we can write a_1 = 3a_0 + 5, where a_0 is the initial condition.
From (2.14) we get, A + B = a_0 and 3A + 5B = a_1 = 3a_0 + 5.
Therefore, we should have, A = a_0 − 5/2 and B = 5/2.
Hence, a_n = [(2a_0 − 5)3^n + 5^{n+1}]/2.
We now try various possible candidate solutions for an , evaluate the left-
hand side above and check to see if it yields the required value for f (n) – the
required f (n) should be (2 − n). We tabulate the work as follows:
We note that row 1 and row 2 are not linearly independent. We also note
that in row 3, we have an = n which gives the correct f (n) but the initial
conditions are not correct. We can subtract row 1 from row 3 which gives
an = (n − 1) with the correct initial conditions. Thus, an = (n − 1) is the
required solution.
Next we consider the following example.
As before, we try various possibilities for an and look for the resulting f (n)
to get a “repertoire” of recurrences. We summarize the work in the following
table:
In the perturbation step, we note that the last term contains the "1/n^2" factor; we therefore reason that it will bring only a small contribution to the recurrence; hence, approximately, a_{n+1} ≈ 2a_n, so that,
P_{n+1} ≤ ∏_{k=1}^{n} [1 + 1/(4k^2)].
k=1
The infinite product corresponding to the right hand side above converges monotonically to
α_0 = ∏_{k=1}^{∞} (1 + 1/(4k^2)) = 1.46505 . . .
2.22 Generating Functions
(ii) The series converges for some z ≠ 0 if and only if the sequence {|a_n|^{1/n}} is bounded. (If this condition is not satisfied, it may still hold for the sequence {a_n/n!}.)
1. If the given infinite sequence is {1, 1, 1, . . .}, then the corresponding generating function A(z) is given by A(z) = 1 + z + z^2 + · · · = 1/(1 − z).
2. If the given infinite sequence is {1, 1/1!, 1/2!, 1/3!, 1/4!, . . .}, then A(z) = e^z.
2.22.1 Convolution
Let A(z) and B(z) be the generating functions of the sequences {a0 , a1 , a2 , . . .}
and {b0 , b1 , b2 , . . .} respectively. The product A(z)B(z) is the series,
(a0 + a1 z + a2 z 2 + . . .) × (b0 + b1 z + b2 z 2 + . . .)
= a0 b0 + (a0 b1 + a1 b0 )z + (a0 b2 + a1 b1 + a2 b0 )z 2 + . . .
It is easily seen that [z^n]A(z)B(z) = Σ_{k=0}^{n} a_k b_{n−k}. Therefore, if we wish to
evaluate any sum that has the general form
c_n = Σ_{k=0}^{n} a_k b_{n−k}      (2.17)
and if the generating functions A(z) and B(z) are known then we have cn =
[z n ]A(z)B(z).
The sequence {cn } is called the convolution of the sequences {an } and
{bn }. In short, we say that the convolution of two sequences corresponds to
the product of the respective generating functions. We illustrate this with an
example.
From the Binomial Theorem, we know that, (1 + z)r is the generating
function of the sequence {C(r, 0), C(r, 1), C(r, 2), . . .}. Thus we have,
(1 + z)^r = Σ_{k≥0} C(r, k)z^k and (1 + z)^s = Σ_{k≥0} C(s, k)z^k.
Example 2.22.1:
By taking the convolution of the sequence {1, 1, 1, . . .} (whose generating function is 1/(1 − z)) with itself we can immediately deduce that 1/(1 − z)^2 is the generating function of the sequence {1, 2, 3, 4, 5, . . .}.
Example 2.22.2:
We can easily see that 1/(1 + z) is the generating function for the sequence {1, −1, 1, −1, . . .}. Therefore 1/[(1 + z)(1 − z)], or 1/(1 − z^2), is the generating function of the sequence {1, 0, 1, 0, . . .}, which is the convolution of the sequences {1, 1, 1, . . .} and {1, −1, 1, −1, . . .}.
which is the generating function for the sequence {c^n g_n}. Thus 1/(1 − cz) is the generating function of the sequence {1, c, c^2, c^3, . . .}.
3. Given G(z) in a closed form we can get G′ (z). Term by term differentiation
(when possible) of the infinite sum of G(z) yields,
Thus G′(z) represents the infinite sequence {g_1, 2g_2, 3g_3, 4g_4, . . .}, i.e., {(n + 1)g_{n+1}}. Thus, with a shift, we have "brought down a factor of n" into the terms of the original sequence {g_n}. Equivalently, zG′(z) is the generating function for {ng_n}.
Thus by integrating G(z) we get the generating function for the sequence
{gn−1 /n}.
Example 2.23.1:
It is required to find the generating function of the sequence {1^2, 2^2, 3^2, . . .}.
We have seen above that 1/(1 − x)^2 is the generating function of the sequence {1, 2, 3, . . .}.
By differentiation, we can see that 2/(1 − x)^3 is the generating function of the sequence {2·1, 3·2, 4·3, . . .}. In this sequence, the term with index k is (k + 2)(k + 1), which can be written as (k + 1)^2 + (k + 1). We want the sequence {a_k} where a_k = (k + 1)^2. By subtracting the generating function for the sequence {1, 2, 3, . . .} from that for the sequence {2·1, 3·2, 4·3, . . .}, we get the required answer as [2/(1 − x)^3] − [1/(1 − x)^2].
where the initial condition a0 = 1 has been used and it is assumed that G(z)
and H(z) are the generating functions of the sequences {a0 , a1 , a2 , . . .} and
{b0 , b1 , b2 , . . .} respectively. In a similar manner, from (2.19) we can obtain,
We multiply both sides of the above equation by z n+2 and sum over all n
obtaining,
Σ_{n≥0} a_{n+2} z^{n+2} − 3z Σ_{n≥0} a_{n+1} z^{n+1} + 2z^2 Σ_{n≥0} a_n z^n = Σ_{n≥0} n z^{n+2}.
If we define G(z) = Σ_{n≥0} a_n z^n, then the above equation can be written as,
(G(z) − z − 1) − 3z(G(z) − 1) + 2z 2 G(z) = z 3 /(1 − z)2 .
Note that in obtaining the above we have used the initial conditions a0 =
a1 = 1 and we have used the fact that the infinite sequence {0, 0, 0, 1, 2, 3, . . .}
has the generating function z 3 /(1 − z)2 . From the above, we get G(z) as,
To get [z n ]G(z) , we first express the right-hand side of G(z) above in terms
of partial fractions. We get,
G(z) = 1/(1 − 2z) + 1/(1 − z)^2 − 1/(1 − z)^3.
Using infinite series expansions of the terms on the right, it is easy to check
that,
[z^n]G(z) = a_n = 2^n − (n^2 + n)/2.
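The closed form extracted from the generating function can be checked against the original recurrence a_{n+2} − 3a_{n+1} + 2a_n = n with a_0 = a_1 = 1 (a small sketch):

```python
def a(n):
    # a_n = 2^n - (n^2 + n)/2, read off from the partial-fraction expansion
    return 2**n - (n * n + n) // 2   # n^2 + n is always even

assert a(0) == a(1) == 1                      # initial conditions
for n in range(0, 12):
    assert a(n + 2) - 3 * a(n + 1) + 2 * a(n) == n   # the recurrence
```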
Example 2.23.2:
Consider the following sequence:
2. Consider the non-linear recurrence, a_n = √(a_{n−1} a_{n−2}), n > 1, with a_1 = 2 and a_0 = 1. We take the logarithm on both sides and set b_n = log a_n. This gives,
b_n = (b_{n−1} + b_{n−2})/2.
By definition, B(z) = b0 + b1 z + b2 z 2 + b3 z 3 + . . ..
For n ≥ 1 the number of binary trees with n vertices can be enumerated
as the number of ordered pairs of the form (B1 , B2 ) where B1 and B2 are
binary trees that together have exactly (n − 1) vertices i.e., if B1 has k
vertices then B2 will have (n − k − 1) vertices where k can take the values
0, 1, 2, 3, . . . , (n − 1). Therefore the number bn of such ordered pairs is given
by,
bn = b0 bn−1 + b1 bn−2 + · · · + bn−1 b0 where n ≥ 1.
B(z) = 1 + z{B(z)}2
If z is such a real number so that the power series B(z) converges then B(z)
will also be a real number; then the above quadratic can be solved to give,
B(z) = [1 + √(1 − 4z)]/(2z)  or  B(z) = [1 − √(1 − 4z)]/(2z).
Since zB(z) = b_0 z + b_1 z^2 + b_2 z^3 + · · · tends to 0 as z → 0, we must choose the negative sign, so that
zB(z) = [1 − (1 − 4z)^{1/2}]/2, or B(z) = [1 − (1 − 4z)^{1/2}]/(2z).
By expanding (1 − 4z)^{1/2} in an infinite series we can get the expansion for B(z) and obtain,
b_n = [z^n]B(z) = (1/(n + 1)) C(2n, n).
The numbers bn are known as Catalan numbers.
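The convolution recurrence and the closed form can be cross-checked directly (a sketch; the function name is ours):

```python
from math import comb

def catalan_conv(nmax):
    # b_n = b_0 b_{n-1} + b_1 b_{n-2} + ... + b_{n-1} b_0, with b_0 = 1
    b = [1]
    for n in range(1, nmax + 1):
        b.append(sum(b[k] * b[n - 1 - k] for k in range(n)))
    return b

b = catalan_conv(8)
for n, bn in enumerate(b):
    assert bn == comb(2 * n, n) // (n + 1)   # closed form b_n = C(2n, n)/(n+1)
```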
Problem 2.24.2:
Average-case analysis of a simple algorithm to find the maximum in a list.
    max := X[0]; i := 0;      { X[0] = −∞, a sentinel }
 1: i := i + 1;
    if i > n then goto 2;
    if X[i] > max then max := X[i];
    goto 1;
 2: - - -
Thus on a sequential computer, a compiled form of FindMax will execute:
We can thus conclude that the time, tmaxfn of FindMax has to be of the
form:
tmaxfn = c0 + c1 n + c2 EXCH[X]
where EXCH[X] is the number of times the instruction “max := X[i]” is
executed ( i.e., the number of “exchanges” that has taken place) and c0 , c1
and c2 are the implementation constants dependent on the machine where
the code runs. We note that EXCH[X] is 1 if X[1] is the largest element. Also,
EXCH[X] takes the maximum value n when the array X is already sorted in
the increasing order.
To get an estimate of the expected value of EXCH[X] we introduce the
“permutation model”. In this model we will assume that the array X is a
permutation of the integers (1, . . . , n). Then each permutation can occur (be
an input to FindMax) with equal probability 1/n!. Let sn,k be the num-
ber of those permutations wherein EXCH[X] is k. Then, if p_{n,k} denotes the probability that EXCH[X] equals k, we have p_{n,k} = s_{n,k}/n!, and the expected value of EXCH[X] is Σ_k k · p_{n,k}.
To get the sum on the right-side above, we consider all those permutations
σ1 σ2 σ3 . . . σn of (1, . . . , n) wherein the value EXCH[X] is exactly k (by defini-
tion, there are exactly sn,k of these). With respect to these types of permu-
tations, we reason that the following two cases can occur:
(a) the last element σn is equal to n: in this case σ1 σ2 . . . σn−1 should have
produced exactly k − 1 exchanges because the last element being the
largest will surely force one more exchange. Thus the number of permu-
tations in this case is sn−1,k−1 .
(b) the last element σn is not equal to n: in this case σn is one of 1, 2, 3, . . . , (n−
1). Then σ1 σ2 . . . σn−1 should have produced exactly k exchanges because
the element (being less than the maximum) will not be able to force an
exchange. In this case the number of permutations is (n − 1)sn−1,k .
Thus we have,
sn,k = sn−1,k−1 + (n − 1)sn−1,k (2.23)
Multiplying both sides of (2.23) by x^k, summing over k from 1 through n, and using the definition (2.24), we get,
From the definition (2.24) we find S1 (x) = x. Then from (2.25) we get
S2 (x) = x(x + 1) etc. In general we find that the explicit form of Sn (x) is
given by
S_n(x) = ∏_{j=0}^{n−1} (x + j)      (2.26)
tmaxfn^{AVG} = c_0 + c_1 n + c_2 H_n,
where H_n = 1 + 1/2 + · · · + 1/n denotes the n-th harmonic number.
Exercises
is always divisible by 2.
Step 4. If all sequences have been generated then stop; else go to Step
1.
3. How many ways are there to choose three or more people from a set of
eleven people?
4. Three boys and four girls are to sit on a bench. The boys must sit
together and the girls must sit together. In how many ways can this
be done?
6. Let X ⊆ {1, 2, 3, 4, . . . , (2n − 1)} and let |X| ≥ (n + 1). Prove the following result (due to P. Erdös):
"There are two numbers a, b ∈ X, with a < b, such that a divides b."
In the above problem, if we prescribe |X| = n, will the above result still be true?
8. Without using the general formula for φ(n) above, reason that when n = p^α (α a natural number) is a prime power, φ(n) can be expressed as p^α(1 − 1/p).
10. For an arbitrary natural number n, prove that Σ_{d|n} φ(d) = n (the sum is over all natural numbers d dividing n).
13. Find the number of ways in which eight rooks may be placed on a
conventional 8 × 8 chessboard so that no rook can attack another and
the white diagonal is free of rooks.
Solve the above recurrence and hence show that fn ≤ en!, for all n > 1.
17. This problem is due to H. Larson (1977). Consider the infinite sequence
an with a1 = 1, a5 = 5, a12 = 144 and an + an+3 = 2an+2 .
Prove that an is the nth Fibonacci number.
where, a0 = 5, a1 = 3, a2 = 6, a3 = −21.
an = 6an−1 − 9an−2 ,
3.1 Introduction
3.2 Divisibility
Definition 3.2.1:
Let a and b be any two integers with a ≠ 0. Then b is divisible by a (equivalently, a is a divisor of b) if there exists an integer c such that b = ac.
115
Chapter 3 Basics of Number Theory 116
The proofs of these results are trivial and are left as exercises. (For
instance, to prove (iii), we note that if a divides b1 , . . . , bn , there exist integers
c_1, c_2, . . . , c_n such that b_1 = ac_1, b_2 = ac_2, . . . , b_n = ac_n. Hence b_1x_1 + · · · + b_nx_n = a(c_1x_1 + · · · + c_nx_n), and so a divides b_1x_1 + · · · + b_nx_n.)
Consider the arithmetic progression . . . , b − 2|a|, b − |a|, b, b + |a|, b + 2|a|, . . .
with common difference |a| and extended infinitely in both the directions.
Certainly, this sequence contains a least non-negative integer r. Let this
term be b + q|a|, q ∈ Z. Thus
b + q|a| = r (3.1)
Theorem 3.2.3 gives the division algorithm, that is, the process by means
of which division of one integer by a nonzero integer is carried out. An
algorithm is a step-by-step procedure to solve a given mathematical problem
in finite time. We next present Euclid’s algorithm to determine the gcd of
two integers a and b. Euclid’s algorithm is the first known algorithm in the
mathematical literature. It is just the usual algorithm taught in high-school
algebra.
3.3 Greatest Common Divisor of Two Integers
Definition 3.3.1:
Let a and b be two integers, at least one of which is not zero. A common
divisor of a and b is an integer c(6= 0) such that c | a and c | b. The greatest
common divisor of a and b is the greatest of the common divisors of a and b.
It is denoted by (a, b).
If c divides a and b, then so does −c. Hence (a, b) > 0 and is uniquely
defined. Moreover, if c is a common divisor of a and b, that is, if c | a and
c | b, then
a = a′ c and b = b′ c
(a, b) = c(a′ , b′ )
so that c | (a, b). Thus any common divisor of a and b divides the gcd of a
and b. Hence (a, b) is the unique positive common divisor of a and b that is
divisible by every common divisor of a and b. Moreover, (a, b) = (±a, ±b).
Proposition 3.3.2:
If c | ab and (c, b) = 1, then c | a.
Definition 3.3.3:
If a, b and c are nonzero integers and if a | c and b | c, then c is called a
common multiple of a and b.
The least common multiple (lcm) of a and b is the smallest of the positive
common multiples of a and b and is denoted by [a, b]. As in the case of gcd,
[a, b] = [±a, ±b].
Euclid’s Algorithm
Since (±a, ±b) = (a, b), we may assume without loss of generality that a > 0
and b > 0 and that a > b (If a = b, then (a, b) = (a, a) = a). By the Division
Algorithm,
a = q 1 b + r1 , 0 ≤ r1 < b. (3.2)
b = q 2 r1 + r2 , 0 ≤ r2 < r1 . (3.3)
r1 = q 3 r2 + r3 , 0 ≤ r3 < r2 . (3.4)
Continuing in this way, let rj be the last nonzero remainder. Then (a, b) = rj .
Since rj | rj−1 , rj divides the expression on the right side of (3.7), and so rj |
rj−2 . Going backward, we get successively that rj divides rj−1 , rj−2 , . . . , r1 , b
and a. Thus rj | a and rj | b.
Next, let c | a and c | b. Then from equation (3.2), c | r1 , and this when
substituted in equation (3.3), gives c | r2 . Thus the successive equations of
Euclid’s algorithm show that c | rj . Hence rj = (a, b).
Theorem 3.3.4:
If rj = (a, b), then it is possible to find integers x and y such that
ax + by = rj (3.8)
Equation (3.9) expresses rj in terms of rj−1 and rj−2 while the equation
preceding it expresses rj−1 in terms of rj−2 and rj−3 . Thus
rj = rj−2 − qj rj−1 = rj−2 − qj (rj−3 − qj−1 rj−2 ) = (1 + qj qj−1 ) rj−2 − qj rj−3 .
Thus we have expressed rj as a linear combination of rj−2 and rj−3 , the coef-
ficients being integers. Working backward, we get rj as a linear combination
of a and b, with the coefficients being integers.
The process given in the proof of Theorem 3.3.4 is known as the Extended
Euclidean Algorithm.
Corollary 3.3.5:
If (a, m) = 1, then there exists an integer u such that au ≡ 1(mod m) and
any two such integers are congruent modulo m.
Example 3.3.6:
Find integers x and y so that 120x + 70y = 1.
We apply Euclid’s algorithm to a = 120 and b = 70. We have
120 = 1 · 70 + 50
70 = 1 · 50 + 20
50 = 2 · 20 + 10
20 = 2 · 10
Now starting from the last but one equation and going backward, we get
10 = 50 − 2 · 20
= 50 − 2(70 − 1 · 50)
= 3 · 50 − 2 · 70
= 3 · (120 − 1 · 70) − 2 · 70
= 3 · 120 − 5 · 70.
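The back-substitution just performed is mechanized by the extended Euclidean algorithm of Theorem 3.3.4. A minimal sketch in Python (the function name is ours):

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return (a, 1, 0)
    g, x, y = extended_gcd(b, a % b)
    # g == b*x + (a % b)*y and a % b == a - (a//b)*b
    return (g, y, x - (a // b) * y)

g, x, y = extended_gcd(120, 70)
print(g, x, y)  # 10 3 -5, matching 10 = 3 · 120 − 5 · 70
```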
3.4 Primes
Definition 3.4.1:
An integer n > 1 is a prime if its only positive divisors are 1 and n. A natural
number greater than 1 which is not a prime is a composite number.
Naturally, 2 is the only even prime. 3, 5, 7, 11, 13, 17, . . . are all odd
primes. The composite numbers are 4, 6, 8, 9, 10, . . .
Theorem 3.4.2:
Every integer n > 1 can be expressed as a product of primes, unless the
number n itself is a prime.
Proof. The result is obvious for n = 2, 3 and 4. So assume that n > 4 and
apply induction. If n is a prime there is nothing to prove. If n is not a prime,
then n = n1 n2 , where 1 < n1 < n and 1 < n2 < n. By induction hypothesis,
both n1 and n2 are products of primes. Hence n itself is a product of primes.
(Note that the prime factors of n need not all be distinct.)
We now show that this factorization is unique in the sense that in any prime
factorization, the prime factors that occur are the same and that the prime
powers are also the same except for the order of the prime factors. For
instance, 200 = 2^3 × 5^2 , and the only other way to write it in the form (3.10)
is 5^2 × 2^3 .
Lemma 3.4.4:
If p is a prime such that p | (ab), but p ∤ a, then p | b.
Note 3.4.5:
Lemma 3.4.4 implies that if p is a prime and p ∤ a and p ∤ b, then p ∤
(ab). More generally, p ∤ a1 , p ∤ a2 , . . . p ∤ an imply that p ∤ (a1 a2 · · · an ).
Consequently if p | (a1 a2 · · · an ), then p must divide at least one ai , 1 ≤ i ≤ n.
n = p1^α1 p2^α2 · · · pr^αr = q1^β1 q2^β2 · · · qj^βj · · · qs^βs (3.11)
are two prime factorizations of n, where the pi ’s and qi ’s are all primes. As
p1 | (p1^α1 · · · pr^αr ), p1 | q1^β1 · · · qs^βs . Hence by Note 3.4.5, p1 must divide some
qj . As p1 and qj are primes and p1 | qj , p1 = qj . Cancelling p1 on both the
sides, we get
p1^(α1−1) p2^α2 · · · pr^αr = q1^β1 q2^β2 · · · qj^(βj−1) · · · qs^βs . (3.12)
p2^α2 · · · pr^αr = q1^β1 q2^β2 · · · qj^(βj−α1) · · · qs^βs . (3.13)
Now p1 divides the right hand expression of (3.13) and so must divide the
left hand expression of (3.13). But this is impossible as the pi ’s are distinct
primes. Hence α1 = βj . Cancellation of p1^α1 on both sides of (3.11) yields
p2^α2 · · · pr^αr = q1^β1 q2^β2 · · · qj−1^βj−1 qj+1^βj+1 · · · qs^βs .
a = 2^3 · 3^2 · 5^0
and b = 2^0 · 3^2 · 5^1 .
Then clearly,
(a, b) = ∏_{i=1}^{r} pi^min(αi, βi) , and
[a, b] = ∏_{i=1}^{r} pi^max(αi, βi) .
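These min/max formulas are easy to check numerically; the sketch below factors both numbers by trial division (helper names are ours):

```python
from collections import Counter

def factorize(n):
    """Prime factorization of n as a Counter {prime: exponent}."""
    f, p = Counter(), 2
    while p * p <= n:
        while n % p == 0:
            f[p] += 1
            n //= p
        p += 1
    if n > 1:
        f[n] += 1
    return f

def gcd_lcm(a, b):
    fa, fb = factorize(a), factorize(b)
    g = l = 1
    for p in set(fa) | set(fb):
        g *= p ** min(fa[p], fb[p])   # min exponent -> gcd
        l *= p ** max(fa[p], fb[p])   # max exponent -> lcm
    return g, l

print(gcd_lcm(8 * 9, 9 * 5))  # a = 2^3·3^2, b = 3^2·5 -> (9, 360)
```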
Proof. The proof is by contradiction. Suppose there are only finitely many
primes, say, p1 , p2 , . . . , pr . Then the number n = 1 + p1 p2 · · · pr is larger
than each pi , 1 ≤ i ≤ r, and hence composite. Now any composite number
is divisible by some prime. But none of the primes pi , 1 ≤ i ≤ r, divides
n. (For, if pi divides n, then pi | 1, an impossibility). Hence the number of
primes is infinite.
Definition 3.4.8:
Two numbers a and b are coprime or relatively prime if they are prime to
each other, that is, if (a, b) = 1.
3.5 Exercises
3. Use the unique factorization theorem to prove that for any two positive
integers a and b, (a, b)[a, b] = ab, and that (a, b) | [a, b]. (Remark:
This shows that if (a, b) = 1, then [a, b] = ab. More generally, if
{a1 , a2 , . . . , ar }, is any set of positive integers, then (a1 , a2 , . . . , ar )
divides [a1 , a2 , . . . , ar ]. Here (a1 , a2 , . . . , ar ) and [a1 , a2 , . . . , ar ] denote
respectively the gcd and lcm of the numbers a1 , a2 , . . . , ar .)
5. Show that there exist infinitely many pairs (x, y) such that x + y = 72
and (x, y) = 9.
Hint: One choice for (x, y) is (63, 9). Take x′ prime to 8, that is,
(x′ , 8) = 1, and take y ′ = 8 − x′ . Now use the pairs (9x′ , 9y ′ ).
6. If a+b = c, show that (a, c) = 1, iff (b, c) = 1. Hence show that any two
consecutive Fibonacci numbers are coprime. (The Fibonacci numbers
9. For a positive integer n, show that there exist integers a and b such
that (a, b) = d and ab = n if and only if d^2 | n.
(Hint: By Exercise 3 above, (a, b)[a, b] = ab = n, and (a, b) | [a, b].
Hence d^2 | n. Conversely, if d^2 | n, then n = d^2 c. Now take a = d and b = dc.)
12. Let pn denote the n-th prime (p1 = 2, p2 = 3 and so on). Prove that
pn < 2^(2^n) .
13. Prove that if an = 2^(2^n) + 1, then (an , an+1 ) = 1 for each n ≥ 1. (Hint:
Set 2^(2^n) = x.)
14. If n = p1^a1 · · · pr^ar is the prime factorization of n, show that d(n), the
number of distinct divisors of n, is (a1 + 1) · · · (ar + 1).
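The divisor-count formula of Exercise 14 can be checked numerically; a sketch using trial-division factoring (the function name is ours):

```python
def num_divisors(n):
    """d(n) = (a1+1)...(ar+1) for n = p1^a1 ... pr^ar."""
    d, p = 1, 2
    while p * p <= n:
        a = 0
        while n % p == 0:
            a += 1
            n //= p
        d *= a + 1
        p += 1
    if n > 1:
        d *= 2  # one leftover prime factor with exponent 1
    return d

# 200 = 2^3 · 5^2 has (3+1)(2+1) = 12 divisors
print(num_divisors(200))  # 12
```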
3.6 Congruences
Definition 3.6.1:
Given integers a, b and n (≠ 0), a is said to be congruent to b modulo n, if
a − b is divisible by n, that is, a − b is a multiple of n. In symbols, it is
denoted by a ≡ b (mod n), and is read as a is congruent to b modulo n. The
number n is the modulus of the congruence.
Definition 3.6.2:
If f (x), g(x) and h(x) (≠ 0) are any three polynomials with real coefficients
then by f (x) ≡ g(x) (mod h(x)), we mean that f (x) − g(x) is divisible by
h(x) over R, that is to say, there exists a polynomial q(x) with real coefficients
such that
f (x) − g(x) = q(x)h(x).
Proof. We prove only 3 and 4; the rest follow immediately from the definition.
Proposition 3.6.4:
If ab ≡ ac (mod m), and (a, m) = 1, then b ≡ c (mod m).
Corollary 3.6.5:
If ab ≡ ac (mod m), then b ≡ c (mod m/d), where d = (a, m).
Proposition 3.6.6:
If (a, m) = (b, m) = 1, then (ab, m) = 1.
Proposition 3.6.7:
If ax ≡ 1 (mod m) and (a, m) = 1, then (x, m) = 1.
Proposition 3.6.8:
If a ≡ b (mod mi ), 1 ≤ i ≤ r, then a ≡ b (mod [m1 , . . . , mr ]), where
[m1 , . . . , mr ] stands for the lcm of m1 , . . . , mr .
Definition 3.7.1:
Given a positive integer m, a set S = {x1 , . . . , xm } of m numbers is called
a complete system of residues modulo m if for any integer x, there exists a
unique xi ∈ S such that x ≡ xi (mod m).
Euler’s φ-Function
Definition 3.7.2:
The Euler function φ(n) (also called the totient function) is defined to be
the number of positive integers less than n and prime to n. It is also the
cardinality of a reduced residue system modulo n.
We have seen earlier that φ(10) = 4. We note that φ(12) is also equal to
4 since 1, 5, 7, 11 are all the numbers less than 12 and prime to 12. If p is a
prime, then all the numbers in {1, 2, . . . , p − 1} are less than p and prime to
p, and hence φ(p) = p − 1.
Proof. Let r1 , . . . , rφ(n) be a reduced residue system modulo n. Now
(ri , n) = 1 for each i, 1 ≤ i ≤ φ(n). Further, as (a, n) = 1, by Proposi-
tion 3.6.6, (ari , n) = 1. Moreover, if i ≠ j, then ari ≢ arj (mod n). For, ari ≡ arj
(mod n) implies (as (a, n) = 1), by virtue of Proposition 3.6.4, that ri ≡ rj
(mod n), a contradiction to the fact that r1 , . . . , rφ(n) is a reduced residue
system modulo n. Hence ar1 , . . . , arφ(n) is also a reduced residue system
modulo n and
∏_{i=1}^{φ(n)} (ari ) ≡ ∏_{j=1}^{φ(n)} rj ( mod n).
This gives that a^φ(n) ∏_{i=1}^{φ(n)} ri ≡ ∏_{j=1}^{φ(n)} rj (mod n). Further, (ri , n) = 1 for
each i = 1, 2, . . . , φ(n) gives that ( ∏_{i=1}^{φ(n)} ri , n ) = 1 by Proposition 3.6.6.
Consequently, by Proposition 3.6.4, a^φ(n) ≡ 1 (mod n).
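Euler's theorem — a^φ(n) ≡ 1 (mod n) whenever (a, n) = 1 — is easy to verify by brute force; here φ is computed by direct count (names are ours):

```python
from math import gcd

def phi(n):
    """Euler's totient, by direct count of integers prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

# Check a^phi(n) == 1 (mod n) for every a coprime to n
for n in range(2, 50):
    for a in range(1, n):
        if gcd(a, n) == 1:
            assert pow(a, phi(n), n) == 1
print("Euler's theorem verified for n < 50")
```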
We see more properties of the Euler function φ(n) in Section 3.11. An-
other interesting theorem in elementary number theory is Wilson’s theorem.
Theorem 3.7.5:
If u ∈ [1, m − 1] is a solution of the congruence ax ≡ 1 (mod m), then all
the solutions of the congruence are given by u + km, k ∈ Z. In particular,
there exists a unique u ∈ [1, m − 1] such that au ≡ 1(mod m).
It suffices to prove that
2 · 3 · · · (p − 2) ≡ 1 ( mod p) and p − 1 ≡ −1 ( mod p),
since the multiplication of the three congruences 1 ≡ 1, 2 · 3 · · · (p − 2) ≡ 1
and p − 1 ≡ −1 (mod p) (See Proposition 3.6.3) will yield the required result.
Now, as p (≥ 5) is an odd prime, the cardinality of L is even, where
L = {2, 3, . . . , p − 2}. For each i ∈ L, by virtue of Corollary 3.3.5, there
exists a unique j ∈ [1, p − 1] with ij ≡ 1 (mod p); moreover, j ∈ L and j ≠ i
(since i^2 ≡ 1 (mod p) would force i ≡ ±1 (mod p)). Pairing off the elements
of L with their inverses, we get
2 · 3 · · · (p − 2) ≡ 1 · 1 · · · 1 ≡ 1 ( mod p)
Example 3.7.7:
As an application of Wilson’s theorem, we prove that 712! + 1 ≡ 0 (mod 719).
Note that 719 is a prime, so that by Wilson’s theorem, 718! ≡ −1 (mod 719).
Now
718! = 718 · 717 · 716 · 715 · 714 · 713 · 712!
≡ (−1)(−2)(−3)(−4)(−5)(−6) × 712! ( mod 719)
≡ 712! × 6! ( mod 719)
≡ 712! × 720 ( mod 719)
≡ 712! × (719 + 1) ( mod 719)
≡ (712! × 719) + 712! ( mod 719)
≡ 712! ( mod 719).
Hence 712! ≡ −1 (mod 719), that is, 712! + 1 ≡ 0 (mod 719).
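Both Wilson's theorem and the computation above can be checked by reducing the factorial modulo 719 at every step (the helper name is ours):

```python
def factorial_mod(n, m):
    """n! reduced modulo m at every step, to keep numbers small."""
    r = 1
    for k in range(2, n + 1):
        r = r * k % m
    return r

# Wilson: (p-1)! == -1 (mod p) for primes p
for p in (2, 3, 5, 7, 11, 13, 719):
    assert factorial_mod(p - 1, p) == p - 1   # p - 1 is -1 mod p

# The worked example: 712! + 1 == 0 (mod 719)
print((factorial_mod(712, 719) + 1) % 719)  # 0
```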
Theorem 3.7.8:
If f (x) is a polynomial with integer coefficients, and a ≡ b(mod m), then
f (a) ≡ f (b) (mod m).
3.8 Linear Congruences and the Chinese Remainder Theorem
ax ≡ b ( mod m) (3.17)
Theorem 3.8.1:
Let (a, m) = 1 and b an integer. Then the linear congruence
ax ≡ b ( mod m) (3.18)
has a solution, and any two solutions are congruent modulo m.
ax ≡ b ( mod m)
ax ≡ 1 ( mod m)
Theorem 3.8.2:
Let (a, m) = d. Then the congruence
ax ≡ b ( mod m) (3.19)
has a solution if and only if d | b. In that case, (3.19) is equivalent to
(a/d) x ≡ (b/d) ( mod m/d),
and therefore to
a0 x ≡ b0 ( mod m0 ), (3.21)
where a0 = a/d, b0 = b/d and m0 = m/d.
Suppose there is more than one linear congruence. In general, they need
not possess a common solution. (In fact, as seen earlier, even a single linear
congruence may not have a solution.) The Chinese Remainder Theorem
ensures that if the moduli of the linear congruences are pairwise coprime,
then the simultaneous congruences all have a common solution. To start
with, consider congruences of the form x ≡ bi (mod mi ).
Theorem 3.8.3:
Let m1 , . . . , mr be positive integers that are pairwise coprime, that is, (mi , mj ) =
1 whenever i ≠ j. Let b1 , . . . , br be arbitrary integers. Then the system of
congruences
x ≡ b1 ( mod m1 )
..
.
x ≡ br ( mod mr )
x ≡ bi Mi Mi′ ( mod mi )
and, therefore,
x ≡ y ( mod mi ), 1 ≤ i ≤ r.
x ≡ y ( mod M ).
a1 x ≡ b1 ( mod m1 )
..
.
ar x ≡ br ( mod mr )
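The constructive proof of Theorem 3.8.3 translates directly into code. A sketch assuming pairwise-coprime moduli (the function name is ours; pow(Mi, -1, m) computes the inverse Mi′ and needs Python ≥ 3.8):

```python
from math import prod

def crt(residues, moduli):
    """Solve x == b_i (mod m_i) for pairwise-coprime m_i.

    Returns the unique solution modulo M = m_1 * ... * m_r.
    """
    M = prod(moduli)
    x = 0
    for b, m in zip(residues, moduli):
        Mi = M // m                # product of the other moduli
        Mi_inv = pow(Mi, -1, m)    # Mi' with Mi * Mi' == 1 (mod m)
        x += b * Mi * Mi_inv
    return x % M

x = crt([2, 3, 2], [3, 5, 7])
print(x)  # 23: x == 2 (mod 3), 3 (mod 5), 2 (mod 7)
```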
axes visible from the origin. Further, the point (2, 3) is visible from the
origin, but (2, 2) is not (See Figure 3.1). Hence we consider here lattice
points (a, b) not on the coordinate axes but visible from the origin. Without
loss of generality, we may assume that a ≥ 1 and b ≥ 1.
[Figure 3.1: Lattice points visible from the origin — (1, 1) and (2, 3) are
visible, while (2, 2) is not. Figure 3.2: the point (a′ , b′ ) on the segment
joining O to (a, b).]
Lemma 3.9.1:
The lattice point (a, b) (not belonging to any of the coordinate axes) is visible
from the origin iff (a, b) = 1.
Proof. Suppose first that (a, b) = d > 1. Then a = da′ and b = db′ for positive
integers a′ , b′ . Then the lattice point (a′ , b′ ) lies on the segment joining (0,
0) with (a, b), and since a′ < a and b′ < b, (a, b) is not visible from the
origin.
Corollary 3.9.2:
The lattice point (a, b) is visible from the lattice point (c, d) iff (a−c, b−d) =
1.
Proof. Shift the origin to (c, d) through parallel axes. Then, the new origin
is (c, d) and the new coordinates of the original point (a, b) with respect to
the new axes are (a − c, b − d). Now apply Lemma 3.9.1.
Theorem 3.9.3:
The set of lattice points visible from the origin contains arbitrarily large
square gaps. That is, given any positive integer k, there exists a lattice point
(a, b) such that none of the lattice points
(a + r, b + s), 1 ≤ r ≤ k, 1 ≤ s ≤ k,
Proof. Let {p1 , p2 , . . .} be the sequence of primes. Given the positive integer
k, construct a k by k matrix M whose first row is the sequence of first
k primes p1 , p2 , . . . , pk , the second row is the sequence of next k primes,
[Figure: the k × k square of lattice points (a + r, b + s), 1 ≤ r ≤ k,
1 ≤ s ≤ k, none of which is visible from the origin.]
Let mi (resp. Mi ) be the product of the k primes in the i-th row (resp.
column) of M . Then for i 6= j, (mi , mj ) = 1 and (Mi , Mj ) = 1 because in
the products mi and mj (resp. Mi and Mj ), there is no repetition of any
prime. Now by Chinese Remainder Theorem, the set of congruences
x ≡ −1 ( mod m1 )
x ≡ −2 ( mod m2 )
..
.
x ≡ −k ( mod mk )
y ≡ −1 ( mod M1 )
y ≡ −2 ( mod M2 )
..
.
y ≡ −k ( mod Mk )
3.10 Exercises
2. Show that the sum of the numbers less than n and prime to n is nφ(n)/2.
Definition 3.11.1:
An arithmetical function or a number-theoretical function is a function whose
domain is the set of natural numbers and codomain is the set of real or
complex numbers.
Definition 3.11.2:
The Möbius function µ(n) is defined as follows:
µ(1) = 1; if n > 1 and n = p1^a1 · · · pr^ar is the prime factorization of n, then
µ(n) = (−1)^r if a1 = a2 = · · · = ar = 1, and µ(n) = 0 otherwise.
Theorem 3.11.3:
If n ≥ 1, we have
Σ_{d|n} µ(d) = ⌊1/n⌋ = 1 if n = 1, and 0 if n > 1. (3.23)
(Recall that for any real number x, ⌊x⌋ stands for the floor of x, that is,
the greatest integer not greater than x. For example, ⌊15/2⌋ = 7.)
Proof. The result is obvious for n = 1. So let n > 1, and let n = p1^a1 · · · pr^ar
be the prime factorization of n, where each ai ≥ 1. The divisors of n for which
the µ-function has nonzero values are the numbers in the set
{p1^σ1 · · · pr^σr : σi = 0 or 1, 1 ≤ i ≤ r} =
{1; p1 , . . . , pr ; p1 p2 , p1 p3 , . . . , pr−1 pr ; . . . ; p1 p2 · · · pr }. Now µ(1) = 1; µ(pi ) =
(−1)^1 = −1; µ(pi pj ) = (−1)^2 = 1; µ(pi pj pk ) = (−1)^3 = −1 and so on.
Further, the number of terms of the form pi is C(r, 1), of the form pi pj is
C(r, 2), and so on, where C(r, k) denotes the binomial coefficient “r choose k”.
Hence if n > 1,
Σ_{d|n} µ(d) = 1 − C(r, 1) + C(r, 2) − · · · + (−1)^r C(r, r) = (1 − 1)^r = 0.
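The µ-function and Theorem 3.11.3 can be checked numerically; a trial-division sketch (the function name is ours):

```python
def mobius(n):
    """mu(n): 0 if a square divides n, else (-1)^(number of prime factors)."""
    if n == 1:
        return 1
    r, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            r += 1
        p += 1
    if n > 1:
        r += 1
    return -1 if r % 2 else 1

# Theorem 3.11.3: the sum of mu(d) over the divisors d of n is 1 iff n == 1
for n in range(1, 100):
    s = sum(mobius(d) for d in range(1, n + 1) if n % d == 0)
    assert s == (1 if n == 1 else 0)
print("Theorem 3.11.3 verified for n < 100")
```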
Theorem 3.11.4:
If n ≥ 1, we have
φ(n) = Σ_{d|n} µ(d) (n/d). (3.24)
Proof. If (n, k) = 1, then ⌊1/(n, k)⌋ = ⌊1/1⌋ = 1, while if (n, k) > 1,
⌊1/(n, k)⌋ = ⌊a positive number less than 1⌋ = 0. Hence
φ(n) = Σ_{k=1}^{n} ⌊1/(n, k)⌋ .
Hence, by Theorem 3.11.3,
φ(n) = Σ_{k=1}^{n} Σ_{d|(n,k)} µ(d) = Σ_{k=1}^{n} Σ_{d|n, d|k} µ(d). (3.25)
For a fixed divisor d of n, we must sum over all those k in the range 1 ≤ k ≤ n
which are multiples of d. Hence if we take k = qd, then 1 ≤ q ≤ n/d.
Therefore (3.25) reduces to
φ(n) = Σ_{d|n} Σ_{q=1}^{n/d} µ(d) = Σ_{d|n} µ(d) Σ_{q=1}^{n/d} 1 = Σ_{d|n} µ(d) (n/d).
Theorem 3.11.5:
If n ≥ 1, we have Σ_{d|n} φ(d) = n.
Theorem 3.11.6:
For n ≥ 2, we have
φ(n) = n ∏_{p|n, p a prime} (1 − 1/p). (3.27)
Proof. We use the formula φ(n) = Σ_{d|n} µ(d) (n/d) of Theorem 3.11.4 for the
proof. Let p1 , . . . , pr be the distinct prime factors of n. Then
∏_{p|n, p a prime} (1 − 1/p) = ∏_{i=1}^{r} (1 − 1/pi )
= 1 − Σ_i 1/pi + Σ_{i≠j} 1/(pi pj ) − Σ 1/(pi pj pk ) + · · · , (3.28)
where, for example, the sum Σ 1/(pi pj pk ) is formed by taking distinct prime
divisors pi , pj and pk of n. Now, by definition of the µ-function, µ(pi ) = −1,
µ(pi pj ) = 1, µ(pi pj pk ) = −1 and so on. Hence the sum on the right side of
(3.28) is equal to
1 + Σ_{pi} µ(pi )/pi + Σ_{pi ,pj} µ(pi pj )/(pi pj ) + · · · = Σ_{d|n} µ(d)/d ,
since all the other divisors of n, that is, divisors which are not products of
distinct primes, contain a square and hence their µ-values are zero. Thus
n ∏_{p|n, p a prime} (1 − 1/p) = Σ_{d|n} µ(d) (n/d) = φ(n) (by Theorem 3.11.4).
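Theorem 3.11.6 gives a practical way to compute φ(n). To keep the arithmetic exact, the sketch below applies each factor (1 − 1/p) as the integer step result → result − result/p (the function name is ours):

```python
def phi(n):
    """Euler's totient via phi(n) = n * prod(1 - 1/p) over primes p | n."""
    result, p = n, 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p  # multiply by (1 - 1/p), exactly
        p += 1
    if n > 1:                      # one prime factor left over
        result -= result // n
    return result

print(phi(10), phi(12), phi(7))  # 4 4 6
```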
(v) φ(n) is even for n ≥ 3. Moreover, if n has k distinct odd prime factors,
then 2^k | φ(n).
(ii) We have φ(mn) = mn ∏_{p|mn, p a prime} (1 − 1/p). If p is a prime that divides
mn, then p divides either m or n. But then there may be primes which
divide both m and n and these are precisely the prime factors of (m, n).
∏_{p|mn} (1 − 1/p) = [ ∏_{p|m} (1 − 1/p) · ∏_{p|n} (1 − 1/p) ] / ∏_{p|(m,n)} (1 − 1/p)
= [ (φ(m)/m) · (φ(n)/n) ] / (φ(d)/d) (by the product formula)
= (1/mn) φ(m)φ(n) (d/φ(d)), where d = (m, n).
This gives the required result, since the term on the left is φ(mn)/mn.
An Application
We prove that the determinant of the n × n matrix S whose (i, j)-th entry is
(i, j), the gcd of i and j, equals φ(1)φ(2) · · · φ(n). Let A = (aij ) be the n × n
matrix with aij = 1 if i | j and aij = 0 otherwise, and let D be the diagonal
matrix with diagonal entries φ(1), φ(2), . . . , φ(n).
Then A is an upper triangular matrix (See Definition ??) with all diagonal
entries equal to 1. Hence det A = 1 = det At . Set S = At DA. Then
det S = det At · det D · det A
= 1 · (φ(1)φ(2) · · · φ(n)) · 1
= φ(1)φ(2) · · · φ(n).
We now show that S = (sij ), where sij = (i, j). This would prove our
statement.
Now At = (bij ), where bij = aji . Hence if D is the matrix (dαβ ), then the
(i, j)-th entry of S is given by:
sij = Σ_{α=1}^{n} Σ_{β=1}^{n} biα dαβ aβj .
3.12 Exercises
3. Prove that the sum of the positive integers less than n and prime to n
is nφ(n)/2.
4. Let σ(n) denote the sum of the divisors of n. Prove that σ(n) is multi-
plicative. Hence prove that if n = p1^a1 · · · pr^ar is the prime factorization
of n, then
σ(n) = ∏_{i=1}^{r} (pi^(ai+1) − 1)/(pi − 1).
The big O notation is used mainly to express an upper bound for a given
arithmetical function in terms of another, simpler arithmetical function.
Definition 3.13.1:
Let f : N → C be an arithmetical function. Then f (n) is O(g(n)) (read big
O of g(n)), where g(n) is another arithmetical function, provided that there
exists a constant K > 0 such that |f (n)| ≤ K|g(n)| for all n ∈ N.
More generally, we have the following definition for any real-valued func-
tion.
Definition 3.13.2:
Let f : R → C be a real valued function. Then f (x) = O g(x) , where
g : R → C is another function if there exists a constant K > 0 such that
Definition 3.13.3:
Let f : N → C be an arithmetical function. Then f (n) is O(g(n)), where
g(n) is another arithmetical function if there exists a constant K > 0 such
that
Definition 3.13.4:
Let n1 , . . . , nr be positive integers and let ni be a ki -bit integer (so that the
size of ni is ki ), 1 ≤ i ≤ r. An algorithm to perform a computation involving
n1 , . . . , nr is said to be a polynomial–time algorithm if there exist nonnegative
integers m1 , . . . , mr such that the number of bit operations required to perform
the algorithm is O(k1^m1 · · · kr^mr ).
Recall that the size of a positive integer is the number of bits in it. For
instance, 8 = (1000) and 9 = (1001) in binary. So both 8 and 9 are of size 4.
In fact, all numbers n such that 2^(k−1) ≤ n < 2^k are k-bit numbers. Taking
logarithms to base 2, we get k − 1 ≤ log2 n < k and hence k − 1 ≤ ⌊log2 n⌋ < k,
so that ⌊log2 n⌋ = k − 1. Thus k = 1 + ⌊log2 n⌋ and hence k is O(log2 n). Thus
we have proved the following result.
Theorem 3.13.5:
The size of n is O(log2 n).
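In Python this size is exactly int.bit_length(), which can be checked against 1 + ⌊log2 n⌋:

```python
from math import floor, log2

# The size (number of bits) of n equals 1 + floor(log2 n).
for n in (1, 8, 9, 1023, 1024):
    assert n.bit_length() == 1 + floor(log2(n))

print((8).bit_length(), (9).bit_length())  # 4 4 -- both have size 4
```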
Note that in writing O(log n), the base of the logarithm is immaterial.
For, if the base is b, then any number that is O(log2 n) is O(logb n) and vice
versa. This is because log2 n = logb n · log2 b, and log2 b can be absorbed in
the constant K of Definition 3.13.1.
Example 3.13.6:
Let g(n) be a polynomial of degree t. Then g(n) is O(nt ).
Theorem 3.13.7:
Euclid’s algorithm is a polynomial time algorithm.
Proof. We show that Euclid’s algorithm of computing the gcd (a, b), a > b,
can be performed in time O(log3 a) .
Adopting the same notation as in (3.5), we have
rj+2 < (1/2) rj
for each j. This means that the remainder in every other step in the
Euclidean algorithm is less than half of the original remainder. Hence if
a = O(2^k ), then there are at most O(k) steps in the Euclidean algorithm.
Each step is a division, which takes O(log^2 a) bit operations; as k = O(log a),
the total is O(log^3 a) bit operations.
Suppose now that we wish to compute a^c (mod m), where c is a k-bit
number with binary digits bk−1 , bk−2 , . . . , b0 (so that bk−1 = 1). Then
c = bk−1 2^(k−1) + bk−2 2^(k−2) + · · · + b0 2^0 , and therefore
a^c = a^(bk−1 2^(k−1)) · a^(bk−2 2^(k−2)) · · · a^(b1 · 2) · a^(b0) ,
where each bi = 0 or 1. We now compute a^c (mod m) recursively by reducing
the number computed at each step modulo m. Set
y0 = a^(bk−1) = a.
There are k − 1 steps in the algorithm. Note that yi+1 is computed by squaring
yi and multiplying the resulting number by 1 if bk−i−2 = 0, or else multiplying
the resulting number by a if bk−i−2 = 1. Now yi (mod m) being an O(log m)-bit
number, to compute yi^2 we make O(log^2 m) = O(t^2) bit operations, where
t = O(log m). yi being a t-bit number, yi^2 is a 2t- or (2t + 1)-bit number and
so it is also an O(t)-bit number. Now we reduce yi^2 modulo m, that is, we
divide the O(t)-bit number yi^2 by the O(t)-bit number m. Hence this requires
an additional O(t^2) bit operations. Thus in all we have performed until now
O(t^2) + O(t^2) bit operations, that is, O(t^2) bit operations. Having computed
yi^2 (mod m), we next multiply it by a^0 or a^1. As a is an O(log m) = O(t)-bit
number, this requires O(t^2) bit operations. Thus in all, computation of yi+1
from yi requires O(t^2) bit operations. But then there are k − 1 = O(log c)
steps in the algorithm. Thus the number of bit operations in the computation
of a^c (mod m) is O(log c · log^2 m) = O(kt^2). Thus the algorithm is a polynomial–
time algorithm.
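The repeated-squaring scheme just analyzed can be sketched as follows; Python's built-in pow(a, c, m) performs the same computation:

```python
def power_mod(a, c, m):
    """Compute a^c (mod m) by scanning the bits of c from the top."""
    y = 1
    for bit in bin(c)[2:]:       # b_{k-1}, ..., b_0
        y = y * y % m            # squaring step
        if bit == '1':
            y = y * a % m        # multiply by a when the bit is 1
    return y

print(power_mod(7, 712, 719) == pow(7, 712, 719))  # True
```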
Next, we give an algorithm that is not a polynomial–time algorithm.
(Sieve of Eratosthenes)
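The sieve named here lists all primes up to a bound by repeatedly crossing out the multiples of each prime found; a standard sketch (the function name is ours):

```python
def sieve(limit):
    """Sieve of Eratosthenes: return all primes <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # cross out multiples of p, starting at p*p
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n in range(2, limit + 1) if is_prime[n]]

print(sieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```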
Chapter 4
Mathematical Logic
The study of logic can be traced back to the ancient Greek philosopher Aris-
totle (384–322 B.C.). Modern logic started seriously in the mid-19th century
mainly due to the British mathematicians George Boole and Augustus de
Morgan. The German mathematician and philosopher Gottlob Frege (1848–
1925) is widely regarded as the founder of modern mathematical logic. Logic
is implicit in every form of common reasoning. It is concerned with the rela-
tionships between language (syntax), reasoning (deduction and computation)
and meaning (semantics). A simple and popular definition of logic is that it
is the analysis of the methods of reasoning. In the study of these methods,
logic is concerned with the form of arguments rather than the contents or the
meanings associated with the statements. To illustrate this point, consider
the following arguments:
(a) All men are mortal. Socrates is a man. Therefore Socrates is mortal.
Both (a) and (b) have the same form: All A are B; x is an A; therefore x
is a B. The truth or the falsity of the premise and the conclusion is not the
primary concern. Again, consider the following pattern of argument:
4.1 Preliminaries
“ 2 is an even number”
“ 52! is always less than 100!”
“If x, y and z are the sides of a triangle, then x + y = z”
“Heterological is heterological”.
p  ¬p          p  q  p∧q        p  q  p∨q
⊤  ⊥           ⊥  ⊥  ⊥          ⊥  ⊥  ⊥
⊥  ⊤           ⊥  ⊤  ⊥          ⊥  ⊤  ⊤
               ⊤  ⊥  ⊥          ⊤  ⊥  ⊤
               ⊤  ⊤  ⊤          ⊤  ⊤  ⊤
realize that money is sufficient to buy and therefore own a car. By a similar
reasoning, the implication, “I have money” implies “I own a car” appears to
be true. The implication, “I don’t have money” implies “I don’t own a car”
appears not to be a false statement. The implication, “I don’t have money”
implies “I own a car” is not clear. If money is a necessary prerequisite to
buy, and therefore own a car, then this last implication is false. But since p
and q are unrelated, we can, in a relaxed manner, reason that in the absence
of money also, owning a car may be possible. We make this allowance and
take the implication to be true.
If p ⇒ q is true then p is said to be a stronger assertion than q. For
example, consider the implications,
Example 4.1.1:
f (x) = x^2 ⇒ f ′ (x) = 2x has the contrapositive form f ′ (x) ≠ 2x ⇒ f (x) ≠ x^2
Example 4.1.2:
a ≥ 0 ⇒ √a is real has the contrapositive form: √a is not real ⇒ a < 0.
The biconditional of p and q has the truth-value ⊤ if p and q both have the
truth-value ⊤ or both have the truth-value ⊥; otherwise, the biconditional
has truth-value ⊥.
Truth Values
Using the different logical connectives introduced above, given any set of
propositional variables, we can form meaningful fully parenthesized proposi-
tions or formulas. Below, we first define the rules which describe the syntactic
structure of such formulas:
(ii) Assertions denoted by small letters such as p, q etc., are atomic propo-
sitions or formulas (which are defined to take values ⊥ or ⊤).
(iv) If P and Q are any propositions then so are (P ∧ Q), (P ∨ Q), (P ⊕ Q),
(P ⇒ Q) and (P ⇔ Q).
Note:
(ii) When there is no confusion, we relax the above rules and we will denote
(P ) simply by P , (R ∧ S) simply by R ∧ S etc.
The rules given above specify the syntax for propositions formed using
the different logical connectives. The meaning or semantics of such formulas
is given by specifying how to systematically evaluate any fully parenthesized
proposition. The following cases arise in evaluating a fully parenthesized
proposition:
Case 3. The value of a formula with more than one operator is found by
repeatedly applying the Case 2 to the subformulas and replacing every
subformula by its value until the given proposition is reduced to ⊥ or
⊤.
Example 4.2.1:
Consider evaluating (T ∧T ) ⇒ F . We first substitute the values for T and
F and get the proposition (⊤ ∧ ⊤) ⇒ ⊥ . This is first reduced to (⊤ ⇒ ⊥)
by evaluating (⊤ ∧ ⊤) and then is reduced to ⊥.
Example 4.2.2:
To construct the truth table for the proposition (p ⇒ q) ∧ (q ⇒ p) , we
proceed in stages by assigning all possible combinations of truth values to the
propositions p and q. We then evaluate the “inner” propositions and finally
determine the truth-value of the given proposition. This is summarized in
the following truth table which contains the intermediate results also.
p q p⇒q q⇒p (p ⇒ q) ∧ (q ⇒ p)
⊥ ⊥ ⊤ ⊤ ⊤
⊥ ⊤ ⊤ ⊥ ⊥
⊤ ⊥ ⊥ ⊤ ⊥
⊤ ⊤ ⊤ ⊤ ⊤
We note that the given formula has truth value ⊤ whenever p and q have
identical truth values.
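Truth tables such as the one above can be generated mechanically; a sketch using Python booleans (True for ⊤, False for ⊥; the helper implies is ours):

```python
from itertools import product

def implies(p, q):
    """Material implication: p => q is (not p) or q."""
    return (not p) or q

# rows in the order ⊥⊥, ⊥⊤, ⊤⊥, ⊤⊤, as in the table above
for p, q in product([False, True], repeat=2):
    print(p, q, implies(p, q) and implies(q, p))
```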
Example 4.2.3:
To construct the truth table for the proposition (p ∧ q) ∨ (¬ r) ⇔ p we
have to consider all combinations of truth values to p, q and r. This results
in the following table.
p q r (p ∧ q) (¬ r) (p ∧ q) ∨ (¬ r) (p ∧ q) ∨ (¬ r) ⇔ p
⊥ ⊥ ⊥ ⊥ ⊤ ⊤ ⊥
⊥ ⊥ ⊤ ⊥ ⊥ ⊥ ⊤
⊥ ⊤ ⊥ ⊥ ⊤ ⊤ ⊥
⊥ ⊤ ⊤ ⊥ ⊥ ⊥ ⊤
⊤ ⊥ ⊥ ⊥ ⊤ ⊤ ⊤
⊤ ⊥ ⊤ ⊥ ⊥ ⊥ ⊥
⊤ ⊤ ⊥ ⊤ ⊤ ⊤ ⊤
⊤ ⊤ ⊤ ⊤ ⊥ ⊤ ⊤
The last two examples above illustrate that, given a propositional formula
we can determine its truth value by considering all possible combinations of
truth values to its constituent atomic variables. On the other hand given the
propositional variables (such as p, q, r, etc.), for each possible combination of
truth values to these variables we can define a functional value by assigning
a corresponding truth value (conventionally denoted in Computer Science
by 0 or 1) to the function. This corresponds to the definition of a Boolean
function.
Definition 4.2.4:
A Boolean function is a function f : {0, 1}n → {0, 1}, where n ≥ 1.
Definition 4.2.5:
A truth-assignment A is a function from the set of atomic variables to the
set {⊥, ⊤} of truth values.
Definition 4.2.6:
Let A be any function from the set of atomic variables to {⊥, ⊤}. Then, we
extend the notion of A as follows:
1. A((P ∧ Q)) = ⊤ if A(P ) = ⊤ and A(Q) = ⊤
= ⊥ otherwise.
2. A((P ∨ Q)) = ⊤ if A(P ) = ⊤ or A(Q) = ⊤
= ⊥ otherwise.
3. A((¬ R)) = ⊤ if A(R) = ⊥
= ⊥ otherwise.
4. A((P ⊕ Q)) = ⊤ if A(P ) ≠ A(Q)
= ⊥ otherwise.
5. A((P ⇒ Q)) = ⊤ if A(P ) = ⊥ or A(Q) = ⊤
= ⊥ otherwise.
6. A((P ⇔ Q)) = ⊤ if A(P ) = A(Q)
= ⊥ otherwise.
cepts
for this problem but no one has proved that an efficient solution does not
exist. In classical complexity theory, the problem is termed NP-Complete.
A formula P is valid if every truth-assignment appropriate to P , verifies
P . We then call P , a tautology or a universally valid formula.
Trivially T is a tautology and F is not. It easily follows that the proposi-
tion (P ∨ ¬ P ) is a tautology. It is easy to see that if S and T are equivalent,
then (S ⇔ T ) is a tautology.
A tautology can be established by constructing a truth table as the fol-
lowing example shows.
Example 4.3.1:
To show that (¬ p) ⇒ (p ⇒ q) is a tautology.
We do this by constructing the following truth-table:
p q (¬ p) (p ⇒ q) ¬ p ⇒ (p ⇒ q)
⊥ ⊥ ⊤ ⊤ ⊤
⊥ ⊤ ⊤ ⊤ ⊤
⊤ ⊥ ⊥ ⊥ ⊤
⊤ ⊤ ⊥ ⊤ ⊤
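A tautology check is just a sweep over all rows of the truth table; a sketch (names are ours):

```python
from itertools import product

def is_tautology(formula, num_vars):
    """formula: a function of num_vars booleans; True if it always holds."""
    return all(formula(*vals)
               for vals in product([False, True], repeat=num_vars))

# (not p) => (p => q), writing "A => B" as "(not A) or B"
f = lambda p, q: (not (not p)) or ((not p) or q)
print(is_tautology(f, 2))  # True

# p => q alone is not a tautology
print(is_tautology(lambda p, q: (not p) or q, 2))  # False
```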
(i) Addition: p ⇒ (p ∨ q)
(ii) Simplification: (p ∧ q) ⇒ p
(iii) Modus Ponens: p ∧ (p ⇒ q) ⇒ q
(iv) Modus Tollens: (p ⇒ q) ∧ ¬ q ⇒ ¬ p
(v) Disjunctive Syllogism: ¬ p ∧ (p ∨ q) ⇒ q
(vi) Hypothetical Syllogism: (p ⇒ q) ∧ (q ⇒ r) ⇒ (p ⇒ r)
Definition 4.3.2:
Two propositional formulas P and Q are said to be equivalent (and we write
P ≡ Q), if and only if they are assigned the same truth-value by every truth
assignment appropriate to both.
Example 4.3.3:
We show that it is possible to use the laws of equivalence to check if a formula
is a tautology. Consider the formula P where,
P = (¬ (p ⇒ q) ∧ ¬ (¬ p ⇒ (q ∨ r))) ⇒ (¬ q ⇒ r).
Successive applications of the laws of equivalence reduce P to
¬p ∨ p ∨ q ∨ r,
which is a tautology.
Let P ≡ Q and let R be any formula involving P ; say, R = (A ∨ P ) ∧
¬ (C ∧ P ) . In the process of the evaluation of R, every occurrence of P
will result in a truth-value. In each such occurrence of P in R, the same
truth-value will result if any (all) occurrence(s) of P is (are) replaced by Q
in R. Thus, in general, if R′ is the formula that results by replacing some
occurrence of P by Q within R, then we will have R ≡ R′ .
Definition 4.4.1:
A formula is in conjunctive normal form (CNF) if it is a conjunction of
disjunctions of literals. Similarly, a formula is in disjunctive normal form
(DNF) if it is a disjunction of conjunctions of literals.
For example, the formula (a ∧ b) ∨ (¬ a ∧ c) is in DNF and the formula
(a ∨ b) ∧ (b ∨ c ∨ ¬ a) ∧ (a ∨ c) is in CNF. The formulas (a ∨ b) and (a ∧ ¬ c)
are both in CNF as well as in DNF.
(Notation: if F = ¬ G, then let F ′ = G; if F is not a negation of any
formula, let F ′ = ¬ F ; this fits well with the case when F is atomic, say A; then
A′ = ¬ A and ¬ A′ = A)
Theorem 4.4.2:
Every formula has at least one CNF and at least one DNF representation.
Furthermore, there is an algorithm that transforms any given formula into a
CNF or a DNF as desired.
Proof. We argue by induction on the number k of occurrences of logical connectives in the formula.
(a) F = ¬G for some formula G with k or fewer occurrences of logical connectives. Take a CNF of G, say H1 ∧ · · · ∧ Hm, where each Hi is a disjunction of literals li1 ∨ · · · ∨ lik. By De Morgan's laws,
(l′11 ∧ · · · ∧ l′1k) ∨ · · · ∨ (l′m1 ∧ · · · ∧ l′mk)
is a DNF of F. Similarly a CNF of F can be obtained from a DNF of G.
(b) F = (G1 ∨ G2) for some formulas G1 and G2, each of which has k or fewer occurrences of logical connectives. To obtain a DNF of F we simply take the disjunction of a DNF of G1 and a DNF of G2. To obtain a CNF of F, find CNFs of G1 and G2, say ∧_{i=1}^{m} Hi and ∧_{j=1}^{n} Jj, where H1, . . . , Hm and J1, . . . , Jn are disjunctions of literals; then, by the distributive law,
∧_{i=1}^{m} ∧_{j=1}^{n} (Hi ∨ Jj)
is a CNF of F.
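The distributive step in case (b) is easy to mechanize. A sketch in Python, with a CNF represented as a list of clauses and each clause as a frozenset of literal strings (the encoding is ours):

```python
from itertools import product

def cnf_of_disjunction(cnf1, cnf2):
    """CNF of (G1 ∨ G2) from CNFs of G1 and G2, by the distributive law."""
    return [h | j for h, j in product(cnf1, cnf2)]

g1 = [frozenset({"a"}), frozenset({"b", "~c"})]  # CNF of a ∧ (b ∨ ¬c)
g2 = [frozenset({"d"})]                          # CNF of d
# CNF of (a ∧ (b ∨ ¬c)) ∨ d: clauses (a ∨ d) and (b ∨ ¬c ∨ d)
print(cnf_of_disjunction(g1, g2))
```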
4.5 Compactness
Theorem 4.5.1:
A set of formulas is satisfiable if and only if each of its finite subsets is
satisfiable. (Alternatively, if a set of formulas is unsatisfiable, some finite
subset of the given set must be unsatisfiable.)
be defined on all the atomic subformulas of S. We, however, claim that from
{A0 , A1 , A2 , . . .} we can construct a truth-assignment A that does verify
S. For each n ≥ 0, this construction specifies how to build a set Un of
truth-assignments. The construction specifies:
4.6 The Resolution Calculus
We first introduce the idea of a clause and a clause set. Consider a CNF
formula F = (p ∨ q) ∧ (¬ r ∨ s ∨ t) . There are other equivalent ways of
writing F e.g., (q ∨ p) ∧ (¬ r ∨ t ∨ s) . All such other syntactical forms of
F and F itself can therefore be regarded as a set of sets of literals. For example, F is captured by the set {{p, q}, {t, s, ¬r}}.
Formally, a clause is a finite set of literals. Each disjunction of literals corresponds to a clause and each nonempty clause corresponds to one or more disjunctions of literals. We also allow the empty set as a clause, in which case it does not correspond to any formula. We write □ for the empty clause.
A clause set is a set of clauses (the empty clause allowed); a clause set may itself be empty or infinite. Every CNF formula naturally corresponds to a clause set and
every finite clause set not containing the empty clause and not itself empty
corresponds to one or more formulas in CNF. The empty clause set is not
the same as the empty clause, although they are identical when considered
as sets. We write ∅ to denote the empty clause set.
We can carry forward the notion of a truth-assignment A as appropriate
to a clause set: if S is a clause set and every atomic formula in S is in the domain of A, we say that A is appropriate to S.
Example 4.6.1:
Let S be the clause set {{p, q}, {¬r}}. If a truth-assignment A verifies p,
verifies ¬ q and verifies ¬ r, then, A also verifies S. On the other hand, if A
verifies all of p, q and r then it follows that A does not verify S.
must also be verified by the same truth-assignment. In fact, the clause set {{p, q, ¬r}, {q, r, s}, {p, q, s}} can be shown to be equivalent to S. We say that the clause {p, q, s} is a resolvent of the clauses {p, q, ¬r} and {q, r, s}.
Definition 4.6.2:
Let C1 and C2 be two clauses. Then the clause D is a resolvent of C1 and C2 if and only if for some literal l we have l ∈ C1 and ¬l ∈ C2 and D = (C1 \ {l}) ∪ (C2 \ {¬l}).
Let S be a clause set with two or more elements and let D be a resolvent of
any two clauses in S. Then S ≡ S ∪ {D}.
Starting with a clause set S, the resolution rule allows us to build a new
clause set S ′ by adding all resolvents to S (in the sense that S and S ′ are
equivalent). Now we can replace S by S ′ and add new resolvents again and
we can repeat this procedure as long as it is possible. We precisely formalize
this in the following:
Let S be a finite clause set with two or more elements. We define AddRes(S) = S ∪ {D : D is a resolvent of two clauses in S}, and then AddRes^0(S) = S, AddRes^{i+1}(S) = AddRes(AddRes^i(S)) for i ≥ 0, and AddRes⋆(S) = ∪_{i≥0} AddRes^i(S).
In other words, AddRes⋆ (S) is the closure of S under the operation AddRes
of adding all resolvents of clauses already present. Since each clause in S is
finite, there are only a finite number of clauses that can be formed by the
atomic formulas appearing in S. Hence only a finite number of resolvents
can ever be formed starting from a finite S. So there exists an i > 0 such
that, AddResi+1 (S) =AddResi (S). Then AddRes⋆ (S) = AddRes i (S). By
induction, it follows that S ≡ AddRes⋆ (S).
Determining AddRes⋆ (S) by repeated application of the resolution rule
can be used to find out whether a clause set is satisfiable. In turn, this
implies that we can have a computational procedure to determine whether a
formula is satisfiable. This is possible because of the following theorem:
Strictly speaking, not all the intermediate resolvents in AddRes⋆ (S) are really needed to check whether □ ∈ AddRes⋆ (S). The following example is
illustrative.
Example 4.6.4:
Show that the following formula is unsatisfiable:
(A ∨ B) ∧ (A ∨ C) ∧ (B ∨ C) ∧ (¬A ∨ ¬B) ∧ (¬A ∨ ¬C) ∧ (¬B ∨ ¬C)
Starting with the given clause set, a self-explanatory "tree diagram" of resolution steps yields, among others, the intermediate resolvents {C} and {¬C}, whose resolvent □ therefore belongs to AddRes⋆ (S).
The following exercise gives an important result where the resolution tech-
nique holds the key.
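The saturation AddRes⋆ and the unsatisfiability test of Example 4.6.4 can be sketched as a small program; clauses are frozensets of literal strings with a leading "~" for negation (our encoding), and the empty frozenset plays the role of □:

```python
from itertools import combinations

def neg(l):
    return l[1:] if l.startswith("~") else "~" + l

def resolvents(c1, c2):
    """All resolvents D = (C1 \\ {l}) ∪ (C2 \\ {¬l}) of two clauses."""
    return {(c1 - {l}) | (c2 - {neg(l)}) for l in c1 if neg(l) in c2}

def unsatisfiable(clauses):
    """Saturate under resolution; S is unsatisfiable iff □ is derived."""
    s = set(clauses)
    while True:
        new = {d for c1, c2 in combinations(s, 2) for d in resolvents(c1, c2)}
        if frozenset() in new:
            return True          # the empty clause □ was derived
        if new <= s:             # AddRes(S) = S: the closure is reached
            return False
        s |= new

# The clause set of Example 4.6.4
S = [frozenset(c) for c in [{"A", "B"}, {"A", "C"}, {"B", "C"},
                            {"~A", "~B"}, {"~A", "~C"}, {"~B", "~C"}]]
print(unsatisfiable(S))  # True
```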
During program execution, when the if-statement is encountered, the current value of x is substituted to determine whether (x > 3) evaluates to true or false (and then the program control is transferred conditionally).
Both the above assertions are true and hence these are propositions. To
succinctly express the above, we need two special symbols ∀ (read for-all )
and ∃ (read there-exists) respectively known as universal quantifier and ex-
istential quantifier . We can now write the above propositions as,
∀x ((x mod 2 = 0) ⇒ (x > 1))
∀x ∃y ∃z ((x mod 2 = 0) ⇒ (y + z = x))
∀x (M(x) ⇒ ∃y (T(y) ∧ D(x, y)))
—this says, "for all x, if x is a man then there exists a y such that y is a truck and x drives y." In other words, it says "every man drives at least one truck". The following formula,
∀y (T(y) ⇒ ∃x (M(x) ∧ D(x, y)))
says that every truck is driven by at least one man. We remark that parentheses add clarity. For example, the formula ¬T(y) ∧ D(x, y) can be interpreted in two ways. It can denote (¬T(y)) ∧ D(x, y), which means "y is not a truck and x drives y". Alternately, it can denote ¬(T(y) ∧ D(x, y)), which means "it is not the case that y is a truck that x drives".
More generally, we can make statements about relations that are not
explicitly specified. For example, ∀x∀y (P(x, y) ⇔ P(y, x)) simply states that P is a symmetric relation; the formula ∃x ¬P(x, x) can be interpreted to mean that P is not reflexive.
Apart from predicate signs and variables, predicate calculus also allows function signs. Thus, if f is a function sign corresponding to a binary function f(x, y), where x and y are variables (denoting objects in a fixed universe), then ∀x∀y P f xy f yx (which may be informally written as)
∀x∀y P(f(x, y), f(y, x))
is a legal formula.
Formulas in Predicate Calculus turn out to be true or false depending on
the interpretation of predicates and function signs. Thus ∃x ¬P xx is true if and only if P is interpreted as a binary relation that is not reflexive. Also, the formula ∀x∀y∀z ((P xy ∧ P yz) ⇒ P xz) is true if and only if P is interpreted as a transitive relation. In the universe of discourse U, for any predicate P and for any element m ∈ U, the formula (∀x P(x)) ⇒ P(m) is always true.
In the sequel, we will prefer to use the informal style of writing the formulas
to provide some clarity.
(c) If F is any formula and x is a variable then ∀xF and ∃xF are
formulas.
Thus the truth value of the predicate depends on the values of m, n and x but not on the value of i. It is obvious that the truth value of the predicate does not change if all occurrences of i are replaced by a new variable, say j. The variable i is bound to the quantifier ∀ in the predicate. The variables m, n and x are free in the predicate. A predicate such as (i > 0) ∧ (∀i(x ∗ i > 0)) can be confusing. In such cases we rewrite the predicate to remove the ambiguity. For example, we can rewrite this last predicate as (i > 0) ∧ (∀j(x ∗ j > 0)).
More precisely, the free variables of a formula are defined inductively as follows:
(a) The free variables of an atomic formula are all the variables occurring in
it.
(b) The free variables of the formulas (F ∨G) or (F ∧G) are the free variables
of F and the free variables of G; the free variables of ¬F are the free variables of F.
(c) The free variables of ∀xF and ∃xF are the free variables of F , except for
x (if x happens to be a free variable in F ).
When there are no free occurrences of any variable in a formula, the formula
is called a closed formula or sentence.
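The inductive clauses (a)–(c) translate directly into a recursive function. A sketch, using a nested-tuple representation of formulas that is our own choice:

```python
# Formulas as nested tuples, e.g. ("forall", "x", body); atomic formulas
# carry their variable names.  This AST encoding is ours, for illustration.
def free_vars(f):
    """Free variables of a formula, following clauses (a)-(c)."""
    op = f[0]
    if op == "atom":                  # (a) all variables of an atomic formula
        return set(f[2])
    if op == "not":                   # (b) ¬F
        return free_vars(f[1])
    if op in ("and", "or"):           # (b) F ∧ G, F ∨ G
        return free_vars(f[1]) | free_vars(f[2])
    if op in ("forall", "exists"):    # (c) quantifiers bind their variable
        return free_vars(f[2]) - {f[1]}
    raise ValueError(op)

# ∀x (P(x, y) ∨ Q(x))  -- only y is free
f = ("forall", "x", ("or", ("atom", "P", ("x", "y")), ("atom", "Q", ("x",))))
print(free_vars(f))  # {'y'}
```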
4.9 Semantics of Predicate Calculus
4.9.1 Structures
A structure A is a pair ([A], I_A), where [A] is any nonempty set called the universe of A and I_A is a function whose domain is a set of predicate and function signs. Specifically, I_A assigns to each n-place predicate sign P an n-ary relation P^A ⊆ [A]^n, and to each n-place function sign f an n-ary function f^A : [A]^n → [A].
Example 4.9.1:
Let P be a 2-place predicate sign and f a 1-place function sign, and let F be the formula ∀x P(x, f(x)). Consider the structure A with universe [A] = N, with P^A = {(m, n) | m, n ∈ N and m < n}, and with f^A(n) = n + 1, the successor function.
We regard F as true in the structure A since every number is less than its successor. If we define a new structure B which is the same as A except that P^B = {(m, n) | m, n ∈ N and m > n}, then F is false in B.
The formula P (f (x), y) cannot be regarded as true or false in A or in B
without knowing what x and y are.
The next subsection formalizes these ideas.
This completes the task of evaluating the truth values of formulas in predicate calculus.
We write A ⊨_ξ G if and only if A(G)_ξ = ⊤.
Example 4.9.2:
Let us consider the structure A of Example 4.9.1 and the formula P (x, f (y)).
Let ξ be the function from {x, y} to [A] = N such that ξ(x) = 1 and ξ(y) = 2.
Example 4.9.3:
For the structure A and the function ξ as in Example 4.9.2, let us consider the formula ∀x P(x, f(x)). By definition, A(∀x P(x, f(x)))_ξ = ⊤ if and only if A(P(x, f(x)))_{ξ[x/a]} = ⊤ for each a ∈ [A] = N.
Now A(P(x, f(x)))_{ξ[x/a]} = ⊤ if and only if (A(x)_{ξ[x/a]}, A(f(x))_{ξ[x/a]}) ∈ P^A. We see that A(x)_{ξ[x/a]} = a and A(f(x))_{ξ[x/a]} = f^A(A(x)_{ξ[x/a]}) = f^A(a) = a + 1. Since (a, a + 1) ∈ P^A for every a ∈ N, we conclude that A ⊨_ξ ∀x P(x, f(x)).
Example 4.9.4:
Let L be a 2-place predicate sign and f a 2-place function sign, and let x and y be variables. Consider a structure A with
[A] = N,    f^A(m, n) = m + n.
Then A(∀x∀y L(x, f(x, y)))_ξ = ⊤ if and only if
A(∀y L(x, f(x, y)))_{ξ[x/a]} = ⊤ for each a ∈ [A].
4.10 Equivalences in Predicate Calculus
Two given formulas F and G are equivalent if and only if, for every structure A and function ξ appropriate to both F and G, A(F)_ξ = A(G)_ξ. As in propositional calculus, we write F ≡ G if F and G are equivalent.
It should be obvious that ≡ is an equivalence relation on the set of sen-
tences. All the laws of equivalence in propositional calculus (seen earlier) continue to hold in predicate calculus. In addition, we have the following:
(a) For any formula F and any variable x,
¬∀xF ≡ ∃x¬F
¬ ∃xF ≡ ∀x¬ F
(b) For any formulas F and G variable x, such that x has no occurrence in
G we have,
(∀xF ∨ G) ≡ ∀x(F ∨ G)
(∀xF ∧ G) ≡ ∀x(F ∧ G)
(∃xF ∨ G) ≡ ∃x(F ∨ G)
(∃xF ∧ G) ≡ ∃x(F ∧ G)
Proof. (a) We prove that ¬∀xF ≡ ∃x¬F. The other equivalence can be proved in a similar way. For any structure A and appropriate ξ,
A(¬∀xF)_ξ = ⊤ if and only if A(∀xF)_ξ = ⊥, i.e., if and only if A(F)_{ξ[x/a]} = ⊥ for some a ∈ [A], i.e., if and only if A(¬F)_{ξ[x/a]} = ⊤ for some a ∈ [A], i.e., if and only if A(∃x¬F)_ξ = ⊤.
(b) We prove that (∀xF ∨ G) ≡ ∀x(F ∨ G). Now A((∀xF ∨ G))_ξ = ⊤ if and only if A(F)_{ξ[x/a]} = ⊤ for each a ∈ [A] or A(G)_ξ = ⊤. Since x has no occurrence in G, A(G)_ξ = A(G)_{ξ[x/a]} for every a ∈ [A]; hence the condition holds if and only if A((F ∨ G))_{ξ[x/a]} = ⊤ for each a ∈ [A], that is, if and only if A(∀x(F ∨ G))_ξ = ⊤. The remaining equivalences are proved similarly.
A formula is in prenex form (or prenex normal form) if and only if all quantifiers (if any) occur at the extreme left without intervening parentheses. The prenex form is
Q1v1 . . . QnvnG,
where each Qi is ∀ or ∃, the vi are variables, and G contains no quantifiers (G is called the matrix of the formula).
Step 1 Rename the variables, if necessary, so that no variable is both free and bound and so that there is at most one occurrence of a quantifier with any particular variable (we get what is called a rectified formula).
It is easy to see that a given formula may have different prenex forms.
Example 4.11.1:
We wish to obtain the prenex form of the formula (¬∀xP(x, y) ∨ ∀xR(x, y)).
The given formula is equivalent to the rectified formula (¬∀xP(x, y) ∨ ∀zR(z, y)). This formula is equivalent to (∃x¬P(x, y) ∨ ∀zR(z, y)), which is of the form (∃xF ∨ G) with x having no free occurrence in G, and hence is equivalent to ∃x(F ∨ G). So the formula is equivalent to ∃x(¬P(x, y) ∨ ∀zR(z, y)) ≡ ∃x(∀zR(z, y) ∨ ¬P(x, y)) ≡ ∃x∀z(R(z, y) ∨ ¬P(x, y)), which is in prenex
form.
ways of substituting values for the universally quantified variables so that the matrix turns out to be true in each case. For the above case, we first select a 1-place function sign f and a 0-place function sign a. We next replace F by
its functional form,
∀ yP (y, f (y)) ∧ ∀ w ¬ P (w, a)
Here f (y) denotes a choice for the object x corresponding to y such that
P (y, x) holds. For each y, there can be many possible x—we simply say
that there must be at least one choice for x. Similarly a stands for some
fixed object such that ¬ P (w, a) holds, whatever be the value of w. To make
P (y, x) true we can also take y to be the object f (a) itself and take x to be
f (f (a)), and so on.
The Herbrand Universe of F is the set of terms that can be formed from
a and f , namely,
{a, f (a), f (f (a)), . . .}
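With the single constant a and single 1-place sign f, the Herbrand universe is the chain a, f(a), f(f(a)), . . . ; a sketch that enumerates it to a chosen depth (the cut-off is ours, since the universe is infinite):

```python
def herbrand_universe(depth):
    """Terms of {a, f(a), f(f(a)), ...} up to the given nesting depth."""
    terms = ["a"]
    for _ in range(depth):
        terms.append("f(" + terms[-1] + ")")
    return terms

print(herbrand_universe(3))  # ['a', 'f(a)', 'f(f(a))', 'f(f(f(a)))']
```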
follows:
Let us take the formula G = (∀y∃x P(y, x) ∧ ∃z∀w ¬P(w, z)). The corresponding functional form is ∀y P(y, f(y)) ∧ ∀w ¬P(w, a). We have the Herbrand expansion as the set of instances of the matrix obtained by substituting the terms a, f(a), f(f(a)), . . . for the universally quantified variables.
Exercises
p q p↑q p q p↓q
⊥ ⊥ ⊤ ⊥ ⊥ ⊤
⊥ ⊤ ⊤ ⊥ ⊤ ⊥
⊤ ⊥ ⊤ ⊤ ⊥ ⊥
⊤ ⊤ ⊥ ⊤ ⊤ ⊥
3. Show that:
(a) P ∨ (Q ∧ R) ≡ (P ∨ Q) ∧ (P ∨ R)
(b) ¬ (P ∨ Q) ≡ (¬ P ∧ ¬ Q)
(c) (P ∨ Q) ≡ Q, if P is unsatisfiable.
4. Show that the formula (a ∨ b ∨ c) ∧ (c ∨ ¬ a) is equivalent to the
formula c ∨ (b ∧ ¬ a) .
5. Let S be a finite clause set such that |C| ≤ 2 for each C ∈ S. Show that
the resolution technique provides a polynomial-time decision procedure
for determining the satisfiability of S.
8. Show that the definitions 4, 5 and 6 in Definition 4.2.6 are direct con-
sequences of the other definitions.
Chapter 5
Algebraic Structures
5.1 Introduction
5.2 Matrices
      ( a11  a12  . . .  a1n )
      ( a21  a22  . . .  a2n )
A =   ( . . . . . . . . . .  )
      ( am1  am2  . . .  amn )
Multiplication of Matrices
Thus Cij = Ri · Cj′ . Both Ri and Cj′ are vectors of length n. It is well-
known that the matrix product satisfies both the distributive laws and the
associative laws, namely, for matrices A, B and C,
A(B + C) = AB + AC,
(AB)C = A(BC)
Theorem 5.3.1:
For any square matrix A of order n,
ai1 Aj1 + ai2 Aj2 + · · · + ain Ajn = A1i a1j + A2i a2j + · · · + Ani anj = det A if i = j, and 0 if i ≠ j, where Apq denotes the cofactor of apq in A.
Corollary 5.3.2:
Let A be a nonsingular matrix, that is, det A ≠ 0. Set A⁻¹ = (1/det A) adj A.
Then AA⁻¹ = A⁻¹A = In, where n is the order of A.
The matrix A−1 , as defined in Corollary 5.3.2, is called the inverse of the
(nonsingular) matrix A. If A, B are square matrices of the same order with
AB = I, then B = A⁻¹ and A = B⁻¹. This is seen by premultiplying the equation AB = I by A⁻¹ and postmultiplying it by B⁻¹. Note that A⁻¹ and B⁻¹ exist since, taking determinants of both sides of AB = I, we get det A · det B = det I = 1, so that det A ≠ 0 and det B ≠ 0.
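For a 2 × 2 matrix, Corollary 5.3.2 can be verified directly: adj A swaps the diagonal entries and negates the off-diagonal ones. A sketch (pure Python; the helper names are ours):

```python
# Inverse of a 2x2 matrix via the adjugate, as in Corollary 5.3.2.
# For A = [[a, b], [c, d]], adj A = [[d, -b], [-c, a]].
def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    adj = [[d, -b], [-c, a]]
    return [[x / det for x in row] for row in adj]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 3], [-2, 2]]
print(matmul(A, inverse_2x2(A)))  # [[1.0, 0.0], [0.0, 1.0]]
```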
is skew-symmetric if aij = −aji for all i and j. Clearly, symmetric and skew-
symmetric matrices are square matrices. If A = (aij ) is skew-symmetric,
then aii = −aii , and hence aii = 0 for each i. Thus in a skew-symmetric
matrix, all the diagonal entries are zero.
A real matrix (that is, a matrix whose entries are real numbers) P of order
n is called orthogonal if P P t = In . If P P t = In , then P t = P −1 . Thus the
inverse of an orthogonal matrix is its transpose. Further as P −1 P = In , we
also have P^t P = In. If R1, . . . , Rn are the row vectors of P, the relation P P^t = In implies that Ri · Rj = δij, where δij = 1 if i = j, and δij = 0 if i ≠ j. A similar statement also applies to the column vectors of P. As an example, the matrix
(  cos α   sin α )
( −sin α   cos α )
is orthogonal. Indeed, if (x, y) are the cartesian coordinates of a point and (x′, y′) its coordinates referred to axes rotated through an angle α, then
x = x′ cos α + y′ sin α
Exercises 5.2
1. If A = ( 3  −4
           1  −1 ), prove by induction that A^k = ( 1+2k   −4k
                                                      k   1−2k ) for any positive integer k.
2. If M = (  cos α   sin α
            −sin α   cos α ), prove that M^n = (  cos nα   sin nα
                                                 −sin nα   cos nα ), n ∈ N.
3. Compute the transpose, adjoint and inverse of the matrix
   ( 1  −1   0
     0   1  −1
     1   0   1 ).
4. If A = (  1  3
            −2  2 ), show that A² − 3A + 8I = 0. Hence compute A⁻¹.
(i) AB ≠ BA
(ii) (AB)^t ≠ AB
9. Show that every real square matrix can be expressed uniquely as the sum of a symmetric matrix and a skew-symmetric matrix.
10. Show that every complex square matrix can be expressed uniquely as the sum of a Hermitian and a skew-Hermitian matrix.
5.4 Groups
Groups constitute an important basic algebraic structure that occurs very naturally not only in mathematics but also in many other fields such as physics
and chemistry. In this section, we present the basic properties of groups. In
particular, we discuss abelian and nonabelian groups, cyclic groups, permuta-
tion groups and homomorphisms and isomorphisms of groups. We establish
Lagrange’s theorem for finite groups and the basic isomorphism theorem for
groups.
Definition 5.4.1:
A binary operation · on a nonempty set S is a map · : S × S → S; that is, for every ordered pair (a, b) of elements of S, there is associated a unique element a · b of S. A binary system is a pair (S, ·), where S is a nonempty set and · is a binary operation on S. The binary system (S, ·) is associative if · is an associative operation on S, that is, for all a, b, c in S, (a · b) · c = a · (b · c).
Definition 5.4.2:
A semigroup is an associative binary system. An element e of a binary
system (S, ·) is an identity element of S if a · e = e · a = a for all a ∈ S.
Examples
Definition 5.4.3:
A group is a binary system (G, ·) such that the following axioms are satisfied:
(i) the operation · is associative;
(ii) there is an identity element e ∈ G, that is, a · e = e · a = a for all a ∈ G;
(iii) every a ∈ G has an inverse, that is, an element a⁻¹ ∈ G with a · a⁻¹ = a⁻¹ · a = e.
The identity element and inverses are unique. For instance, if b and c are both inverses of a, then
b = b · e = b · (a · c)
  = (b · a) · c    by the associativity of ·
  = e · c
  = c.
Thus henceforth we can talk of “The identity element e” of the group (G, ·),
and “The inverse element a−1 of a” in (G, ·).
If a ∈ G, then a · a ∈ G; also, a · a · · · (n times)∈ G. We denote a · a · · · (n
times) by an . Further, if a, b ∈ G, a · b ∈ G, and (a · b)−1 = b−1 · a−1 . (Check
that (a · b)(a · b)−1 = (a · b)−1 (a · b) = e). More generally, if a1 , a2 , . . . , an ∈ G,
then (a₁ · a₂ · · · aₙ)⁻¹ = aₙ⁻¹ · aₙ₋₁⁻¹ · · · a₁⁻¹, and hence (aⁿ)⁻¹ = (a⁻¹)ⁿ =
(written as) a⁻ⁿ. Then the relation aᵐ⁺ⁿ = aᵐ · aⁿ holds for all integers m
and n, with a0 = e. In what follows, we drop the group operation · in (G, ·),
and simply write group G, unless the operation is explicitly needed.
Lemma 5.4.4:
In a group, both the cancellation laws are valid, that is, if a, b, c are elements
of a group G with ab = ac, then b = c (left cancellation law), and if ba = ca,
then b = c (right cancellation law).
Definition 5.4.5:
The order of a group G is the cardinality of G. The order of an element a of a group G is the least positive integer n such that aⁿ = e, the identity element of G.
If no such n exists, the order of a is taken to be infinity.
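For a finite group the order of an element can be found by repeated multiplication. A sketch for the multiplicative group of integers prime to m (Lemma 5.9.11 shows this is indeed a group); the function name is ours:

```python
def order(a, m):
    """Least positive n with a^n ≡ 1 (mod m); a must be prime to m,
    otherwise the loop never terminates."""
    x, n = a % m, 1
    while x != 1:
        x = (x * a) % m
        n += 1
    return n

# In the multiplicative group mod 7: 3 has order 6 (a generator), 2 has order 3.
print(order(3, 7), order(2, 7))  # 6 3
```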
1. (Z, +) is an abelian group, that is, the set Z of integers is an abelian group
under the usual addition operation. The identity element of this group is 0, and the inverse of a is −a. (Z, +) is often referred to as the additive
group of integers. Similarly, (Q, +), (R, +), (C, +) are all additive abelian
groups.
2. The sets Q∗, R∗ and C∗ of nonzero rationals, reals and complex numbers are groups under the usual multiplication operation.
1. Let G = GL(n, R), the set of all n by n nonsingular matrices with real
entries. Then G is an infinite nonabelian group under multiplication.
3. Let S4 denote the set of all 1-1 maps f : N4 → N4 , where N4 = {1, 2, 3, 4}.
If · denotes composition of maps, then (S4 , ·) is a nonabelian group of order
4! = 24. (See Section*** for more about such groups). For instance, let
f = ( 1  2  3  4
      4  1  2  3 ).
Here the parentheses notation signifies the fact that the image under f of a number in the top row is the corresponding number in the bottom row.
For instance, f (1) = 4, f (2) = 1 and so on. Let
g = ( 1  2  3  4
      3  1  2  4 ).    Then g · f = ( 1  2  3  4
                                      4  3  1  2 ).
Note that (g · f)(1) = g(f(1)) = g(4) = 4, while (f · g)(1) = f(g(1)) = f(3) = 2, and hence f · g ≠ g · f. In other words, S4 is a nonabelian group. The identity element of S4 is the map
I = ( 1  2  3  4
      1  2  3  4 )    and f⁻¹ = ( 1  2  3  4
                                  2  3  4  1 ).
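The computations with f and g above can be replayed mechanically. A sketch representing a permutation as a dict (our encoding), with g · f meaning "apply f first" as in the text:

```python
def compose(g, f):
    """g · f : first apply f, then g."""
    return {x: g[f[x]] for x in f}

f = {1: 4, 2: 1, 3: 2, 4: 3}
g = {1: 3, 2: 1, 3: 2, 4: 4}
print(compose(g, f))  # {1: 4, 2: 3, 3: 1, 4: 2}, i.e. g·f as in the text
print(compose(f, g) == compose(g, f))  # False: S4 is nonabelian
```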
Examples Continued
This gives the group table of the Klein 4-group K4 (Table 5.1):
·   e   a   b   c
e   e   a   b   c
a   a   e   c   b
b   b   c   e   a
c   c   b   a   e
(Figure: the rotation r and the flip f applied to the triangle ABC.)
Thus f r leaves B fixed and flips A and C in △ABC. There are six congruent
transformations of an equilateral triangle and they form a group as per the
following group table.
(r³ = e)   e     r     r²    f     rf    r²f
e          e     r     r²    f     rf    r²f
r          r     r²    e     rf    r²f   f
r²         r²    e     r     r²f   f     rf
f          f     fr    fr²   e     r²    r
rf         rf    f     r²f   r     e     r²
r²f        r²f   rf    f     r²    r     e
Group Table of the Dihedral group D3
(Figure: the effect of the transformations r, r² and f on the equilateral triangle ABC.)
Congruent Transformations of a Square
The congruent transformations of a square form a group in the same way, generated by a quarter-turn r and a flip f with
r⁴ = e = f² = (rf)².
5.7 Subgroups
Definition 5.7.1:
A subset H of a group (G, ·) is a subgroup of (G, ·) if (H, ·) is a group under the operation · of G.
Examples of Subgroups
Definition 5.7.2:
Let S be a nonempty subset of a group G. The subgroup generated by S in
G, denoted by < S >, is the intersection of all subgroups of G containing S.
Proposition 5.7.3:
The intersection of any family of subgroups of G is a subgroup of G.
Corollary 5.7.4:
⟨S⟩ is the smallest subgroup of G containing S.
Definition 5.8.1:
Let G be a group and a an element of G. Then the subgroup generated by a in G is ⟨{a}⟩, that is, the subgroup generated by the singleton subset {a}. It is also denoted simply by ⟨a⟩.
m = qn + r, 0 ≤ r < n.
Then
aᵐ = a^(qn+r) = (aⁿ)^q a^r = e^q a^r = e · a^r = a^r,  0 ≤ r < n,
and hence ⟨a⟩ = {a, a², . . . , aⁿ⁻¹, aⁿ = e}.
2. The group of n-th roots of unity, n ≥ 1. Let G be the set of n-th roots of
unity so that
G = {ω, ω², . . . , ωⁿ = 1},  where ω = cos(2π/n) + i sin(2π/n).
If G = ⟨a⟩ = {aⁿ : n ∈ Z}, then since for any two integers n and m, aⁿaᵐ = aⁿ⁺ᵐ = aᵐaⁿ, G is abelian. In other words, every cyclic group is
abelian. However, the converse is not true. K4 , the Klein’s 4-group (See
Table 5.1 of Section 5.4) is abelian but not cyclic since K4 has no element of
order 4.
Theorem 5.8.2:
Any subgroup of a cyclic group is cyclic.
m = qs + r, 0 ≤ r < s.
Definition 5.8.3:
Let S be any nonempty set. A permutation on S is a bijective mapping from
S to S.
Lemma 5.8.4:
If σ₁ and σ₂ are permutations on S, then the map σ = σ₁σ₂ defined on S by σ(s) = σ₁(σ₂(s)) for all s ∈ S is also a permutation on S.
(Figure: σ = σ₁σ₂ maps s₁ to σ₁(σ₂(s₁)) and s₂ to σ₁(σ₂(s₂)).)
Let B denote the set of all bijections on S. Then it is easy to verify that
(B, ·), where · is the composition map, is a group. The identity element of
this group is the identity function e on S.
Thus the number of permutations on a set of n elements is n · (n − 1) · (n − 2) · · · 2 · 1 = n!.
Example
Definition 5.8.5:
A cycle in Sn is a permutation σ ∈ Sn that can be represented in the form
(a1 , a2 , . . . , ar ), where the ai , 1 ≤ i ≤ r, r ≤ n, are all in S, and σ(ai ) = ai+1 ,
1 ≤ i ≤ r − 1, and σ(ar ) = a1 , that is, each ai is mapped cyclically to the
next element (or number) aᵢ₊₁ and σ fixes the remaining elements of S. For example, if
σ = ( 1  3  2  4
      3  2  1  4 ) ∈ S₄,
then σ can be represented by (132). Here σ leaves 4 fixed.
Now consider the permutation p = ( 1  2  3  4  5  6  7
                                   3  4  2  1  6  5  7 ). Clearly, p is the
product of the cycles
(1324)(56)(7) = (1324)(56).
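The disjoint-cycle decomposition is computed by following each element until its cycle closes. A sketch (same dict encoding as before, ours; dropping fixed points mirrors (1324)(56)(7) = (1324)(56)):

```python
def cycles(p):
    """Decompose a permutation (dict on {1,...,n}) into disjoint cycles."""
    seen, result = set(), []
    for start in sorted(p):
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = p[x]
        if len(cyc) > 1:          # drop fixed points such as (7)
            result.append(tuple(cyc))
    return result

p = {1: 3, 2: 4, 3: 2, 4: 1, 5: 6, 6: 5, 7: 7}
print(cycles(p))  # [(1, 3, 2, 4), (5, 6)]
```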
For example, with t₁ = (13), t₂ = (12) and t₃ = (14), the cycle (1423) = t₁t₂t₃: (t₁t₂t₃)(4) = (t₁t₂)(t₃(4)) = (t₁t₂)(1) = t₁(t₂(1)) = t₁(2) = 2, and so on.
In the same way, any cycle (a1 a2 . . . an ) = (a1 an )(a1 an−1 ) · · · (a1 a2 ), a product
of transpositions. Since any permutation is a product of disjoint cycles and every cycle is a product of transpositions, every permutation is a product of transpositions.
Theorem 5.8.6:
Let σ be any permutation on n symbols. Then in whatever way σ is expressed
as a product of transpositions, the number of transpositions is always odd or
always even.
Proof. Consider the product
P = (a₁ − a₂)(a₁ − a₃) · · · (a₁ − aₙ)
        (a₂ − a₃) · · · (a₂ − aₙ)
        · · ·
        (aₙ₋₁ − aₙ)
  = ∏_{1≤i<j≤n} (aᵢ − aⱼ) = det ( 1      1      · · ·  1
                                  a₁     a₂     · · ·  aₙ
                                  a₁²    a₂²    · · ·  aₙ²
                                  · · ·
                                  a₁ⁿ⁻¹  a₂ⁿ⁻¹  · · ·  aₙⁿ⁻¹ ).
Any transposition (aᵢ aⱼ) applied to the product P changes P to −P, as this amounts to the interchange of the i-th and j-th columns of the above determinant.
Definition 5.8.7:
A permutation is odd or even according to whether it is expressible as the
product of an odd number or even number of transpositions.
Example 5.8.8:
Let σ = ( 1  2  3  4  5  6  7  8  9
          4  5  1  2  3  7  9  8  6 ).
Then σ = (14253)(679)(8)
= (13)(15)(12)(14)(69)(67)
We now establish the most famous basic theorem on finite groups, namely,
Lagrange’s theorem. For this, we need the notion of left and right cosets of
a subgroup.
Definition 5.9.1:
Let H be a subgroup of a group G and let a ∈ G. The set aH = {ah : h ∈ H} is called the left coset of H in G determined by a; similarly, Ha = {ha : h ∈ H} is the right coset of H determined by a.
Lemma 5.9.2:
Any two left cosets of a subgroup H of a group G are equipotent (that is, have the same cardinality). Moreover, they are equipotent to H.
Proof sketch. The map φ : aH → bH defined by φ(ah) = bh is a bijection.
Lemma 5.9.3:
The left coset aH is equal to H iff a ∈ H.
Example 5.9.4:
It is not necessary that aH = Ha for all a ∈ G. For example, consider S₃ and its subgroup H = {e, (12)}. Then (123)H = {(123)e, (123)(12)} = {(123), (13)}, while H(123) = {e(123), (12)(123)} = {(123), (23)},
so that aH ≠ Ha for a = (123).
Proposition 5.9.5:
The left cosets aH and bH are equal iff a−1 b ∈ H.
Proof. If aH = bH, then there exist h₁, h₂ ∈ H such that ah₁ = bh₂, and therefore a⁻¹b = h₁h₂⁻¹ ∈ H. Conversely, if a⁻¹b ∈ H, let a⁻¹b = h ∈ H, so that b = ah and hence bH = (ah)H = a(hH) = aH.
Lemma 5.9.6:
Any two left cosets of the same subgroup of a group are either identical or
disjoint.
Proof. Suppose aH and bH are two left cosets of the subgroup H of a group
G, where a, b ∈ G. If aH and bH are disjoint there is nothing to prove.
Otherwise, aH ∩ bH ≠ ∅, and therefore there exist h₁, h₂ ∈ H with ah₁ = bh₂. Then a⁻¹b = h₁h₂⁻¹ ∈ H, and so, by Proposition 5.9.5,
aH = bH.
Example 5.9.7:
For the subgroup H of Example 5.9.4, we have seen that (123)H = {(123), (13)}. Now (12)H = {(12)e, (12)(12)} = {(12), e} = H, and hence (123)H ∩ (12)H = ∅. Also (23)H = {(23)e, (23)(12)} = {(23), (132)}, and (13)H = {(13)e, (13)(12)} = {(13), (123)} = (123)H.
The last equation holds since (13)⁻¹(123) = (13)(123) = (12) ∈ H (refer to Proposition 5.9.5).
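The cosets in Examples 5.9.4 and 5.9.7 can be recomputed mechanically. A sketch encoding an element of S₃ as the tuple of images of 1, 2, 3 (our encoding):

```python
def comp(p, q):
    """p · q : first apply q, then p."""
    return tuple(p[q[i] - 1] for i in range(3))

e, t12, c123 = (1, 2, 3), (2, 1, 3), (2, 3, 1)   # e, (12), (123)
H = {e, t12}

left = {comp(c123, h) for h in H}    # (123)H = {(123), (13)}
right = {comp(h, c123) for h in H}   # H(123) = {(123), (23)}
print(left == right)  # False, so aH ≠ Ha here
```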
Theorem 5.9.8:
[Lagrange's Theorem, after the French mathematician J. L. Lagrange]
The order of any subgroup of a finite group G divides the order of G.
Definition 5.9.9:
Let H be a subgroup of a group G. Then the number (which may be infinite) of distinct left cosets of H in G is called the index of H in G.
Example 5.9.10:
[An application of Lagrange’s theorem] If p is a prime, and n any positive
integer, then
n | φ(pⁿ − 1).
Lemma 5.9.11:
If m ≥ 2 is a positive integer, and S, the set of positive integers less than m
and prime to it, then S is a multiplicative group modulo m.
every element of H has an inverse modulo m. Therefore (as the other group axioms are trivially satisfied by H), H is a subgroup of order n of S. By Lagrange's theorem, o(H) | o(S), and so n | φ(pⁿ − 1).
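The conclusion n | φ(pⁿ − 1) is easy to spot-check numerically; the totient helper below is a direct count, chosen for clarity rather than efficiency:

```python
from math import gcd

def phi(m):
    """Euler's totient: count of 1 ≤ k < m with gcd(k, m) = 1."""
    return sum(1 for k in range(1, m) if gcd(k, m) == 1)

# Spot-check n | phi(p^n - 1) for a few primes p and exponents n.
for p in (2, 3, 5):
    for n in (2, 3, 4):
        assert phi(p**n - 1) % n == 0
print("n | phi(p^n - 1) verified for these cases")
```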
Theorem 5.9.12:
Any group of prime order is cyclic.
5.10 Homomorphisms and Isomorphisms of Groups
(For instance, the group of n-th roots of unity corresponds to the additive group of integers modulo n under ωⁱ ←→ i.)
Definition 5.10.1:
Let G and G′ be groups (distinct or not). A homomorphism from G to G′ is a map f : G → G′ such that
f(ab) = f(a)f(b) for all a, b ∈ G,
and so on.
Definition 5.10.2:
An isomorphism from a group G to a group G′ is a bijective homomorphism
from G to G′ , that is, it is a map f : G → G′ which is both a bijection and a
group homomorphism.
Examples
1. Let G = (Z, +), and G′ = (nZ, +). (nZ is the set obtained by multiplying all integers by n.) The map f : G → G′ defined by f(m) = mn, m ∈ G, is a group homomorphism from G onto G′.
Proof. (i) Let f (a), f (b) ∈ f (G), where a, b ∈ G. Then f (a)f (b) =
f (ab) ∈ f (G), as ab ∈ G.
(iii) By Property 1, the element f (e) ∈ f (G) acts as the identity ele-
ment of f (G).
Theorem 5.11.1:
Let f : G → G′ be a group homomorphism and K = {a ∈ G : f(a) = e′}, that is, K is the set of all those elements of G that are mapped by f to the identity element e′ of G′. Then K is a subgroup of G.
Definition 5.11.2:
The subgroup K defined in the statement of Theorem 5.11.1 is called the
kernel of the group homomorphism f .
Proof.
⇔ f (ab−1 ) = e′
⇔ ab−1 ∈ K.
= (g · f)(a) · (g · f)(b)
= h(a)h(b).
Definition 5.12.1:
An automorphism of a group G is an isomorphism of G onto itself.
Example 5.12.2:
Let G = {ω⁰ = 1, ω, ω²} be the group of cube roots of unity, where ω = cos(2π/3) + i sin(2π/3). Let f : G → G be defined by f(ω) = ω². To make f a group homomorphism, we have to set f(ω²) = f(ω · ω) = f(ω)f(ω) = ω² · ω² = ω, and f(1) = f(ω³) = (f(ω))³ = (ω²)³ = (ω³)² = 1² = 1. In other words, the homomorphism f : G → G is uniquely defined on G once we set f(ω) = ω².
Clearly, f is onto. Further, only 1 is mapped to 1 by f , while the other two
elements ω and ω 2 are moved by f . Thus Ker f = {1}. So by Property 7, f
is an isomorphism of G onto G, that is, an automorphism of G.
Our next theorem shows that there is a natural way of generating at least
one set of automorphisms of a group.
Theorem 5.12.3:
Let G be a group and a ∈ G. The map fa : G → G defined by fa (x) = axa−1
is an automorphism of G.
fₐ(xy) = a(xy)a⁻¹ = a(xa⁻¹ay)a⁻¹
= (axa−1 )(aya−1 )
= fa (x)fa (y).
Definition 5.12.4:
An automorphism of a group G that is a map of the form fa for some a ∈ G
is called an inner automorphism of G.
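Theorem 5.12.3 can be verified exhaustively for a small group such as S₃. A sketch that checks that every fₐ(x) = axa⁻¹ is a bijective homomorphism (the tuple encoding of permutations is ours):

```python
from itertools import permutations

def comp(p, q):
    return tuple(p[q[i] - 1] for i in range(3))   # apply q first, then p

def inv(p):
    r = [0, 0, 0]
    for i, v in enumerate(p):
        r[v - 1] = i + 1
    return tuple(r)

S3 = list(permutations((1, 2, 3)))
for a in S3:
    f_a = {x: comp(comp(a, x), inv(a)) for x in S3}
    assert sorted(f_a.values()) == sorted(S3)      # f_a is a bijection of S3
    assert all(f_a[comp(x, y)] == comp(f_a[x], f_a[y]) for x in S3 for y in S3)
print("every f_a is an automorphism of S3")
```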
Definition 5.13.1:
A subgroup N of a group G is called a normal subgroup of G (equivalently, N is normal in G) if aNa⁻¹ ⊆ N for every a ∈ G.
Proposition 5.13.2:
The normal subgroups of a group G are those subgroups of G that are left invariant by all the inner automorphisms of G.
The conditions (5.3) and (5.4) give the following equivalent definition of a
normal subgroup.
Definition 5.13.3:
A subgroup N of a group G is normal in G iff aN a−1 = N (equivalently,
aN = N a) for every a ∈ G.
Examples
(23)H(23)⁻¹ = {(23)e(23), (23)(12)(23)}
= {e, (13)} ≠ H,
so the subgroup H = {e, (12)} is not normal in S₃.
Definition 5.13.4:
The centre of a group G consists of those elements of G each of which commutes with all the elements of G. It is denoted by C(G). Thus
C(G) = {g ∈ G : ga = ag for all a ∈ G}.
For example, C(S3 ) = {e}, that is, the centre of S3 is trivial. Also, it is
easy to see that the centre of an abelian group G is G itself. Clearly the
trivial subgroup {e} is normal in G and G is normal in G. (Recall that
aG = G for each a ∈ G).
Proposition 5.13.5:
The centre C(G) of a group G is a normal subgroup of G.
Proof. For any a ∈ G,
aC(G)a⁻¹ = {aga⁻¹ : g ∈ C(G)}
         = {(ag)a⁻¹ : g ∈ C(G)}
         = {(ga)a⁻¹ : g ∈ C(G)}    (as each g ∈ C(G) commutes with a)
         = {g(aa⁻¹) : g ∈ C(G)}
         = {g : g ∈ C(G)} = C(G).
Theorem 5.13.6:
Let f : G → G′ be a group homomorphism. Then H = Ker f is a normal subgroup of G.
aH = a1 H, and bH = b1 H (5.5)
and so (ab)⁻¹(a₁b₁) = b⁻¹(a⁻¹a₁)b₁ = b⁻¹hb₁, where h = a⁻¹a₁ ∈ H.    (5.7)
Now we apply Property ???. Thus the product of two (left) cosets of H
in G is itself a left coset of H. Further for a, b, c ∈ G,
Thus the binary operation defined in G/H satisfies the associative law.
Further eH = H acts as the identity element of G/H as
(aH)(a−1 H) = (aa−1 )H = eH = H,
and for a similar reason (a⁻¹H)(aH) = H. Thus G/H is a group under this binary operation. G/H is called the quotient group or factor group of G modulo H.
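The most familiar concrete quotient group is Z/nZ, the cosets of nZ in (Z, +); since the group is abelian the coset product becomes coset addition. A sketch (the class is our own illustration):

```python
class Zmod:
    """The quotient group Z/nZ: cosets a + nZ, represented by 0, ..., n-1."""
    def __init__(self, n):
        self.n = n
        self.elements = list(range(n))   # one representative per coset

    def add(self, a, b):
        """(a + nZ) + (b + nZ) = (a + b) + nZ."""
        return (a + b) % self.n

Z6 = Zmod(6)
print(Z6.add(4, 5))  # 3, since (4 + 6Z) + (5 + 6Z) = 9 + 6Z = 3 + 6Z
```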
Example 5.14.1:
We now present an example of a quotient group. Let G = (R2 , +), the
additive group of points of the plane R2 . (If (x1 , y1 ) and (x2 , y2 ) are two
points of R², their sum (x₁, y₁) + (x₂, y₂) is defined as (x₁ + x₂, y₁ + y₂). The identity element of this group is (0, 0) and the inverse of (x, y) is (−x, −y).)
Let H be the subgroup {(x, 0) : x ∈ R}, the X-axis. If (a, b) is any point of R², then the coset (a, b) + H = {(a + x, b) : x ∈ R} is the line through (a, b) parallel to the X-axis.
(Figure 5.15: the cosets of H are the lines parallel to the X-axis.)
= φ(g1 K)φ(g2 K)
Let us see what this isomorphism means with regard to the factor group R²/H given in Example 5.14.1. Define f : R² → R by f(a, b) = (0, b), the projection of the point (a, b) ∈ R² on the Y-axis = R. The identity element of the image group is the origin (0, 0). Clearly K is the set of all points (a, b) ∈ R² that are mapped to (0, 0), that is, the set of those points of R² whose projections on the Y-axis coincide with the origin. Thus K is the X-axis (= R). Now φ : G/K = R²/R → G′ is defined by φ((a, b) + K) = f(a, b) = (0, b). This means that all points of
the line through (a, b) parallel to the X-axis are mapped to their common
projection on the Y-axis, namely, the point (0, b). Thus the isomorphism
between G/K and G′ is obtained by mapping each line parallel to the
X-axis to the point where the line meets the Y-axis.
5.16 Exercises
2. Let G denote the set of all real matrices of the form ( a  b
                                                           0  1 ) with a ≠ 0. Show that G is a group under matrix multiplication.
(i). (Q, ·)
(ii). (R∗ , ·)
(iii). (Q, +)
(iv). (R∗ , ·)
8. Prove that any group of even order has an element of order 2. (Hint: for a ≠ e, o(a) ≠ 2 iff a ≠ a⁻¹; pair off such elements as (a, a⁻¹).)
10. Show that no group can be the set union of two of its proper subgroups.
21. Show that any infinite cyclic group is isomorphic to (Z, +).
22. Show that the set {e^{in} : n ∈ Z} forms a multiplicative group. Show that this is isomorphic to (Z, +). Is this group cyclic?
24. Give an example of a group that is isomorphic to one of its proper sub-
groups.
25. Prove that (Z, +) is not isomorphic to (Q, +). (Hint: Suppose ∃ an iso-
morphism φ : Z → Q. Let φ(5) = a ∈ Q. Then ∃ b ∈ Q with 2b = a. Let
x ∈ Z be the preimage of b. Then 2x = 5 in Z, which is not true.)
26. Prove that the multiplicative groups R∗ and C∗ are not isomorphic.
28. Give the group table of the group S3 . From the table, find the centre of
S3 .
31. Let G be a group. Let [G, G] denote the subgroup of G generated by all
elements of G of the form aba−1b−1 (called the commutator of a and b)
for all pairs of elements a, b ∈ G. Show that [G, G] is a normal subgroup of
G. [Hint: For c ∈ G, we have c(aba−1b−1)c−1 = (cac−1)(cbc−1)(cac−1)−1(cbc−1)−1 ∈
[G, G]. Now apply Exercise 29.]
33. Let G be the set of all roots of unity, that is, G = {ω ∈ C : ω n = 1 for
some n ∈ N}. Prove that G is an abelian group that is not cyclic.
35. If H is the only subgroup of a given finite order in a group G, show that
H is normal in G.
38. Prove that the subgroup {e, (123), (132)} is a normal subgroup of S3 .
5.17 Rings
Definition 5.17.1:
A ring is a set A with two binary operations, denoted by + and · (called
addition and multiplication respectively) satisfying the following axioms:
R1 : (A, +) is an abelian group.
R2 : Multiplication is associative: for all a, b, c ∈ A, (ab)c = a(bc).
R3 : For all a, b, c ∈ A, a(b + c) = ab + ac and (a + b)c = ac + bc (the
distributive laws).
Examples of Rings
1. A = Z, the set of all integers with the usual addition + and the usual
multiplication taken as ·.
2. A = 2Z, the set of even integers with the usual addition and multipli-
cation.
Definition 5.17.2:
A ring A is called commutative if for all a, b ∈ A, ab = ba.
Definition 5.17.3:
An element e of a ring A is called a unity element of A if ea = ae = a for all
a ∈ A.
A unity element of A, if it exists, must be unique. For, if e and f are
unity elements of A, then,
ef = f as e is a unity element of A, and ef = e as f is a unity element
of A. Hence e = f.
Proposition 5.17.4:
If a is a unit in a ring A with unity element e, and if ab = ca = e, then b = c.
Proposition 5.17.5:
The units of a ring A (with identity element) form a group under multipli-
cation.
Proof. Exercise.
Let a be a unit in the ring Zn . (See Example 5.17.1 above). Then there
exists an x ∈ Zn such that ax = 1 in Zn , or equivalently, ax ≡ 1( mod n).
But this implies that ax − 1 = bn for some integer b. Hence (a, n) = the
gcd of a and n = 1. (Because if an integer c > 1 divides both a and n, then
it should divide 1). Conversely, if (a, n) = 1, by Euclidean algorithm (see
Section ??), there exist integers x and y with ax + ny = 1, and therefore
ax ≡ 1( mod n). This however means that a is a unit in Zn . Thus the set
U of units of Zn consists precisely of those integers in Zn , that are relatively
prime to n. By Definition 3.7.2, |U | is φ(n), where φ is the Euler function.
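The criterion just proved, that a ∈ Zn is a unit iff (a, n) = 1, is easy to check by machine. A minimal Python sketch (the helper names `units` and `inverse` are ours; `pow(a, -1, n)` computes a modular inverse in Python 3.8+):

```python
from math import gcd

def units(n):
    """Elements of Z_n that are relatively prime to n, i.e. the units."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

def inverse(a, n):
    """Multiplicative inverse of the unit a in Z_n (extended Euclid,
    done internally by Python's three-argument pow)."""
    return pow(a, -1, n)

U = units(12)                      # the units of Z_12
assert U == [1, 5, 7, 11]          # phi(12) = 4 of them
assert all(a * inverse(a, 12) % 12 == 1 for a in U)
```

The length of `units(n)` is exactly the Euler function φ(n), as noted above.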
Zero Divisors
Definition 5.17.6:
A left zero divisor in a ring A is a non-zero element a of A such that there
exists a non-zero element b of A with ab = 0 in A. a ∈ A is a right zero
divisor in A if ca = 0 for some c ∈ A, c ≠ 0.
Examples
In the ring of 2 by 2 real matrices, [ 0 0 ; 0 1 ] [ 1 1 ; 0 0 ] = [ 0 0 ; 0 0 ]
(rows separated by semicolons), so [ 0 0 ; 0 1 ] is a left zero divisor and
[ 1 1 ; 0 0 ] is a right zero divisor.
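Reading the display above as the matrix product [0 0; 0 1][1 1; 0 0] = [0 0; 0 0], the claim is a one-line computation. A small Python sketch (the helper `matmul` is ours):

```python
# 2x2 matrices over the integers, represented as tuples of rows
def matmul(A, B):
    """Ordinary matrix product of two 2x2 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

A = ((0, 0), (0, 1))
B = ((1, 1), (0, 0))
Z = ((0, 0), (0, 0))
# A and B are non-zero, yet their product is the zero matrix
assert A != Z and B != Z and matmul(A, B) == Z
```

Note that the product in the other order, BA, is not zero; left and right zero divisors are genuinely different notions in a non-commutative ring.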
Theorem 5.17.7:
The following statements are true for any ring A.
Proof. Exercise.
Definition 5.18.1:
An integral domain A is a commutative ring with unity element having no
divisors of zero.
Examples
5.19 Exercises
5. Let A be a ring, and a, b ∈ A. Then show that for any positive integer
n,
n(ab) = (na)b = a(nb).
(i) Z is a subring of Q.
(ii) Q is a subring of R.
(iii) R is a subring of C.
8. Prove that any ring A with identity element and cardinality p, where
p is a prime, is commutative. (Hint: Verify that the elements 1, 1 +
1, . . . , 1 + 1 + · · · + 1 (p times) are all distinct elements of A).
5.20 Fields
If rings are algebraic abstractions of the set of integers, fields are algebraic
abstractions of the sets Q, R and C (as mentioned already).
Definition 5.20.1:
A field is a commutative ring with unity element in which every non-zero
element is a unit.
Every field F is an integral domain. To see this, all that we have to verify
is that F has no zero divisors. Indeed, if ab = 0 and a ≠ 0, then a−1 exists in F
and so we have 0 = a−1 (ab) = (a−1 a)b = b in F. However, not every integral
domain is a field. For instance, the ring Z of integers is an integral domain
but not a field. (Recall that the only non-zero integers which are units are 1
and −1.)
Definition 5.21.1:
A field F is called finite if |F |, the cardinality of F , is finite; otherwise, F is
an infinite field.
Let F be a field whose zero and unity elements are denoted by 0F and
1F respectively. A subfield of F is a subset F ′ of F such that F ′ is also
a field with the same addition and multiplication operations of F . This of
course means that the zero and unity elements of F ′ are the same as those
of F . It is clear that the intersection of any family of subfields of F is again
a subfield of F . Let P denote the intersection of the family of all subfields
of F. Naturally, the subfield P is the smallest subfield of F. For if
P′ is a subfield of F that is properly contained in P, then, P being the
intersection of all subfields of F, P ⊆ P′ ⊊ P, a
contradiction. This smallest subfield P of F is called the prime field of F.
Necessarily, 0F ∈ P and 1F ∈ P .
As 1F ∈ P , the elements 1F , 1F + 1F = 2 · 1F , 1F + 1F + 1F = 3 · 1F and, in
general, n · 1F , n ∈ N, all belong to P . There are then two cases to consider:
Case 1: The elements n·1F , n ∈ N, are all distinct. In this case, the subfield
P itself is an infinite field and therefore F is an infinite field.
Case 2: The elements n · 1F , n ∈ N, are not all distinct. In this case,
there exist r, s ∈ N with r > s such that r · 1F = s · 1F , and therefore,
(r − s) · 1F = 0, where r − s is a positive integer. Hence there exists a least
positive integer p such that p · 1F = 0. We claim that p is a prime number.
If not, p = p1 p2, where p1 and p2 are positive integers less than p. Then
0 = p · 1F = (p1 p2) · 1F = (p1 · 1F)(p2 · 1F) gives, as F is a field, either
p1 · 1F = 0 or p2 · 1F = 0, contradicting the choice of p as the least positive
integer with p · 1F = 0. Hence p is a prime.
Definition 5.21.2:
The characteristic of a field F is the least positive integer p such that p·1F = 0
if such a p exists; otherwise, F is said to be of characteristic zero.
A field of characteristic zero is necessarily infinite (as its prime field al-
ready is). A finite field is necessarily of prime characteristic. However, there
are infinite fields with prime characteristic. Note that if a field F has char-
acteristic p, then px = 0 for each x ∈ F .
Examples
(iii) For a field F , denote by F [X] the set of all polynomials in X over F ,
that is, polynomials whose coefficients are in F . F [X] is an integral
domain and the group of units of F [X] = F ∗ .
(iv) The field Zp (X) of rational functions of the form a(X)/b(X), where a(X)
and b(X) are polynomials in X over Zp and b(X) ≠ 0, is an infinite
field of (finite) characteristic p.
Theorem 5.21.3:
Let F be a field of (prime) characteristic p. Then for all x, y ∈ F and all n ∈ N,
(x ± y)^{p^n} = x^{p^n} ± y^{p^n}, and (xy)^{p^n} = x^{p^n} y^{p^n}.
So assume that
(x + y)^{p^n} = x^{p^n} + y^{p^n}.
Then
(x + y)^{p^{n+1}} = ((x + y)^{p^n})^p
= (x^{p^n} + y^{p^n})^p (by the induction assumption)
= (x^{p^n})^p + (y^{p^n})^p (by (5.1))
= x^{p^{n+1}} + y^{p^{n+1}}.   (5.2)
Next we consider (x − y)^{p^n}. If p = 2, then −y = y and so the result is valid.
If p is an odd prime, change y to −y in (5.2). This gives
(x − y)^{p^n} = x^{p^n} + (−y)^{p^n}
= x^{p^n} + (−1)^{p^n} y^{p^n}
= x^{p^n} − y^{p^n},
since (−1)^{p^n} = −1.
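Because every a ∈ Zp satisfies a^{p^n} = a, Theorem 5.21.3 can be spot-checked exhaustively in the prime field itself. A Python sketch (our choice of p = 5 and n = 2; `pow(x, q, p)` is x^q reduced mod p):

```python
p = 5          # any prime; all arithmetic below is in the field Z_p
n = 2
q = p ** n     # the exponent p^n of the theorem

for x in range(p):
    for y in range(p):
        # (x +/- y)^{p^n} = x^{p^n} +/- y^{p^n} and (xy)^{p^n} = x^{p^n} y^{p^n}
        assert pow(x + y, q, p) == (pow(x, q, p) + pow(y, q, p)) % p
        assert pow(x - y, q, p) == (pow(x, q, p) - pow(y, q, p)) % p
        assert pow(x * y, q, p) == (pow(x, q, p) * pow(y, q, p)) % p
```

The check passes for every pair (x, y) because the "cross terms" of the binomial expansion all carry binomial coefficients divisible by p.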
Definition 5.22.1:
A vector space (or linear space) V over a field F is a nonvoid set V whose
elements satisfy the following axioms:
(B) For every pair of elements α and v, where α ∈ F and v ∈ V , there exists
an element αv ∈ V called the product of v by α such that
d^n y/dx^n + C1 d^{n−1}y/dx^{n−1} + · · · + Cn−1 y = 0.   (1)
Clearly, if y1 (x) and y2 (x) are two solutions of the differential equa-
tion (1), then so is y(x) = α1 y1 (x) + α2 y2 (x), α1 , α2 ∈ R. It is now easy
to verify that the axioms of a vector space are satisfied.
5.23 Subspaces
Definition 5.23.1:
A subspace W of a vector space V over F is a subset W of V such that W is
also a vector space over F with addition and scalar multiplication as defined
for V .
Proposition 5.23.2:
A non-void subset W of a vector space V is a subspace of V iff for all u, v ∈ W
and α, β ∈ F ,
αu + βv ∈ W
An example of a subspace
Proposition 5.23.3:
If W1 and W2 are subspaces of a vector space V , then W1 ∩ W2 is also a
subspace of V . More generally, the intersection of any family of subspaces of
a vector space V is also a subspace of V .
Definition 5.24.1:
Let S be a subset of a vector space V over F . By the subspace spanned by
S, denoted by < S >, we mean the smallest subspace of V that contains S.
If < S >= V , we call S a spanning set of V .
Example 5.24.2:
We shall determine the smallest subspace W of R3 containing the vectors (1, 2, 1)
and (2, 3, 4).
Clearly, W must contain the subspace spanned by (1, 2, 1), that is, the line
joining the origin (0, 0, 0) and (1, 2, 1). Similarly, W must also contain the
line joining (0, 0, 0) and (2, 3, 4). These two distinct lines meet at the origin
and hence define a unique plane through the origin, and this is the subspace
spanned by the two vectors (1, 2, 1) and (2, 3, 4). (See Proposition 5.24.3
below.)
Proposition 5.24.3:
Let S be a subset of a vector space V over F . Then < S >= L(S), where
L(S) = {α1 s1 + α2 s2 + · · · + αr sr : si ∈ S, 1 ≤ i ≤ r, and αi ∈ F, 1 ≤ i ≤
r, r ∈ N} = set of all finite linear combinations of vectors of S over F.
u = α1 s1 + · · · + αr sr , and
v = β1 s′1 + · · · + βt s′t
Proposition 5.24.4:
Let u1 , . . . , un and v be vectors of a vector space V . Suppose that v ∈ <
u1 , u2 , . . . , un >. Then < u1 , . . . , un > = < u1 , . . . , un ; v >.
Conversely, if
w = α1 u1 + · · · + αn un + βv ∈< u1 , . . . , un ; v >,
w = (α1 u1 + · · · + αn un ) + β(γ1 u1 + · · · + γn un )
= Σ_{i=1}^{n} (αi + βγi ) ui ∈ < u1 , . . . , un >.
Corollary 5.24.5:
If S is any nonempty subset of a vector space V , and v ∈< S >, then
< S ∪ {v} > = < S >.
0 · v1 + 0 · v2 + · · · + 0 · vn = 0
In this case we also say that the vectors v1 , . . . , vn are linearly indepen-
dent over F . In the above equation, the zero on the right refers to the
zero vector of V while the zeros on the left refer to the scalar zero, that
is, the zero element of F .
α1 v1 + · · · + αn vn = 0.
Remark 5.25.2: (i) The zero vector of V forms a linearly dependent set
since it satisfies the nontrivial equation 1 · 0 = 0, where 1 ∈ F and
0∈V.
(ii) Two vectors of V are linearly dependent over F iff one of them is a scalar
multiple of the other.
Proposition 5.25.3:
Any subset T of a linearly independent set S of a vector space is linearly
independent.
α1 v1 + · · · + αr vr = 0, αi ∈ F
(α1 v1 + · · · + αr vr ) + (0 · vr+1 + · · · + 0 · vn ) = 0.
Corollary 5.25.4:
If v ∈ L(S), then S ∪ {v} is linearly dependent.
Examples
α·1+β·i=0
and giving
2u − 3v − w = 0.
λ1 X^{i1} + λ2 X^{i2} + · · · + λn X^{in} = 0   (A)
Definition 5.26.1:
A basis (or base) of a vector space V over a field F is a subset B of V such
that
u = α1 u1 + α2 u2 + 0 · u′1 + 0 · u′2
= 0 · u1 + 0 · u2 + αu′1 + αu′2 .
Example 5.26.2:
The vectors e1 = (1, 0, 0), e2 = (0, 1, 0) and e3 = (0, 0, 1) form a basis for R3 .
This follows from the following two facts.
and hence αi = 0, 1 ≤ i ≤ 3.
Definition 5.27.1:
By a finite-dimensional vector space, we mean a vector space that can be
generated (or spanned) by a finite number of vectors in it.
Lemma 5.27.2:
No finite-dimensional vector space can have an infinite basis.
Lemma 5.27.3:
A finite sequence {v1 , . . . , vn } of non-zero vectors of a vector space V is
linearly dependent iff for some k, 2 ≤ k ≤ n, vk is a linear combination of its
preceding vectors.
Proof. In one direction, the proof is trivial: if vk ∈ < v1 , . . . , vk−1 >, then
by Corollary 5.25.4, {v1 , . . . , vk−1 , vk } is linearly dependent, and so is its
superset {v1 , . . . , vn } (by Proposition 5.25.3).
Conversely, assume that {v1 , v2 , . . . , vn } is linearly dependent. As v1 is
a non-zero vector, {v1 } is linearly independent (See (iii) of Remark 5.25.2).
Hence there must exist a k, 2 ≤ k ≤ n, such that {v1 , . . . , vk−1 } is linearly
independent while {v1 , . . . , vk } is linearly dependent (such a k exists since
at worst k = n). Hence there exists a set of scalars α1 , . . . , αk , not all zero, such that
α1 v1 + · · · + αk vk = 0.
Lemma 5.27.3 implies, by Proposition 5.24.4, that under the stated con-
ditions on vk ,
< v1 , . . . , vk , . . . , vn > = < v1 , . . . , v̂k , . . . , vn > = < v1 , . . . , vk−1 , vk+1 , . . . , vn >,
where the symbol ∧ over vk indicates that the vector vk should be deleted.
We next prove a very important property of finite-dimensional vector
spaces.
Theorem 5.27.4:
Any finite-dimensional vector space has a basis. Moreover, any two bases of
a finite-dimensional vector space have the same number of elements.
∧ ∧
= {v1 , v2 ; u1 , . . . , ui1 , . . . ui2 , . . . , um }/ ui1 , ui2 ,
< S3 > = V.
Note that we have actually shown that any finite spanning subset of a
finite-dimensional vector space V does indeed contain a finite basis of V .
Theorem 5.27.4 makes the following definition unambiguous.
Definition 5.27.5:
The dimension of a finite-dimensional vector space is the number of elements
in any one of its bases.
Examples
2. Cn is of dimension n over C.
S = {e1 , . . . , en ; f1 , . . . fn }
4. Let Pn (X) denote the set of polynomials in X with real coefficients
of degree not exceeding n. Then B = {1, X, X 2 , . . . , X n } is a basis for
Pn (X). Hence dimR Pn (X) = n + 1.
Proposition 5.27.6:
Any maximal linearly independent subset of a finite-dimensional vector space
V is a basis for V .
5.28 Exercises
2. If n ∈ N, show that the set of all real polynomials of degree n does not
form a vector space over R (under usual addition and scalar multipli-
cation of polynomials).
4. Show that the dimension of the vector space of all m by n real matrices
over R is mn. [Hint: For m = 2, n = 3, the matrices
[ 1 0 0 ; 0 0 0 ], [ 0 1 0 ; 0 0 0 ], [ 0 0 1 ; 0 0 0 ],
[ 0 0 0 ; 1 0 0 ], [ 0 0 0 ; 0 1 0 ], [ 0 0 0 ; 0 0 1 ]
(rows separated by semicolons)
form a basis for the space of all 2 by 3 real matrices. Verify this first].
(iv) In a vector space, any two generating (that is, spanning) subsets
are disjoint.
5.29 Rank of a Matrix
Let
A = [ a11 a12 . . . a1n
      a21 a22 . . . a2n
      . . .
      am1 am2 . . . amn ]
be an m by n matrix over a field F . To be precise, we take F = R, the field
of real numbers. Let R1 , R2 , . . . , Rm be the row vectors and C1 , C2 , . . . , Cn
the column vectors of A. Then each Ri ∈ Rn and each Cj ∈ Rm . The row
space of A is the subspace < R1 , . . . , Rm > of Rn , and its dimension is the
row rank of A. Clearly, (row rank of A) ≤ m since any m vectors of a vector
space span a subspace of dimension at most m. The column space of A and
the column rank of A (≤ n) are defined in an analogous manner.
We now consider three elementary row transformations (or operations)
defined on the row vectors of A:
(iii) Ri + cRj —addition to the i-th row of A, c times the j-th row of A, c
being a scalar.
(i) The leading non-zero entry of any non-zero row (if any) of A∗ is 1.
(ii) The leading 1’s in the non-zero rows of A∗ occur in increasing order of
their columns.
Now let D be a square matrix of order n. The three elementary row (respec-
tively column) operations considered above do not change the singular or
nonsingular nature of D. In other words, if D∗ is a row-reduced echelon form
of D, then D is singular iff D∗ is singular. In particular, D is nonsingular iff
D∗ = In , the identity matrix of order n. Hence if a row-reduced echelon form
A∗ of a matrix A has r non-zero rows, the maximum order of a nonsingular
square submatrix of A is r. This number is called the rank of A.
Definition 5.29.1:
The rank of a matrix A is the maximum order of a nonsingular square sub-
matrix of A. Equivalently, it is the maximum order of a nonvanishing deter-
minant minor of A.
Example 5.29.2:
Find the row-reduced echelon form of
A = [ 1 2 3 −1
      2 1 −1 4
      3 3 2 3
      6 6 4 6 ].
As the leading entry of R1 is 1, we perform the operations R2 −2R1 ; R3 −3R1 ;
R4 − 6R1 (where Ri stands for the i-th row of A). This gives
A1 = [ 1 2 3 −1
       0 −3 −7 6
       0 −3 −7 6
       0 −6 −14 12 ].
Next perform −(1/3)R2 (that is, replace R2 by −(1/3)R2 ). This gives
A′1 = [ 1 2 3 −1
       0 1 7/3 −2
       0 −3 −7 6
       0 −6 −14 12 ].
Now perform R1 − 2R2 (that is, replace R1 by R1 − 2R2 etc.); R3 + 3R2 ;
R4 + 6R2 . This gives the matrix
A2 = [ 1 0 −5/3 3
       0 1 7/3 −2
       0 0 0 0
       0 0 0 0 ].
A2 is the row-reduced echelon form of A. Note that A2 is uniquely determined
by A. Since the maximum order of a non-singular submatrix of A2 is 2, rank
of A = 2. Moreover, row space of A2 =< R1 , R2 (of A2 ) >. Clearly R1 and
R2 are linearly independent over R since for α1 , α2 ∈ R, α1 R1 + α2 R2 =
(α1 , α2 , −5α1 /3 + 7α2 /3, 3α1 − 2α2 ) = 0 = (0, 0, 0, 0) implies that α1 = 0 =
α2 . Thus the row rank of A is 2 and therefore the column rank of A is also
2.
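The reduction above is mechanical, so it can be reproduced with exact rational arithmetic. A Python sketch (the function `rref` is ours, implementing the three elementary row operations; `Fraction` avoids floating-point error when testing entries for zero):

```python
from fractions import Fraction

def rref(rows):
    """Row-reduced echelon form via the three elementary row operations."""
    M = [[Fraction(x) for x in row] for row in rows]
    lead = 0
    for r in range(len(M)):
        if lead >= len(M[0]):
            break
        i = r
        while M[i][lead] == 0:           # find a row with a non-zero entry
            i += 1
            if i == len(M):
                i, lead = r, lead + 1
                if lead == len(M[0]):
                    return M
        M[i], M[r] = M[r], M[i]          # row interchange
        M[r] = [x / M[r][lead] for x in M[r]]   # scale leading entry to 1
        for i in range(len(M)):
            if i != r:                   # clear the rest of the column
                M[i] = [a - M[i][lead] * b for a, b in zip(M[i], M[r])]
        lead += 1
    return M

A = [[1, 2, 3, -1], [2, 1, -1, 4], [3, 3, 2, 3], [6, 6, 4, 6]]
R = rref(A)
rank = sum(1 for row in R if any(x != 0 for x in row))
assert rank == 2
assert R[0] == [1, 0, Fraction(-5, 3), 3]
assert R[1] == [0, 1, Fraction(7, 3), -2]
```

The two non-zero rows recovered are exactly the rows of A2 above, and the count of non-zero rows is the rank.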
Remark 5.29.3:
Since the last three rows of A1 are proportional (that is, each is a scalar
multiple of the others), the conclusion that the rank of A is 2 can already
be read off at the stage A1.
X1 + 2X2 + 3X3 − X4 = 0
2X1 + X2 − X3 + 4X4 = 0
3X1 + 3X2 + 2X3 + 3X4 = 0
6X1 + 6X2 + 4X3 + 6X4 = 0   (5.3)
can be written as AX = 0,   (5.4)
where
A = [ 1 2 3 −1
      2 1 −1 4
      3 3 2 3
      6 6 4 6 ],  X = (X1, X2, X3, X4)^t, and 0 = (0, 0, 0, 0)^t.
If X′ and X′′ are any two solutions of (5.4), then so is aX′ + bX′′ for scalars
a and b, since A(aX′ + bX′′) = a(AX′) + b(AX′′) = a · 0 + b · 0 = 0. Thus the
set of solutions of (5.4) is (as X ∈ Rn ) a vector subspace of Rn, where n =
the number of indeterminates in the equations (5.3).
It is clear that the three elementary row operations performed on a system
of homogeneous linear equations do not alter the set of solutions of the
system.
Theorem 5.30.1:
The solution space of a system of homogeneous linear equations is of dimen-
sion n−r, where n is the number of unknowns and r is the rank of the matrix
A of coefficients.
Consider now a system of non-homogeneous linear equations
AX = B;
for example, the equations
X1 − X2 + X3 = 2
X1 + X2 − X3 = 0
3X1 = 6.
From the last equation, we get X1 = 2. This, when substituted in the first
two equations, yields −X2 + X3 = 0, X2 − X3 = −2 which are mutually
contradictory. Such equations are called inconsistent equations.
When are the equations represented by AX = B consistent?
Theorem 5.31.1:
The equations AX = B are consistent if and only if B belongs to the column
space of A.
Proof. The equations are consistent iff there exists a vector X0 = (α1, . . . , αn)^t such
that AX0 = B. But this happens iff α1 C1 + · · · + αn Cn = B, where C1 , . . . , Cn
are the column vectors of A, that is, iff B belongs to the column space of
A.
Corollary 5.31.2:
The equations represented by AX = B are consistent iff rank of A = rank
of (A, B). [(A, B) denotes the matrix obtained from A by adding one more
column vector B at the end. It is called the matrix augmented by B].
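Corollary 5.31.2 applied to the inconsistent system above (rank of A = 2, rank of (A, B) = 3) can be sketched as follows in Python (the function `rank`, plain Gaussian elimination over the rationals, is ours):

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue                     # no pivot in this column
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# the system X1 - X2 + X3 = 2, X1 + X2 - X3 = 0, 3X1 = 6
A = [[1, -1, 1], [1, 1, -1], [3, 0, 0]]
B = [2, 0, 6]
AB = [row + [b] for row, b in zip(A, B)]     # the augmented matrix (A, B)
assert rank(A) == 2 and rank(AB) == 3        # ranks differ: inconsistent
```

When the two ranks agree, B lies in the column space of A and the system is consistent, exactly as Theorem 5.31.1 states.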
Theorem 5.31.3:
Let X0 be any particular solution of the equation AX = B. Then, the set of
all solutions of AX = B is given by {X0 + U }, where U varies over the set
of solutions of the auxiliary equation AX = 0.
Definition 5.32.1:
By an LUP decomposition of a square matrix A we mean an equation of the
form
P A = LU,   (5.5)
where P is a permutation matrix, L is a unit lower-triangular matrix, and U
is an upper-triangular matrix. To solve AX = b, one may then instead solve
LU X = b′,   (5.6)
where b′ = P b.
= LU,   (5.7)
where L = [ 1 0 ; v/a11 L′ ] and U = [ a11 w^t ; 0 U′ ] (rows separated by
semicolons). The validity of the two middle
equations on the right of (5.7) can be verified by routine block multiplication
of matrices (See ???). This method is based on the supposition that a11
and all the leading entries of the successive Schur complements are all non-
zero. If a11 is zero, we interchange the first row of A with a subsequent row
having a non-zero first entry. This amounts to premultiplying both sides
by the corresponding permutation matrix P yielding the matrix P A on the
left. We now proceed as in the case when a11 ≠ 0. If a leading entry of
a subsequent Schur complement is zero, once again we make interchanges
of rows—not just the rows of the relevant Schur complement but the full
rows got from A. This again amounts to premultiplication by a permutation
matrix. Since any product of permutation matrices is a permutation matrix,
this process finally ends up with a matrix P ′ A, where P ′ is a permutation
matrix of order n.
We now present two examples, one to obtain the LU decomposition when
it is possible and another to determine the LUP decomposition.
Example 5.32.2:
Find the LU decomposition of
A = [ 2 3 1 2
      4 7 4 7
      2 7 13 16
      6 10 13 15 ].
Here a11 = 2, v = (4, 2, 6)^t, and w^t = (3, 1, 2).
Therefore v/a11 = (2, 1, 3)^t, and so vw^t/a11 = [ 6 2 4 ; 3 1 2 ; 9 3 6 ]
(rows separated by semicolons), where w^t denotes the transpose of w.
Hence the Schur complement of A is
A1 = [ 7 4 7 ; 7 13 16 ; 10 13 15 ] − [ 6 2 4 ; 3 1 2 ; 9 3 6 ] = [ 1 2 3 ; 4 12 14 ; 1 10 9 ].
The Schur complement of A1 is
A2 = [ 12 14 ; 10 9 ] − [ 4 ; 1 ] (2, 3) = [ 12 14 ; 10 9 ] − [ 8 12 ; 2 3 ] = [ 4 2 ; 8 6 ],
and the Schur complement of A2 is [ 6 ] − (8/4)(2) = [ 2 ] = (1)(2) = L3 U3.
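The successive Schur complements computed above are exactly the intermediate results of Doolittle-style elimination. A Python sketch (the function `lu` is ours; it assumes, as in this example, that every pivot is non-zero, so no permutation P is needed):

```python
from fractions import Fraction

def lu(A):
    """Doolittle LU factorization (no pivoting): A = L U with a unit
    lower-triangular L, assuming every leading pivot is non-zero."""
    n = len(A)
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    U = [[Fraction(x) for x in row] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]                      # multiplier
            U[i] = [a - L[i][k] * b for a, b in zip(U[i], U[k])]
    return L, U

A = [[2, 3, 1, 2], [4, 7, 4, 7], [2, 7, 13, 16], [6, 10, 13, 15]]
L, U = lu(A)
# check the factorization A = L U
prod = [[sum(L[i][k] * U[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)]
assert prod == [[Fraction(x) for x in row] for row in A]
assert all(U[i][j] == 0 for i in range(4) for j in range(i))  # U is upper-triangular
```

The first column of L below the diagonal is exactly v/a11 = (2, 1, 3)^t, and the trailing submatrices of U reproduce the Schur complements A1, A2.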
Example 5.32.3:
Find the LUP decomposition of
A = [ 2 3 1 2
      4 6 4 7
      2 7 13 16
      6 10 13 15 ].
Example 5.32.4:
Solve the system of linear equations
Y1 = 10
3Y1 + Y2 = 50
Y1 + 4Y2 + Y3 = 40
2Y1 − (1/14)Y3 + Y4 = 25,
X2 + 10X3 + 9X4 = 20
5.33 Exercises
X1 + X2 + X3 + X4 = 0
X1 + X2 + 2X3 − 3X4 = 0
2X1 + 2X2 − X3 + X4 = 0
X1 + X2 + 2X3 − 3X4 = 0
3. Solve:
X1 + X2 + X3 + X4 = 0
2X1 + X3 − X4 = 0
(i)
4X1 + X2 − X3 + 3X4 = 10
(ii)
3X1 − 2X2 + X3 = 7
X1 + X2 + X3 = 12
−X1 + 4X2 − X3 = 3
(iii)
4X1 − 5X3 − X4 = 16
−4X1 + 2X2 + X4 = −5
In this section, we discuss the basic properties of finite fields. Finite fields
are fundamental to the study of codes and cryptography.
Recall that a field F is finite if |F | is finite; |F | is called the order of F. The
characteristic of a finite field F , as seen in Section 5.21, is a prime number p,
and the prime field P of F is a field of p elements. P consists of the p
elements 1F , 2 · 1F = 1F + 1F , . . . , p · 1F = 0F . Clearly, F is a vector space
over P . If the dimension of F over P is n, then n is finite. Hence F has a
basis {u1 , . . . , un } of n elements over P . This means that each element v ∈ F
is a unique linear combination of u1 , . . . , un , say,
v = α1 u1 + α2 u2 + · · · + αn un , αi ∈ P, 1 ≤ i ≤ n.
Theorem 5.34.1:
The order of a finite field is a power of a prime number.
Finite fields are known as Galois fields after the French mathematician
Évariste Galois (1811–1832) who first studied them. A finite field of order q
is denoted by GF (q).
We now look at the converse of Theorem 5.34.1. Given a prime power p^n
(where p is a prime), does there exist a field of order p^n ? The answer to this
question is in the affirmative. We give below two different constructions that
yield a field of order p^n.
Theorem 5.34.2:
Given p^n (where p is a prime), there exists a field of p^n elements.
Construction 1: Consider the polynomial X^{p^n} − X ∈ Zp [X] of degree p^n.
(Recall that Zp [X] stands for the ring of polynomials in X with coefficients
from the field Zp of p elements). The derivative of this polynomial is
p^n X^{p^n − 1} − 1 = −1 ∈ Zp [X],
and is therefore relatively prime to it. Hence the p^n roots of X^{p^n} − X are
all distinct. (Here, though no concept of the limit is involved, the notion of
the derivative has been employed as though it is a real polynomial). It is
known [28] that the roots of this polynomial lie in an extension field K ⊃ Zp .
n
K is also of characteristic p. If a and b any two roots of X p − X, then
n n
ap = a, and bp = b.
n n n
(a ± b)p = ap ± bp ,
n n n
ap bp = (ab)p ,
n
and so a ± b and ab are also roots of X p − X. Moreover, if a is a non-zero
n n n
root of X p − X, then so is a−1 since (a−1 )p = (ap )−1 = a−1 . Also the
associative and distributive laws are valid for the set of roots since they are
all elements of the field K. Finally 0 and 1 are also roots. In other words,
n
the pn roots of X p − X ∈ Zp [X] form a field of order pn .
Construction 2: Let f (X) = X^n + a1 X^{n−1} + · · · + an ∈ Zp [X] be a polynomial of
degree n irreducible over Zp. The existence of such an irreducible polynomial
(with leading coefficient 1) of degree n is guaranteed by a result (see [5]) in
algebra. Let F denote the ring of polynomials in Zp [X] reduced modulo
f (X) (that is, if g(X) ∈ Zp [X], divide g(X) by f (X) and take the remainder
g1 (X) which is 0 or of degree less than n). Then every non-zero polynomial
in F is a polynomial of Zp [X] of degree at most n − 1. Moreover, if a0 X^{n−1} +
· · · + an−1 and b0 X^{n−1} + · · · + bn−1 are two polynomials in F of degrees at most
n − 1, and if they are equal, then ai = bi for each i. As each of the n coefficients
a0 , . . . , an−1 can take any of the p values of Zp, the number of polynomials in
F, that is |F |, is p^n.
We now show that F is a field. Clearly, F is a commutative ring with
unit element 1(= 0 · X n−1 + · · · + 0 · X + 1). Hence we need only verify that
if a(X) ∈ F is not zero, then there exists b(X) ∈ F with a(X)b(X) = 1. As
a(X) ≠ 0, and f (X) is irreducible over Zp , the gcd (a(X), f (X)) = 1. So by
Euclidean algorithm (Section 3.3), there exist polynomials C(X) and g(X)
in Zp [X] such that
a(X)C(X) + f (X)g(X) = 1 (5.9)
in Zp [X]. Now there exists C1 (X) ∈ F with C1 (X) ≡ C(X)( mod f (X)).
This means that there exist a polynomial h(X) in Zp [X] with C(X) −
C1 (X) = h(X)f (X), and hence C(X) = C1 (X) + h(X)f (X). Substitut-
ing this in (5.9) and taking modulo f (X), we get a(X)C1 (X) = 1 in F. Hence
a(X) has C1 (X) as inverse in F . Thus every non-zero element of F has a
multiplicative inverse in F , and so F is a field of pn elements.
We have constructed a field of p^n elements in two different ways: one
as the field of roots of the polynomial X^{p^n} − X ∈ Zp [X], and the other as
the field of polynomials in Zp [X] reduced modulo an irreducible polynomial
f (X) of degree n over Zp. Essentially, there is not much of a difference
between the two constructions, as our next theorem shows.
Theorem 5.34.3:
Any two finite fields of the same order are isomorphic under a field isomorphism.
Example 5.34.4:
Take p = 2 and n = 3. The polynomial X^3 + X + 1 of degree 3 is irreducible
over Z2. (If it were reducible, one of the factors would be of degree 1, and it
would be either X or X + 1 = X − 1 ∈ Z2 [X]. But 0 and 1 are not roots
of X^3 + X + 1 ∈ Z2 [X]). The 2^3 = 8 polynomials over Z2 reduced modulo
X^3 + X + 1 are:
0, 1, X, X + 1, X^2, X^2 + 1, X^2 + X, X^2 + X + 1
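Arithmetic in this 8-element field is conveniently carried out on bitmasks, with bit i holding the coefficient of X^i and reduction by X^3 = X + 1. A Python sketch (the function `gf8_mul` is ours) confirming that every non-zero element has an inverse, as Construction 2 guarantees:

```python
def gf8_mul(a, b):
    """Multiply two elements of GF(8) = Z2[X] mod X^3 + X + 1.
    Elements are bitmasks: bit i is the coefficient of X^i."""
    r = 0
    while b:
        if b & 1:               # add a copy of a for this bit of b
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:          # degree reached 3: reduce by X^3 = X + 1
            a ^= 0b1011         # 0b1011 encodes X^3 + X + 1
    return r

# X * X^2 = X^3 = X + 1
assert gf8_mul(0b010, 0b100) == 0b011
# every non-zero element has a multiplicative inverse
for a in range(1, 8):
    assert any(gf8_mul(a, b) == 1 for b in range(1, 8))
```

The same scheme, with a different reduction mask, implements GF(2^n) for any irreducible polynomial of degree n over Z2.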
Theorem 5.34.5:
If F is a finite field, F ∗ (the set of non-zero elements of F ) is a cyclic group.
Since (o(α), o(β^{(k,l)})) = 1, we have o(αβ^{(k,l)}) = o(α) · o(β^{(k,l)}) = k · l/(k, l)
= [k, l], the lcm of k and l.
But, by our choice, the maximum order of any element of F∗ is k. Therefore
[k, l] = k, which implies that l | k. But l = o(β). Therefore β^k = 1. Thus
each of the q − 1 elements x of F∗ satisfies x^k = 1 and so is a root of X^k − 1.
Since a polynomial of degree k over a field has at most k roots, this
means that, as |F∗| = q − 1, k = q − 1. Thus o(α) = q − 1 and so F∗ is the
cyclic group generated by α.
a0 + a1 α + · · · + an−1 α^{n−1}, ai ∈ P
Example 5.34.7:
Consider the polynomial X 4 + X + 1 ∈ Z2 [X]. This is irreducible over Z2
(Check that it can have no linear or quadratic factor in Z2 [X]). Let α be a
root (in an extension field of Z2 ) of this polynomial so that α4 + α + 1 = 0.
This means that α4 = α + 1.
We now prove that α is a primitive element of a field of 16 elements over
Z2 by checking that the 15 powers α, α^2, . . . , α^15 are all distinct and that
α^15 = 1. Indeed, we have
α^1 = α
α^2 = α^2
α^3 = α^3
α^4 = α + 1
α^5 = α·α^4 = α^2 + α
α^6 = α·α^5 = α^3 + α^2
α^7 = α·α^6 = α^4 + α^3 = α^3 + α + 1
α^8 = α·α^7 = α^4 + α^2 + α = α^2 + 1
α^9 = α·α^8 = α^3 + α
α^10 = α·α^9 = α^4 + α^2 = α^2 + α + 1
α^11 = α·α^10 = α^3 + α^2 + α
α^12 = α·α^11 = α^4 + α^3 + α^2 = α^3 + α^2 + α + 1
α^13 = α·α^12 = α^4 + α^3 + α^2 + α = α^3 + α^2 + 1
α^14 = α·α^13 = α^4 + α^3 + α = α^3 + 1
α^15 = α·α^14 = α^4 + α = (α + 1) + α = 1.
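The table of powers can be generated mechanically by repeated multiplication by α, applying the reduction α^4 = α + 1 on bitmasks (bit i holding the coefficient of α^i). A Python sketch (the helper `next_power` is ours):

```python
def next_power(x):
    """Multiply x (a bitmask polynomial in alpha over Z2) by alpha,
    using the relation alpha^4 = alpha + 1 in GF(16)."""
    x <<= 1
    if x & 0b10000:
        x ^= 0b10011        # reduce by alpha^4 + alpha + 1
    return x

powers = []
x = 1
for _ in range(15):
    x = next_power(x)
    powers.append(x)        # powers[k] is alpha^(k+1)

assert len(set(powers)) == 15    # alpha, ..., alpha^15 all distinct
assert powers[-1] == 1           # alpha^15 = 1
assert powers[3] == 0b0011       # alpha^4 = alpha + 1
```

Since the 15 powers exhaust the non-zero elements, α generates the cyclic group GF(16)∗, exactly as Theorem 5.34.5 predicts.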
Fields
(α^i)^{p^n} = α^i.
This shows that there exists a least positive integer t such that α^{i·p^{t+1}} = α^i.
Then set
Ci = {i, pi, p^2 i, . . . , p^t i}, 0 ≤ i ≤ p^n − 1.
The sets Ci are called the cyclotomic cosets modulo p defined with respect
to F and α. Now, corresponding to the coset Ci , 0 ≤ i ≤ pn − 1, consider
the polynomial
fi (X) = (X − α^i)(X − α^{i·p})(X − α^{i·p^2}) · · · (X − α^{i·p^t}).
The coefficients of fi are elementary symmetric functions of α^i, α^{i·p}, . . . , α^{i·p^t},
and if β denotes any of these coefficients, then β satisfies the relation β^p = β.
Hence β ∈ Zp and fi (X) ∈ Zp [X] for each i, 0 ≤ i ≤ p^n − 1. Each element
of Ci determines the same cyclotomic coset, that is, Ci = Cip = Cip^2 = · · · =
Cip^t. Moreover, if j ∉ Ci, then Ci ∩ Cj = ∅. This gives a factorization of X^{p^n} − X
into irreducible factors over Zp. In fact, X^{p^n} − X = X(X^{p^n − 1} − 1), and
X^{p^n − 1} − 1 = (X − α)(X − α^2) · · · (X − α^{p^n − 1})
= ∏_i ( ∏_{j ∈ Ci} (X − α^j) ),
where the first product is taken over all the distinct cyclotomic cosets. What
is more, each polynomial fi (X) is irreducible over Zp as shown below. To see
this, assume that
g(X) = a0 + a1 X + · · · + ak X^k ∈ Zp [X].
Then g(X)^p = a0^p + a1^p X^p + · · · + ak^p (X^k)^p
= a0 + a1 X^p + · · · + ak X^{kp}
= g(X^p).
Example 5.35.1:
We factorize X^{2^4} − X into irreducible factors over Z2.
Let α be a primitive element of the field GF (24 ). As a primitive polyno-
mial of degree 4 over Z2 having α as a root, we can take (See Example 5.34.7)
X 4 + X + 1.
The cyclotomic cosets modulo 2 w.r.t. GF (24 ) and α are:
C0 = {0}
C1 = {1, 2, 2^2 = 4, 2^3 = 8} (Note: 2^4 = 16 ≡ 1 (mod 15))
C3 = {3, 6, 12, 9}
C5 = {5, 10}
C7 = {7, 14, 13, 11}
For instance, f5(X) = (X − α^5)(X − α^10) = X^2 + (α^5 + α^10)X + α^{15}
= X^2 + X + 1.
The six factors on the right of Equation (5.10) are all irreducible over Z2 . The
minimal polynomials of α, α3 and α7 are all of degree 4 over Z2 . However,
while α and α7 are primitive elements of GF (24 ) (so that the polynomials
X 4 +X +1 and X 4 +X 3 +1 are primitive), α3 is not (even though its minimal
polynomial is also of degree 4).
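The cyclotomic cosets are easy to generate for any p and any modulus. A Python sketch (the function name is ours) reproducing the cosets modulo 15 listed above:

```python
def cyclotomic_cosets(p, m):
    """Cyclotomic cosets {i, i*p, i*p^2, ...} taken modulo m."""
    seen, cosets = set(), []
    for i in range(m):
        if i not in seen:
            c, j = [], i
            while j not in c:       # multiply by p until the orbit closes
                c.append(j)
                j = j * p % m
            cosets.append(c)
            seen.update(c)
    return cosets

cosets = cyclotomic_cosets(2, 15)
assert [1, 2, 4, 8] in cosets
assert [3, 6, 12, 9] in cosets
assert [5, 10] in cosets
assert [7, 14, 13, 11] in cosets
```

The coset sizes (1, 4, 4, 2, 4) sum to 15, matching the degrees of the irreducible factors of X^{2^4 − 1} − 1 over Z2.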
5.35.1 Exercises
3. Factorize X^{2^3} + X and X^{2^5} + X over Z2.
4. Factorize X^{3^2} − X over Z3.
5. Using Theorem 5.34.5, prove Fermat's little theorem: for any prime
p, a^{p−1} ≡ 1 (mod p) whenever a ≢ 0 (mod p).
dered pairs (1, 1), (2, 2), (3, 3); (2, 3), (3, 1), (1, 2); (3, 2), (1, 3), (2, 1) are all
distinct. However, if M1 = [ 1 2 ; 2 1 ] and M2 = [ 2 1 ; 1 2 ] (rows separated
by semicolons), then the 4 ordered pairs (1, 2), (2, 1), (2, 1) and (1, 2) are not
all distinct. Hence M1 and M2 are not
orthogonal. The study of orthogonal latin squares started with Euler, who
had proposed the following problem of 36 officers. The problem asks for
an arrangement of 36 officers of 6 ranks and from 6 regiments in a square
formation of size 6 by 6. Each row and column of this arrangement are to
contain only one officer of each rank and only one officer from each regiment.
We label the ranks and the regiments from 1 through 6, and assign to each
officer an ordered pair of integers in 1 through 6. The first component of the
ordered pair corresponds to the rank of the officer and the second component
his regiment. Euler’s problem then reduces to finding a pair of orthogonal
latin squares of order 6. Euler conjectured in 1782 that there exists no pair
of orthogonal latin squares of order n ≡ 2( mod 4). Euler himself verified
the conjecture for n = 2, while Tarry in 1900 verified it for n = 6 by a sys-
tematic case by case analysis. But the most significant result with regard to
the Euler conjecture came from Bose, Shrikhande, and Parker, who disproved
the conjecture by establishing that if n ≡ 2( mod 4) and n > 6, then there
exists a pair of orthogonal latin squares of order n.
A set {L1 , . . . , Lt } of t latin squares of order n on S is called a set of mu-
tually orthogonal latin squares (MOLS) if Li and Lj are orthogonal whenever
i 6= j. It is easy to see [59] that the number t of MOLS of order n is bounded
by n − 1. Further, any set of n − 1 MOLS of order n is known to be equiva-
lent to the existence of a finite projective plane of order n. A long standing
conjecture is that if n is not a prime power, then there exists no complete
set of MOLS of order n.
We now show that if n is a prime power, there exists a set of n − 1 MOLS
of order n. (Equivalently, this implies that there exists a projective plane of
any prime power order, though we do not prove this here).
Theorem 5.36.1:
Let n = pk , where p is a prime and k is a positive integer. Then for n ≥ 3,
there exists a complete set of MOLS of order n.
Proof. By Theorem 5.34.2, we know that there exists a finite field GF (pk ) =
GF (n) = F , say. Denote the elements of F by a0 = 0, a1 = 1, a2 , . . . , an−1 .
Define the n − 1 matrices A1, . . . , An−1 of order n by A_t = (a^t_ij),
where a^t_ij = a_t a_i + a_j. The entries a^t_ij are all elements of the field F. We
claim that each A_t is a latin square. Suppose, for instance, that two entries of
some i-th row of A_t, say a^t_ij and a^t_il, are equal. This implies that

a_t a_i + a_j = a_t a_i + a_l,

and hence a_j = a_l. Consequently j = l. Thus all the entries of the i-th row
of A_t are distinct. For a similar reason, no two entries of the same column of
A_t are equal. Hence A_t is a latin square.
We next claim that {A_1, . . . , A_{n−1}} is a set of MOLS. Suppose 1 ≤ r <
u ≤ n − 1. Then A_r and A_u are orthogonal. For suppose that

(a^r_ij, a^u_ij) = (a^r_{i′j′}, a^u_{i′j′}).

Then

a_r a_i + a_j = a_r a_{i′} + a_{j′},
and a_u a_i + a_j = a_u a_{i′} + a_{j′}.

Subtraction gives

(a_r − a_u) a_i = (a_r − a_u) a_{i′}.

Since r ≠ u, we have a_r ≠ a_u, and so a_i = a_{i′}; hence i = i′. The first
equation then gives a_j = a_{j′}, so j = j′. Thus the n² ordered pairs obtained
by superimposing A_r and A_u are all distinct, and A_r and A_u are orthogonal.
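For n = p prime (that is, k = 1), GF(p) is just Z_p, and the construction can be checked directly. The sketch below (a simplification: it assumes p prime, not a general prime power) builds the squares with (i, j) entry t·i + j (mod p) and verifies both the latin-square and the orthogonality conditions.

```python
# Theorem 5.36.1 for n = p prime, where GF(p) = Z_p. The general
# prime-power case needs polynomial arithmetic and is not shown here.
p = 5

def latin_square(t):
    # A_t has (i, j) entry a_t*a_i + a_j; over Z_p the field elements
    # a_0, ..., a_{p-1} are simply 0, 1, ..., p-1.
    return [[(t * i + j) % p for j in range(p)] for i in range(p)]

squares = [latin_square(t) for t in range(1, p)]

def is_latin(M):
    n = len(M)
    return all(len(set(row)) == n for row in M) and \
           all(len({M[i][j] for i in range(n)}) == n for j in range(n))

def orthogonal(M1, M2):
    # orthogonal iff superimposing gives n^2 distinct ordered pairs
    n = len(M1)
    pairs = {(M1[i][j], M2[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n

assert all(is_latin(M) for M in squares)
assert all(orthogonal(squares[r], squares[u])
           for r in range(len(squares)) for u in range(r + 1, len(squares)))
print(len(squares), "mutually orthogonal latin squares of order", p)
```

This produces the complete set of p − 1 MOLS promised by the theorem.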
Graph Theory
6.1 Introduction
Chapter 6 Graph Theory 317
banks C and D of the Pregel river (later called the Pregolya). The people
of Königsberg wondered if it was possible to take a stroll across the seven
bridges, crossing each bridge exactly once and returning to the starting point.
Euler showed that it was not possible.
Figure 6.1: The seven bridges of Königsberg: islands A and B, and banks C and D of the Pregel river.

Figure 6.2: A graph model of the seven-bridges problem.
Definition 6.2.1:
A graph G consists of a vertex (or point) set V (G) = {v1 , . . . , vn } and an edge
(or line) set E(G) = {e1 , . . . , em }, where each edge consists of an unordered
pair of vertices. If e is an edge of G, then it is represented by the unordered
pair {a, b} (denoted by ab when no confusion arises). We call a and b, the
endpoints of e. If an edge e = {u, v} ∈ E(G), then u and v are said to be
adjacent. A null graph on n vertices, denoted by Nn , has no edges. The
empty graph is denoted by φ and has vertex set φ (and therefore edge set φ).
A loop is an edge whose endpoints are the same. Parallel edges or multiple
edges are edges that have the same pair of endpoints. A simple graph is one
that has no loops and no multiple edges.
The order of a graph G, denoted by n(G) or simply n, is the number of
vertices in V (G). A graph of order 1 is called trivial. The size of a graph G,
denoted by m(G) or simply m, is the number of edges in E(G). If a graph
G has finite order and finite size, then G is said to be a finite graph.
Unless stated otherwise, we consider only simple finite graphs.
It is possible to assign a direction or orientation to each edge in a graph:
we then treat an edge e (with endpoints u and v) as an ordered pair (u, v)
or (v, u), often denoted by →uv or →vu respectively.
Definition 6.2.2:
A directed graph or a digraph G consists of a vertex set V (G) = {v1 , . . . , vn }
Figure 6.3: A graph G1 and a digraph G2, where
V(G1) = {1, 2, 3, 4, 5}, E(G1) = {{1, 2}, {2, 3}, {4, 5}};
V(G2) = {1, 2, 3, 4, 5, 6, 7}, A(G2) = {(1, 2), (2, 3), (1, 4), (3, 6), (6, 5), (6, 7), (7, 6)}.
and an arc set A(G) = {e1, . . . , em}, where each arc is an ordered pair of
vertices. A simple digraph is one in which each ordered pair of vertices
occurs at most once as an arc. We indicate an arc as (u, v) or →uv, where u
and v are the endpoints. Note that this is different from the arc →vu with the
same endpoints. If e = (u, v), then u is called the tail and v is called the
head of e.
Figure 6.4: Two isomorphic graphs G3 and G4, where
V(G3) = {1, 2, 3, 4, 5, 6, 7, 8},
E(G3) = {{1, 4}, {4, 8}, {8, 5}, {5, 1}, {1, 2}, {5, 6}, {8, 7}, {4, 3}, {2, 3}, {3, 7}, {7, 6}, {6, 2}};
V(G4) = {a, b, c, d, e, f, g, h},
E(G4) = {ab, bc, cd, da, ef, fg, gh, he, ae, bf, cg, dh}.
An isomorphism is given by the correspondence
1 ↔ a, 2 ↔ b, 3 ↔ c, 4 ↔ d, 5 ↔ e, 6 ↔ f, 7 ↔ g, 8 ↔ h.
Definition 6.2.3:
Two simple graphs G and H are isomorphic if there is a bijection φ : V (G) →
V (H) such that {u, v} ∈ E(G) if and only if {φ(u), φ(v)} ∈ E(H).
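Definition 6.2.3 can be tested mechanically by trying every bijection φ, which is feasible only for small graphs (there are n! candidates). A minimal sketch:

```python
# Brute-force isomorphism test per Definition 6.2.3: try every
# bijection phi : V(G) -> V(H) and check that it preserves
# adjacency and non-adjacency. Only practical for small n.
from itertools import permutations

def isomorphic(VG, EG, VH, EH):
    EG = {frozenset(e) for e in EG}
    EH = {frozenset(e) for e in EH}
    if len(VG) != len(VH) or len(EG) != len(EH):
        return False
    for perm in permutations(VH):
        phi = dict(zip(VG, perm))
        if all((frozenset((phi[u], phi[v])) in EH) == (frozenset((u, v)) in EG)
               for i, u in enumerate(VG) for v in VG[i + 1:]):
            return True
    return False

# the 4-cycle drawn two different ways:
print(isomorphic([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)],
                 ['a', 'b', 'c', 'd'],
                 [('a', 'c'), ('c', 'b'), ('b', 'd'), ('d', 'a')]))  # True
```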
Exercise 6.2.4:
Show that the Petersen graph P (most commonly drawn as G1 below) is
isomorphic to the graphs G2 , G3 and G4 below:
Figure 6.5: Four drawings G1, G2, G3 and G4 of graphs isomorphic to the Petersen graph.
Exercise 6.2.5:
Show that the Petersen graph is isomorphic to the following graph Q: The
vertex set V(Q) is the set of unordered pairs {i, j}, i ≠ j, 1 ≤
i, j ≤ 5. Two vertices {i, j} and {k, l} (i, j, k, l ∈ {1, 2, . . . , 5}) form an edge
if and only if {i, j} ∩ {k, l} = ∅.
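The graph Q of Exercise 6.2.5 is easy to build and sanity-check in code; the sketch below confirms it has the familiar Petersen parameters (10 vertices, 15 edges, 3-regular), though that alone does not prove the isomorphism asked for.

```python
# Construct Q: vertices are the 2-element subsets of {1,...,5};
# two vertices are adjacent when the subsets are disjoint.
from itertools import combinations

V = [frozenset(p) for p in combinations(range(1, 6), 2)]
E = [(a, b) for a, b in combinations(V, 2) if not (a & b)]

degrees = {v: sum(v in e for e in E) for v in V}
print(len(V), len(E), set(degrees.values()))  # 10 15 {3}
```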
Example 6.2.6:
Let V be a set of cardinality n (the vertices). From V we can obtain
C(n, 2) = n(n−1)/2 unordered pairs (these can be treated as possible edges).
Each subset of these unordered pairs defines a simple graph, and hence there
are 2^{n(n−1)/2} simple graphs on V.
Figure 6.6: Eleven simple graphs G1, . . . , G11 on four vertices each.
Definition 6.2.7:
The complement Ḡ of a simple graph G is the simple graph with vertex set
V(Ḡ) = V(G) and edge set E(Ḡ) defined thus: uv ∈ E(Ḡ) if and only if
uv ∉ E(G). That is, E(Ḡ) = {uv | uv ∉ E(G)}.
Example 6.2.8:
The following graph G is isomorphic to its complement Ḡ:
Figure 6.7: A graph G on the vertices 1, . . . , 8 and its complement Ḡ; here G is isomorphic to Ḡ.
Exercise 6.2.9:
Show that two graphs G and H are isomorphic if and only if Ḡ and H̄ are
isomorphic.
Definition 6.2.10:
A simple graph is called self-complementary if it is isomorphic to its own
complement.
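A standard small example of Definition 6.2.10 is the 5-cycle C5, which is self-complementary. The sketch below verifies this with the explicit bijection i → 2i (mod 5); the choice of this particular map is a known trick, not something stated in the text.

```python
# C5 is self-complementary: the map i -> 2i (mod 5) carries the
# edges of C5 onto the edges of its complement.
V = list(range(5))
E = {frozenset((i, (i + 1) % 5)) for i in V}                 # the 5-cycle
E_bar = {frozenset((u, v)) for u in V for v in V
         if u < v and frozenset((u, v)) not in E}            # its complement

phi = {i: (2 * i) % 5 for i in V}                            # candidate isomorphism
image = {frozenset((phi[u], phi[v])) for u, v in map(tuple, E)}
print(image == E_bar)  # True
```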
Exercise 6.2.11:
The line-graph L(G) of a given graph G = (V(G), E(G)) is the simple graph
whose vertices are the edges of G, with ef ∈ E(L(G)) if and only if the edges
e and f of G have a common endpoint. Prove that the Petersen graph is the
complement of the line-graph of K5.
Exercise 6.2.12:
Prove that if G is a self-complementary graph with n vertices, then n is either
4t or 4t + 1, for some integer t. (Hint: consider the number of edges of Kn.)
Figure 6.8: A graph G and two of its subgraphs G1 and G2.
Note that a complete graph may have many subgraphs that are not cliques,
but every induced subgraph of a complete graph is a clique.
The components of a graph G are its maximal connected subgraphs. A
component is nontrivial if it contains an edge.
An independent set of a graph G is a vertex subset S ⊆ V (G) such that
no two vertices of S are adjacent in G. It is easy to check that a clique of G
is an independent set of Ḡ and vice versa.
We next introduce a special class of graphs, called bipartite graphs. A
graph G is called bipartite if V (G) can be partitioned into two subsets X and
Y such that each edge of G has one endpoint in X and the other in Y . We
express this by writing G = G(X, Y ).
A complete bipartite graph G is a bipartite graph G(X, Y ) whose edge set
consists of all possible pairs of vertices having one endpoint in X and the
other in Y. If X has m vertices and Y has n vertices, such a graph is denoted
by Km,n . Note that Km,n is isomorphic to Kn,m . It is easy to see that Km,n
has mn edges.
Example 6.2.13:
The graph of Fig 6.9.(a) is the 3-cube. It is a bipartite graph (though not
complete). Fig. 6.9.(b) is a redrawing of Fig. 6.9.(a) exhibiting the biparti-
tion X = {x1, x2, x3, x4} and Y = {y1, y2, y3, y4}. The graphs of Fig. 6.9.(c) are
isomorphic to the complete bipartite graph K3,3 .
Figure 6.9: (a) the 3-cube; (b) a redrawing of (a) with X = {x1, x2, x3, x4} on one side and Y = {y1, y2, y3, y4} on the other; (c) two drawings of K3,3.
Exercise 6.2.14:
Let G be the graph whose vertex set is the set of binary strings of length
n (n ≥ 1). A vertex x in G is adjacent to vertex y of G if and only if x and
y differ exactly in one position in their binary representation. Prove that G
is a bipartite graph.
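Exercise 6.2.14 suggests the 2-coloring by parity of the number of 1s: adjacent strings differ in exactly one bit, so their parities differ. A quick check for n = 4 (the choice of n is arbitrary):

```python
# Verify the parity bipartition of the n-cube for n = 4.
from itertools import product

n = 4
V = [''.join(bits) for bits in product('01', repeat=n)]

def adjacent(x, y):
    # adjacent iff the strings differ in exactly one position
    return sum(a != b for a, b in zip(x, y)) == 1

X = {v for v in V if v.count('1') % 2 == 0}   # even-weight strings
# every edge goes between X and its complement:
ok = all((u in X) != (v in X) for u in V for v in V if adjacent(u, v))
print(ok)  # True
```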
Definition 6.2.15:
A walk of length k in a graph G is a non-null alternating sequence v0 e1 v1 e2 . . . ek vk
of vertices and edges of G (starting and ending with vertices) such that
ei = vi−1 vi , for all i. A trail is a walk in which no edge is repeated. A path
is a walk with no repeated vertex. A (u, v)-walk is one whose first vertex
is u and last vertex is v (u and v are the end vertices of the walk). A walk or
trail is closed if it has length at least one and its end vertices are the same.
A cycle is a closed path. A cycle on n vertices is denoted by Cn (where the
vertices are unlabeled).
Definition 6.2.16:
A graph G is connected if it has a (u, v)-path for each pair u, v ∈ V (G).
Exercise 6.2.17:
Let G be a simple graph. Show that if G is not connected, then its complement Ḡ is connected.
Six people are at a party. Show that there are three people who all know
each other or there are three people who are mutually strangers.
Perhaps, the easiest way to solve the problem is using graph theory. Con-
sider the complete graph K6 . We associate the six people with the six vertices
of K6 . We color the edges joining two vertices black if the corresponding peo-
ple know each other. If two people do not know each other, we color the edge
joining the corresponding vertices grey. If there are three people who know
(don’t know) each other, then we should have a black (grey) triangle in K6 .
Given an assignment of colors to all edges of K6 , a subgraph H is called
monochromatic if all edges of H have the same color.
The party problem can now be posed as follows:
If we arbitrarily color the edges of K6 black or grey, then there must be
a monochromatic clique on three vertices.
Let u, v, w, x, y, z be the vertices of K6. An arbitrary vertex, say u, of
K6 has degree 5. So when we color the edges incident with u, we must use the
color black or grey at least three times. Without loss of generality, assume
that the three edges are colored black as shown in Fig. 6.10. Let these edges
be uv, ux and uw. If any one of the edges vw, vx or xw is now colored
black, we get the required black triangle. Hence we suppose all these edges
are colored grey; but then v, w and x form a grey triangle, proving the claim.

Figure 6.10: The three black edges uv, uw, ux at the vertex u.
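Since K6 has only 15 edges, the party problem can also be settled by exhaustion over all 2^15 = 32768 colorings; a sketch:

```python
# Check exhaustively that every black/grey coloring of E(K6)
# contains a monochromatic triangle.
from itertools import combinations

edges = list(combinations(range(6), 2))        # the 15 edges of K6
triangles = list(combinations(range(6), 3))    # the 20 triangles of K6

def has_mono_triangle(coloring):
    color = dict(zip(edges, coloring))
    for a, b, c in triangles:
        if color[(a, b)] == color[(a, c)] == color[(b, c)]:
            return True
    return False

assert all(has_mono_triangle([(mask >> i) & 1 for i in range(15)])
           for mask in range(1 << 15))
print("every 2-coloring of K6 has a monochromatic triangle")
```

The same search on K5 finds colorings with no monochromatic triangle, so six is the smallest number of people for which the claim holds.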
(iii) two-way infinite paths with vertices alternating between X and Y . Such
a path is of the form
. . . , x, y, x′ , y ′ , . . . ,
where x, x′ , . . . ∈ X, and y, y ′ . . . ∈ Y .
x1, y1, x2, y2, . . . , xn, yn, x1.
Definition 6.2.18:
Given a graph G, let v ∈ V (G). The degree d(v) or dG (v) of v is the number
of edges of G incident with v.
Figure 6.12:
Exercise 6.2.19:
Consider any k-regular graph G for odd k. Prove that the number of edges
in G is a multiple of k.
Exercise 6.2.20:
Prove that every 5-regular graph contains a cycle of length at least six.
Theorem (Degree-Sum Formula):
In any graph G, Σ_{v∈V(G)} d(v) = 2m(G).

Proof. When the degrees of all the vertices are summed up, each edge is
counted twice. Hence the result.
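The Degree-Sum Formula is easy to confirm numerically; a minimal sketch on an arbitrary small graph:

```python
# Summing all degrees counts each edge twice.
E = [(1, 2), (1, 3), (2, 3), (3, 4)]           # an arbitrary small graph
V = {v for e in E for v in e}

deg = {v: sum(v in e for e in E) for v in V}   # degree of each vertex
print(sum(deg.values()), 2 * len(E))           # both equal 8
```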
Proof. Let A and B be respectively the sets of odd and even vertices of G.
Then for each u ∈ B, d(u) is even, and so Σ_{u∈B} d(u) is also even. By the
Degree-Sum Formula,

Σ_{u∈B} d(u) + Σ_{w∈A} d(w) = Σ_{v∈V(G)} d(v) = 2m(G).

This gives Σ_{w∈A} d(w) = 2m(G) − Σ_{u∈B} d(u), an even number. Since each
term d(w), w ∈ A, is odd, the number of terms |A| must be even.
Hence the result. This can be interpreted thus: the number of participants
at a birthday party each of whom shakes hands with an odd number of other
participants is always even.
Definition 6.3.1:
The adjacency list representation of a graph G = (V(G), E(G)) consists of
an array Adj of |V(G)| lists, one for each vertex. For each vertex u of G,
Adj[u] points to the list of all vertices v that are adjacent to u.
For a directed graph, Adj[u] points to the list of all vertices v such that
(u, v) is an arc of the digraph. It is easy to see that the adjacency list
representation of a graph (directed or undirected) has the desirable property
that the amount of memory it requires is proportional to |V(G)| + |E(G)|.
A drawback, however, is that to determine whether an edge
uv is present in the graph, the only way is to search the list Adj[u] for v.
The process of determining the presence (or absence) of an edge uv is much
simpler in the adjacency-matrix (defined below) representation of a graph.
Definition 6.3.2:
To represent a graph G = (V(G), E(G)), we first number the vertices of G
by 1, 2, . . . , |V| in some arbitrary manner. The adjacency matrix of G is then
the |V| × |V| matrix A = (aij) where

aij = 1 if ij ∈ E(G), and aij = 0 otherwise.

The above definition applies to directed graphs also, where we specify the
elements aij as

aij = 1 if (i, j) ∈ A(G), and aij = 0 otherwise.
Note that a graph may have many adjacency lists and adjacency matrices
because the numbering of the vertices is arbitrary. However, all the rep-
resentations yield graphs that are isomorphic. It is then possible to study
properties of graphs that do not depend on the labels of the vertices.
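The trade-off between the two representations can be shown side by side; a small sketch (the graph is arbitrary):

```python
# Definitions 6.3.1-6.3.2 for the same graph: adjacency lists use
# memory proportional to |V| + |E|, while the adjacency matrix
# answers "is uv an edge?" in a single lookup.
E = [(1, 2), (2, 3), (1, 4), (3, 4)]
n = 4

adj = {u: [] for u in range(1, n + 1)}          # adjacency lists
for u, v in E:
    adj[u].append(v)
    adj[v].append(u)

A = [[0] * n for _ in range(n)]                 # adjacency matrix
for u, v in E:
    A[u - 1][v - 1] = A[v - 1][u - 1] = 1

print(adj[1])        # neighbours of 1: [2, 4]
print(A[0][1])       # 1: edge 12 is present
print(A[0][2])       # 0: edge 13 is absent
```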
Theorem 6.3.3:
Let G be a graph with n vertices v1 , . . . , vn . Let A be the adjacency matrix
of G with this labeling of vertices. Let A^k (k a positive integer) be the
product of k copies of A. Then the (i, j)th entry of A^k is the number
of different (vi , vj )-walks in G of length k.
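Theorem 6.3.3 can be illustrated on the path with vertices 1, 2, 3: the (1, 3) entry of A² is 1 (the single walk 1, 2, 3) and the (1, 1) entry is 1 (the walk 1, 2, 1). Plain integer matrix multiplication suffices:

```python
# Walk counting via powers of the adjacency matrix (path 1-2-3).
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A2 = matmul(A, A)
print(A2)  # [[1, 0, 1], [0, 2, 0], [1, 0, 1]]
```

The middle entry 2 counts the two length-2 walks 2, 1, 2 and 2, 3, 2.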
The next theorem uses the above result to determine whether or not a
graph is connected.
Theorem 6.3.4:
Let G be a graph with n vertices v1, . . . , vn and let A be the adjacency matrix
of G. Let B = (bij) be the matrix given by

B = A + A² + · · · + A^{n−1}.

Then G is connected if and only if bij ≠ 0 for every pair of distinct indices i, j.
Proof. Let a^(k)_ij denote the (i, j)th entry of A^k (k = 1, . . . , n − 1). We then
have

b_ij = a^(1)_ij + a^(2)_ij + · · · + a^(n−1)_ij.

By Theorem 6.3.3, a^(k)_ij denotes the number of distinct walks of length k from
vi to vj. In other words, bij is the number of different (vi, vj)-walks of length less than
n.
Assume that G is connected. Then for every pair i, j(i 6= j) there is a
path from vi to vj . Since G has only n vertices, any path is of length at most
n − 1. Hence there is a path of length less than n from vi to vj . This implies
that bij 6= 0.
Conversely, assume that bij 6= 0 for every pair i, j(i 6= j). Then from
the above discussion it follows that there is at least one walk of length less
than n, from vi to vj . This holds for every pair i, j(i 6= j) and therefore we
conclude that G is connected.
Exercise 6.3.5:
Let A be the adjacency matrix of a connected graph G with n vertices; is it
Figure 6.13: A graph on the vertices 1, 2, 3, 4.

Reading the entries above the diagonal of its adjacency matrix row by row
gives the binary string 1 0 0 1 1 1,
which represents the number 39 in the decimal system. Thus we can uniquely
represent a graph by a number.
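The encoding just described is easy to implement: concatenate the above-diagonal entries of the adjacency matrix and read the result in binary. The matrix below is one example whose string is 100111 (the exact graph of Fig. 6.13 is not reproduced here):

```python
# Encode a labeled simple graph as an integer by reading the
# above-diagonal entries of its adjacency matrix row by row.
def graph_number(A):
    n = len(A)
    bits = [A[i][j] for i in range(n) for j in range(i + 1, n)]
    return int(''.join(map(str, bits)), 2)

A = [[0, 1, 0, 0],
     [1, 0, 1, 1],
     [0, 1, 0, 1],
     [0, 1, 1, 0]]
print(graph_number(A))  # 39, i.e. binary 100111
```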
Figure 6.14: A graph G and the graphs G + uv (adding an edge), G − e (deleting an edge), G − w (deleting a vertex) and the graph obtained by contracting the edge e.
Definition 6.4.1:
In a connected graph G, a subset V′ ⊆ V(G) is a vertex cut of G if G − V′
is disconnected. It is a k-vertex cut if |V′| = k. V′ is also called a separating set
of vertices of G. A vertex v of a connected graph G is a cut vertex of G if
{v} is a vertex cut of G.
Definition 6.4.2:
Let G be a nontrivial graph. Let S be a proper nonempty subset of V . Let
[S, S̄] denote the set of all edges of G having one endpoint in S and the other
in S̄. A set of edges of G of the form [S, S̄] is called an edge cut of G. An
edge e ∈ E(G) is a cut edge of G if {e} is an edge cut of G. An edge cut
of cardinality k is called a k-edge cut of G. If e is a cut edge of a connected
graph, then G − e has exactly two components.
Example 6.4.3:
Consider the graph in Fig. 6.15.
Figure 6.15: A graph on the vertices u, v, w, x, y, z.

{v} and {w, x} are vertex cuts. The edge subsets {wy, xy}, {uv} and
{xz} are all edge cuts. Vertex v is a cut vertex. Edges uv and xz are cut
edges.
Theorem 6.4.4:
A vertex v of a connected graph G with at least three vertices is a cut vertex
of G if and only if there exist vertices u and w of G, distinct from v, such
that v is in every (u, w)-path in G.
Theorem 6.4.5:
In a connected graph G, an edge e = uv is a cut edge of G if and only if
e does not belong to any cycle of G.
Proof. Let e = uv be a cut edge of G, and let S and S̄ be the vertex sets of
the two components of G − e, with u ∈ S and v ∈ S̄; then [S, S̄] = {e}. If e belongs to a
cycle of G then [S, S̄] must contain at least one more edge contradicting that
{e} = [S, S̄]. Hence e cannot belong to a cycle.
Conversely assume that e is not a cut edge of G. Then G − e is connected
and hence there exists a (u, v)-path P in G − e. Then P together with the
edge e forms a cycle in G.
Theorem 6.4.6:
In a connected graph G, an edge e = uv is a cut edge if and only if there
exist vertices x and y such that e belongs to every (x, y)-path in G.
Proof. If e = uv is a cut edge of G, then u and v lie in distinct components
of G − e, so every (u, v)-path in G must use e; we may thus take x = u and y = v.
Conversely, assume that there exist vertices x and y satisfying the condition
of the theorem. Then there exists no (x, y)-path in G − e, and this means
that G − e is disconnected. Hence e is a cut edge of G.
Exercise 6.4.7:
Prove or disprove: Let G be a simple connected graph with |V (G)| ≥ 3.
Then G has a cut edge if and only if it has a cut vertex.
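For small graphs, cut vertices and cut edges can be found by brute force: delete each candidate and test connectivity. A sketch (the example graph, two triangles sharing a vertex, is made up):

```python
# Find cut vertices and cut edges by deletion plus a DFS
# connectivity test. Fine for small graphs.
def connected(V, E):
    if not V:
        return True
    adj = {v: set() for v in V}
    for u, v in E:
        adj[u].add(v); adj[v].add(u)
    seen, stack = set(), [next(iter(V))]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(adj[u] - seen)
    return seen == set(V)

def cut_vertices(V, E):
    return [w for w in V
            if not connected(V - {w}, [e for e in E if w not in e])]

def cut_edges(V, E):
    return [e for e in E if not connected(V, [f for f in E if f != e])]

# two triangles sharing the vertex 3:
V = {1, 2, 3, 4, 5}
E = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 3)]
print(cut_vertices(V, E), cut_edges(V, E))  # [3] []
```

Consistent with Theorem 6.4.5, no edge is a cut edge here, since every edge lies on a cycle.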
Definition 6.4.8:
Let G be a nontrivial connected graph having at least a pair of non-adjacent
vertices. The minimum k for which there exists a k-vertex cut is called the
vertex connectivity or simply the connectivity of G; it is denoted by κ(G).
If G has no pair of nonadjacent vertices, that is if G is a complete graph of
order n, then κ(G) is defined to be n − 1.
Note that the removal of any set of n − 1 vertices of Kn results in a K1 .
A subset of vertices or edges of a connected graph G is said to disconnect
the graph if its deletion results in a disconnected graph.
Definition 6.4.9:
The edge connectivity of a connected graph G is the smallest k for which
there exists a k-edge cut (i.e., an edge cut containing k edges). The edge
connectivity of G is denoted by λ(G).
Definition 6.4.10:
A graph G is r-connected if κ(G) ≥ r. G is r-edge connected if λ(G) ≥ r.
The parameters κ(G), λ(G) and δ(G) are related by the following inequal-
ities.
Theorem 6.4.11:
For a connected graph G, κ(G) ≤ λ(G) ≤ δ(G).
Theorem 6.4.12:
A graph G with at least three vertices is 2-connected if and only if there
exists a pair of internally disjoint paths between any pair of distinct vertices.
Proof. Assume first that, for any two distinct vertices u, v in G, G has at least
two internally disjoint (u, v)-paths. Let w be any vertex of G. Then w is
not a cut vertex of G: otherwise, by Theorem 6.4.4, there would exist vertices u and
v of G, distinct from w, such that every (u, v)-path in G contains w, contradicting
the assumption. Therefore G is 2-connected. Conversely, assume that G is 2-
connected. We apply induction on d(u, v) to prove that G has two internally
disjoint (u, v)-paths. When d(u, v) = 1, the graph G − uv is connected, since
λ(G) ≥ κ(G) ≥ 2. Any (u, v)-path in G − uv is internally disjoint in G
from the (u, v)-path consisting of the edge uv. Thus u and v are connected
by two internally-disjoint paths in G. Now we apply induction on d(u, v).
Let d(u, v) = k > 1 and assume that G has internally-disjoint (x, y)-paths
whenever 1 ≤ d(x, y) < k. Let w be the vertex appearing before v on a
shortest (u, v)-path. Since d(u, w) = k − 1, by induction hypothesis, G has
two internally disjoint (u, w)-paths, say P and Q. (see Fig. 6.16).
Figure 6.16: Two internally disjoint (u, w)-paths P and Q, used in the induction step of the proof of Theorem 6.4.12.
Theorem 6.4.14:
For a graph G with |V(G)| ≥ 3, the following conditions characterize 2-connectedness and are therefore equivalent:
(a) G is connected and has no cut vertex.
(b) For all u, v ∈ V(G), u ≠ v, there are two internally disjoint (u, v)-paths.
(c) For all u, v ∈ V(G), u ≠ v, there is a cycle through u and v.
(d) δ(G) ≥ 1, and every pair of edges of G lies on a common cycle.
Proof. Equivalence of (a) and (b) follows from Theorem 6.4.12. Any cycle
containing vertices u and v corresponds to a pair of internally disjoint (u, v)-
paths. Therefore (b) and (c) are equivalent.
Next we shall prove that (d) ⇒ (c). Let x, y ∈ V(G) be any two vertices
in G. We consider edges of the type ux and uy, or ux and wy (these exist
since δ(G) ≥ 1). By (d) these edges lie on a common cycle, and hence x and y lie on a
common cycle (see Fig. 6.17).
Figure 6.17: The two cases in the proof that (d) implies (c): edges ux, uy on a common cycle, and edges ux, wy on a common cycle.
neighborhood {x, y}. By the Expansion Lemma, the resulting graph G′
is 2-connected and hence w, z lie on a common cycle C in G′ . Since w, z each
have degree 2, this cycle contains the paths u, w, v and x, z, y but not uv and
xy. We replace the paths u, w, v and x, z, y in C by the edges uv and xy to
obtain a desired cycle in G.
Definition 6.4.15:
A graph G is nonseparable if it is nontrivial, connected and has no cut
vertices. A block of a graph G is a maximal nonseparable subgraph of G.
Example 6.4.16:
A graph G is shown in Fig 6.18.(a) and Fig 6.18.(b) shows its blocks B1 , B2 , B3
and B4 .
Figure 6.18: (a) A graph G on the vertices a, b, c, d, e, f, g, h; (b) its blocks B1, B2, B3 and B4.
2. Each edge of G belongs to one of its blocks, and hence G is the union of its
blocks.
3. Any two blocks of G have at most one vertex in common; such a vertex, if
it exists, is a cut vertex of G.
4. A vertex of G that is not a cut vertex belongs to exactly one of its blocks.
Definition 6.5.1:
A tree is a connected graph that has no cycle (that is, it is acyclic). A forest
is an acyclic graph; each component of a forest is a tree. Fig. 6.19 shows
an arbitrary tree. The graphs of Fig 6.20 show all (unlabeled) trees with at
most five vertices.
Figure 6.19: An arbitrary tree.

Figure 6.20: All unlabeled trees with at most five vertices.
Theorem 6.5.2:
For a graph G, the following statements are equivalent:
(i) G is a tree.
(ii) Any two vertices of G are connected by exactly one path.
(iii) G is connected, and deleting any edge of G disconnects it.
(iv) G is acyclic, and the graph formed from G by “adding an edge” (that is,
a graph of the form G + e where e joins a pair of nonadjacent vertices
of G) contains a unique cycle.
(v) G is connected and |V(G)| = |E(G)| + 1.
Lemma 6.5.3:
Any tree with at least two vertices contains at least two leaves.
Lemma 6.5.4:
Let v be a leaf in a graph G. Then G is a tree if and only if G − v is a tree.
Proof of Theorem 6.5.2. We prove that each of the statements (ii) through
(v) is equivalent to statement (i). The proofs go by induction on the number
of vertices of G, using Lemma 6.5.4. For the induction basis, we observe that
all the statements (i) through (v) are valid if G contains a single vertex only.
We first show that (i) implies all of (ii) to (v). Let G be a tree with at
least two vertices and let v be a leaf and let v ′ be the vertex adjacent to v
in G. By the induction hypothesis, we assume that G − v satisfies (ii) to
(v). Now the validity of (ii), (iii) and (v) for G is obvious. For (iv), since G
is connected, any two vertices x, y ∈ V (G) is connected by a path P and if
xy ∈
/ E(G), then P + xy creates a cycle. Therefore (i) implies (iv) as well.
We now prove that each of the conditions (ii) to (v) implies (i). In (ii)
and (iii) we already assume connectedness. Also, a graph satisfying (ii) or
(iii) cannot contain a cycle: for (ii), this is because two vertices in a cycle are
connected by two distinct paths and for (iii), the reason is that by omitting
an edge in a cycle we obtain a connected graph. Thus (ii) implies (i) and
(iii) also implies (i).
To verify that (iv) implies (i), it suffices to check that G is connected. If
x, y ∈ V(G), then either xy ∈ E(G) or the graph G + xy contains a unique
cycle. Necessarily, as G is acyclic, this cycle must contain the edge xy. Now
removal of the edge xy from this cycle gives a path from x to y in G. Thus
G is connected. We finally prove that (v) implies (i). Let G be a connected
graph satisfying |V (G)| = |E(G)| + 1 ≥ 2. The sum of the degrees of all
vertices is 2|V (G)| − 2. This means that not all vertices can have degree 2
or more. Since all degrees are at least 1 (by connectedness) there exists a
vertex v of degree exactly 1, that is, a leaf of G. The graph G′ = G − v is
again connected and satisfies |V(G′)| = |E(G′)| + 1; by the induction hypothesis
G′ is a tree, and hence, by Lemma 6.5.4, G is a tree as well.
Definition 6.5.5:
Let G be a graph and let u, v be a pair of vertices connected by a path in G.
Then the distance from u to v, denoted by dG (u, v) or simply d(u, v), is the
least length (in terms of the number of edges) of a (u, v)-path in G. If G has
no (u, v)-path, we define d(u, v) = ∞.
Definition 6.5.6:
The diameter of a connected graph G (denoted by diam(G)) is defined as
the maximum of the distances between pairs of vertices of G. In symbols,

diam(G) = max_{u,v∈V(G)} d(u, v).
Since in a tree any two vertices are connected by a unique path, the
diameter of a tree T is the length of a longest path in T .
We next prove a theorem that gives a bound on the sum of the distances
of all the vertices of a tree T from a given vertex of T .
Theorem 6.5.7:
Let u be a vertex of a tree T with n vertices. Then

Σ_{v∈V(T)} d(u, v) ≤ C(n, 2) = n(n−1)/2.
Figure 6.21: Deleting the vertex u from the tree T leaves components T1, T2, T3; vi is the neighbor of u in Ti.
If we now sum the formula for distances from u over all the components of
T − u, we obtain (as Σ_{i=1}^{k} n_i = n − 1)

Σ_{v∈V(T)} d_T(u, v) ≤ (n − 1) + Σ_{i=1}^{k} C(n_i, 2).

We note that Σ_{i=1}^{k} C(n_i, 2) ≤ C(n−1, 2) since Σ_{i=1}^{k} n_i = n − 1: the
right-hand side counts the edges in K_{n−1} and the left-hand side counts the edges
in a subgraph of K_{n−1} (a disjoint union of cliques). Hence we have

Σ_{v∈V(T)} d_T(u, v) ≤ (n − 1) + C(n−1, 2) = C(n, 2).
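Theorem 6.5.7 can be checked on a small tree by computing the distance sums with a breadth-first search; on a path the bound is attained at the two end vertices:

```python
# Distance sums from every vertex of the path 1-2-3-4-5 (a tree),
# compared against the bound C(5, 2) = 10.
from collections import deque

def distances_from(u, adj):
    dist = {u: 0}
    q = deque([u])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                q.append(y)
    return dist

adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
n = len(adj)
sums = {u: sum(distances_from(u, adj).values()) for u in adj}
print(sums)  # {1: 10, 2: 7, 3: 6, 4: 7, 5: 10}
assert all(s <= n * (n - 1) // 2 for s in sums.values())
```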
Definition 6.6.1:
A spanning subgraph of a graph G is a subgraph H of G such that V (H) =
V (G). A spanning tree of G is a spanning subgraph of G which is a tree.
Exercise 6.6.2:
Prove that a graph G is connected if and only if it has a spanning tree.
Figure 6.22: The sixteen labeled spanning trees of K4 on the vertices 1, 2, 3, 4.
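One way to confirm the count of sixteen is Kirchhoff's Matrix-Tree theorem, a classical result not proved in this chapter: the number of spanning trees equals any cofactor of the Laplacian L = D − A. For K4 this is a single 3×3 determinant:

```python
# Count the spanning trees of K4 via a cofactor of its Laplacian.
def det3(M):
    a, b, c = M[0]
    d, e, f = M[1]
    g, h, i = M[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Laplacian of K4 with the first row and column deleted:
L_minor = [[3, -1, -1],
           [-1, 3, -1],
           [-1, -1, 3]]
print(det3(L_minor))  # 16, matching Cayley's formula 4^(4-2)
```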
Exercise 6.6.5:
Let G be a graph with exactly one spanning tree. Prove that G is a tree.
Exercise 6.6.6:
If ∆ = k, then show that a graph G has a spanning tree with at least k
leaves.
The sum of the distances over all pairs of distinct vertices of a graph G
is known as the Wiener index of G, denoted by W(G). Thus W(G) =
Σ_{u,v∈V(G)} d(u, v).
Figure 6.23: The path Pn on the vertices 1, 2, . . . , n, obtained from Pn−1 by adding the vertex n; this relates W(Pn) to W(Pn−1).
Exercise 6.6.7:
Prove the following:
(i) W(Kn) = C(n, 2).
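The Wiener index is straightforward to compute with a breadth-first search from every vertex; for Kn every pair is at distance 1, so W(Kn) = C(n, 2), which for K5 gives 10:

```python
# Wiener index by BFS from every vertex, checked on K5.
from collections import deque

def wiener(adj):
    total = 0
    for u in adj:
        dist = {u: 0}
        q = deque([u])
        while q:
            x = q.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    q.append(y)
        total += sum(dist.values())
    return total // 2          # each unordered pair is counted twice

K5 = {u: [v for v in range(5) if v != u] for u in range(5)}
print(wiener(K5))  # 10 = C(5, 2)
```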
Algorithm SP-TREE
Proposition 6.6.8:
If algorithm SP-TREE outputs a graph T with n − 1 edges, then T is a
spanning tree of G. If T has k < (n − 1) edges, then G is a disconnected
graph with n − k components.
Proof. From the way the sets Ei are constructed the graph T contains no
cycle. If k = |E(T )| = n−1, then by Theorem 6.5.2 (v), T is a tree and hence
it is a spanning tree. If k < n − 1, then T is a disconnected graph each of
whose components is a tree. It is easy to reason that it has n − k components.
We prove that the vertex sets of the components of the graph T coincide
with those of the components of the graph G. Assume the contrary and
let x and y be vertices lying in the same component of G but in distinct
components of T . Let C be the component of T containing the vertex x.
Consider some path,
(x = x0 , e1 , x1 , e2 , . . . , ek , xk = y)
Figure 6.24: A path in G from x to y leaving the component C of T along the edge e.
The design of electronic circuits using integrated chips often requires that
the pins of several components be at the same potential. This is achieved
by wiring them together. To interconnect a set of n pins, we can use an
arrangement of n − 1 wires, each connecting two points. Of the various
arrangements on a circuit board, the one that uses the least amount of wire
is desirable.
The above wiring problem can be modeled thus: we are given a connected
graph G = (V (G), E(G)), where V (G) corresponds to the set of pins and
E(G) corresponds to the possible interconnections. Associated with each
edge uv ∈ E(G), we have a “weight” w(u, v) specifying the cost (amount
of wire needed) to connect u and v. We then wish to find a spanning tree
T = (V, E(T)) of G whose total weight

W(E(T)) = Σ_{uv∈E(T)} w(u, v)

is minimum; such a tree is called a minimum spanning tree (MST) of G.
We now set, Vi = Vi−1 ∪ {yi } and Ei = Ei−1 ∪ {ei }. If no such edge exists,
the algorithm terminates. Let Et denote the set for which the algorithm has
stopped. The algorithm outputs the graph T = (V, Et ) as the MST.
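The tree-growing rule just described (repeatedly add a minimum-weight edge with exactly one endpoint in the current tree) is the idea behind Prim's algorithm. A sketch with made-up weights, not a transcription of the book's listing:

```python
# Prim-style MST: grow a tree from one vertex, always taking the
# cheapest edge that leaves the set of tree vertices.
import heapq

def prim(n, wedges, start=0):
    adj = {v: [] for v in range(n)}
    for u, v, w in wedges:
        adj[u].append((w, v))
        adj[v].append((w, u))
    in_tree, tree, heap = {start}, [], list(adj[start])
    heapq.heapify(heap)
    while heap and len(in_tree) < n:
        w, v = heapq.heappop(heap)
        if v not in in_tree:           # skip edges internal to the tree
            in_tree.add(v)
            tree.append((w, v))
            for item in adj[v]:
                heapq.heappush(heap, item)
    return tree

edges = [(0, 1, 4), (0, 2, 1), (1, 2, 2), (1, 3, 5), (2, 3, 8)]
tree = prim(4, edges)
print(sum(w for w, _ in tree))  # 8: the edges of weight 1, 2 and 5
```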
Let us now consider the stage in the algorithm’s execution when the edge
ek+1 has been added to T . Let Tk = (Vk , Ek ) be the tree formed by the
addition of the edges e1, . . . , ek. Then ek+1 = xy, where x ∈ V(Tk) and
y ∉ V(Tk). Consider the graph T̂ + ek+1. This graph contains some cycle
C; such a cycle necessarily contains the edge ek+1.
The cycle C consists of the edge ek+1 = xy plus a path, say P, connecting
the vertices x and y in the spanning tree T̂. At least one edge of the path P
has one vertex in the set Vk and the other vertex not in Vk. Let e be such an
edge. Obviously e is different from ek+1 (see Fig. 6.25), and also e ∈ Ê while
ek+1 ∉ Ê.
Figure 6.25: The cycle formed by ek+1 = xy together with the path P joining x and y in T̂; the edge e of P has exactly one endpoint in Vk.
Both e and ek+1 connect a vertex of Vk with a vertex outside Vk, and
by the edge selection rule in the algorithm we get w(ek+1) ≤ w(e).
Now consider the graph T′ = (T̂ + ek+1) − e. This graph has n − 1 edges
and is connected, as can easily be seen. Hence it is a spanning tree. Now we
have w(E(T′)) = w(Ê) − w(e) + w(ek+1) ≤ w(Ê), and thus T′ is an MST as
well, but with k(T′) > k(T̂). This contradicts the choice of T̂, and
hence T must be an MST.
Exercise 6.6.9:
Prove that algorithm KRUSKAL does produce an MST.
Definition 6.7.1:
Recall (see Section 6.2.1 on page 319) that an independent set of a graph
G is a subset S ⊆ V (G) such that no two vertices of S are adjacent in G.
S is a maximum independent set of G if G has no independent set S ′ with
|S ′ | > |S|. A maximal independent set of G is an independent set that is not
a proper subset of an independent set of G.
Figure 6.26: A star with center v and end vertices p, q, r, s, t, u.
In Fig 6.26, {v} and {p, q, r, s, t, u} are both maximal independent sets.
The latter set is also a maximum independent set.
Definition 6.7.2:
A subset K ⊆ V (G) is called a covering of G if every edge of G is incident
with at least one vertex of K. A covering K is minimum if there is no
covering K ′ of G such that |K ′ | < |K|; it is minimal if there is no covering
K ′′ of G such that K ′′ is a proper subset of K.
Figure 6.27: A graph on the vertices u, v, w, x, y, z.
Theorem 6.7.3:
In a graph G = (V(G), E(G)), a subset S ⊆ V(G) is independent if and only if
V(G) \ S is a covering of G.
Definition 6.7.4:
The number of vertices in a maximum independent set of G is called the
independence number of G and is denoted by α(G). The number of vertices
in a minimum covering of G is the covering number of G and is denoted by
β(G).
Corollary 6.7.5:
For a graph G of order n, α(G) + β(G) = n.
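Corollary 6.7.5 can be checked by brute force on a small graph; on the 5-cycle, α = 2 and β = 3:

```python
# Compute alpha and beta of C5 over all vertex subsets and verify
# alpha + beta = n (Corollary 6.7.5).
from itertools import combinations

V = [0, 1, 2, 3, 4]
E = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]   # the 5-cycle

def independent(S):
    return all(not (u in S and v in S) for u, v in E)

def covering(S):
    return all(u in S or v in S for u, v in E)

subsets = [set(c) for k in range(len(V) + 1) for c in combinations(V, k)]
alpha = max(len(S) for S in subsets if independent(S))
beta = min(len(S) for S in subsets if covering(S))
print(alpha, beta, alpha + beta)  # 2 3 5
```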
Exercise 6.7.6:
Prove that every maximal independent set in a d-regular graph on n vertices
has at least n/(d + 1) vertices.
Exercise 6.7.7:
Show that a graph G having all degrees at most d satisfies the inequality

α(G) ≥ |V(G)|/(d + 1).
Definition 6.8.1:
The chromatic number χ(G) of a graph G is the minimum number of
independent subsets that partition the vertex set of G. Any such minimum
partition of V(G) into independent subsets is called a chromatic partition of G.
Definition 6.8.2:
A k-coloring of a graph G is a labeling f : V (G) → {1, . . . , k}. The labels
are interpreted as colors; all vertices with the same color form a color class.
A k-coloring f is proper if f(u) ≠ f(v) whenever uv ∈ E(G). A graph
G is k-colorable if it has a proper k-coloring. We will call the labels “colors”
because their numerical value is not important. Note that χ(G) is then the
minimum number of colors needed for a proper coloring of G. We also say “G
is k-chromatic” to mean χ(G) = k. It is obvious that χ(Kn) = n. Further,
χ(G) = 2 if and only if G is bipartite with at least one edge. We can also
reason that χ(Cn) = 2 if n is even and χ(Cn) = 3 if n is odd.
Exercise 6.8.3:
Prove that χ(G) = 2 if and only if G is a bipartite graph with at least one
edge.
The Petersen graph P has chromatic number 3. Fig. 6.28 shows a proper
3-coloring of P. Certainly, P is not 2-colorable, since it
contains an odd cycle.
Figure 6.28: A proper 3-coloring of the Petersen graph, with color classes labeled 1, 2 and 3.
Since each color class of a graph G is an independent set, we see that χ(G) ≥ |V(G)|/α(G), where α(G) is the independence number of G.
Definition 6.8.4:
A graph G is called critical if for every proper subgraph H of G, we have
χ(H) < χ(G). Also, G is called k-critical if it is k-chromatic and critical.
The above definition holds for any graph. When G is connected it is
equivalent to the condition that χ(G − e) < χ(G) for each edge e ∈ E(G);
but then this is equivalent to saying χ(G − e) = χ(G) − 1. If χ(G) = 1, then
G is either trivial or totally disconnected. Hence G is 1-critical if and only
if G is K1 . Also χ(G) = 2 implies that G is bipartite and has at least one
edge. Hence G is 2-critical if and only if G is K2 .
Exercise 6.8.5:
Prove that every critical graph is connected.
Exercise 6.8.6:
Show that if G is k-critical, then for any v ∈ V (G) and e ∈ E(G),
χ(G − v) = χ(G − e) = k − 1.
Theorem 6.8.7:
If G is k-critical, then δ(G) ≥ k − 1.
Corollary 6.8.8:
For any graph G, χ(G) ≤ 1 + ∆(G).
We can also show that the above result is implied by the “greedy” coloring algorithm below.
Algorithm GREEDY-COLORING
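The greedy strategy can be rendered in Python as follows (a minimal sketch; the adjacency-list representation and the default vertex order are our own illustrative choices):

```python
# Greedy coloring: scan the vertices in some order and give each vertex
# the smallest color not already used by one of its colored neighbors.
def greedy_coloring(adj, order=None):
    if order is None:
        order = list(adj)
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color

# The 5-cycle C5: every vertex has degree 2, so greedy needs at most
# 1 + max degree = 3 colors.
c5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring = greedy_coloring(c5)
```

On C5, an odd cycle, the sketch uses exactly three colors, in accordance with χ(Cn) = 3 for odd n.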
Exercise 6.8.10:
If χ(G) = k, then show that G contains at least k vertices each of degree at
least k − 1.
Chapter 7
Coding Theory
7.1 Introduction
Coding theory has its origin in communication engineering. Since Shannon’s seminal paper of 1948 [?], it has also been greatly influenced by mathematics, with a variety of mathematical techniques brought in to tackle its problems. Algebraic coding theory makes heavy use of matrices, groups, rings, fields, vector spaces, algebraic number theory and even algebraic geometry. In algebraic coding, each message is regarded as a block of symbols taken from a finite alphabet; on most occasions, these are elements of Z2 = {0, 1}, so that each message is a finite string of 0’s and 1’s. For instance, 00110111 is a message. Messages get transmitted through a communication channel. Such channels may be subject to noise, and consequently the messages may get changed. The purpose of an error-correcting code is to add redundancy symbols to the message, based of course on some rule, so that the original message can be retrieved even if it is garbled.
Any communication channel looks as in Figure 7.1. The first box of the
Figure 7.2: The binary symmetric channel: each transmitted bit is received correctly with probability q and flipped with probability p = 1 − q.
original message could have been 0011. On the other hand, if the channel is
two-way, that is, it can detect errors so that the receiver knows the places
where the errors have occurred and also contains the provision for feedback,
then it can prove to be more effective in decoding the received message.
One of the simplest channels is the binary symmetric channel (BSC). This channel has no memory and transmits just the two symbols 0 and 1. It has the property that the probability that a transmitted symbol is received correctly is q, while the probability that it is not is p = 1 − q. This is pictorially represented in Figure 7.2. Before considering an example of a BSC, we first give the formal definition of a code.
Definition 7.2.1:
A code C of length n over a field F is a set of vectors in F^n, the space of ordered n-tuples over F. Any element of C is called a codeword of C.
Definition 7.3.1:
An [n, k]-linear code C over a finite field F is a k-dimensional subspace of F^n, the vector space of ordered n-tuples over F.
If F has q elements, that is, F = GF(q), the [n, k]-code will have q^k codewords. The codewords of C are all of length n, as they are n-vectors over F, and k is the dimension of C.
Definition 7.3.2:
A nonlinear code of length n over a field F is just a subset of the vector space
F n over F .
C is a binary code if F = Z2 .
A linear code C is best represented by any one of its generator matrices.
Definition 7.3.3:
A generator matrix of a linear code C over F is a matrix whose row vectors
form a basis for C over F .
X = x1 R1 + x2 R2 + x3 R3 , (8.1)
x4 = x1 + x2 , and
x5 = x1 + x3 . (8.2)
In other words, the first redundancy coordinate of any codeword is the sum of
the first two information coordinates of that word while the next redundancy
coordinate is the sum of the first and third information coordinates.
Equations (8.2) are the parity-check equations of the code C1 . They can
be rewritten as
x1 + x2 − x4 = 0, and
x1 + x3 − x5 = 0. (8.3)
Over Z2, −1 = 1, so these can equally be written as
x1 + x2 + x4 = 0, and
x1 + x3 + x5 = 0. (8.4)
Thus C1 = {X ∈ Z2^5 : H1 X^t = 0} = null space of the matrix H1.
Every word of C is a linear combination
α1 u1 + · · · + αk uk , αi ∈ F for each i,
of the rows of a generator matrix G; that is, C is the row space of G over F. The null space of C is the space of vectors X ∈ F^n which are orthogonal to all the words of C; in other words, it is the dual space C⊥ of C. As C is of dimension k over F, C⊥ is of dimension n − k over F. Let {X1 , . . . , Xn−k } be a basis of C⊥ over F. If H is the matrix whose row vectors are X1 , . . . , Xn−k , then H is a parity-check matrix of C.
It is an (n − k) by n matrix. Thus
C = row space of G = null space of H = {X ∈ F^n : HX^t = 0}.
Theorem 7.3.4:
Let G = (Ik | A) be a generator matrix of a linear code C over F, where Ik is the identity matrix of order k over F, and A is a k by (n − k) matrix over F. Then a generator matrix of C⊥ is given by
H = (−A^t | In−k)
over F.
Proof. Each row of H is orthogonal to all the rows of G since, by block multiplication (see ...),
GH^t = [Ik | A] [ −A
                  In−k ] = −A + A = 0.
Corollary 7.3.5:
G = [Ik |A] is a generator matrix of a code C of length n iff H = [−At |In−k ]
is a parity-check matrix of C.
7.4 Weight and Distance of a Code
Definition 7.4.1:
The weight wt(v) of a codeword v of a code C is the number of nonzero coordinates in v. The minimum weight of C is the least of the weights of its nonzero codewords. The weight of the zero vector of C is naturally zero.
Example 7.4.2:
As an example, consider the binary code C2 with generator matrix
G2 = [ 1 0 1 1 0
       0 1 1 0 1 ] = [I2 | A].
Definition 7.4.3:
Let X, Y ∈ F n . The distance d(X, Y ), also called the Hamming distance
between X and Y , is defined to be the number of places in which X and Y
differ.
d(X, Y) = wt(X − Y). (8.5)
Theorem 7.4.4:
The minimum distance of a linear code C is the minimum weight of a nonzero
codeword of C.
Thus for the linear code C2 of Example 7.4.2, the minimum distance is 3.
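This can be checked by enumerating the four codewords of C2 (a small sketch; the enumeration style is our own):

```python
from itertools import product

# List the codewords of the binary code C2 of Example 7.4.2 from its
# generator matrix G2 and verify that the minimum weight (and hence,
# by Theorem 7.4.4, the minimum distance) is 3.
G2 = [(1, 0, 1, 1, 0),
      (0, 1, 1, 0, 1)]

codewords = set()
for x1, x2 in product((0, 1), repeat=2):
    word = tuple((x1 * a + x2 * b) % 2 for a, b in zip(*G2))
    codewords.add(word)

min_weight = min(sum(w) for w in codewords if any(w))   # min_weight == 3
```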
The function d(X, Y) defined in Equation (8.5) does indeed define a distance function (that is, a metric) on C. That is to say, it has the following three properties: for all X, Y, Z in C,
(i) d(X, Y) ≥ 0, and d(X, Y) = 0 if and only if X = Y;
(ii) d(X, Y) = d(Y, X);
(iii) d(X, Z) ≤ d(X, Y) + d(Y, Z).
Hamming codes are binary linear codes. They can be defined either by their
generator matrices or by their parity-check matrices. We prefer the latter.
Let us start by defining the [7, 4]-Hamming code H3. The seven column vectors of its parity-check matrix H are the binary representations of the numbers 1 to 7, written in such a way that the last three of its column vectors form I3, the identity matrix of order 3. Thus
H = [ 1 1 1 0 1 0 0
      1 1 0 1 0 1 0
      1 0 1 1 0 0 1 ].
We now write the coset decomposition of Z2^7 with respect to the subspace H3. (Recall that H3 is a subgroup of the additive group Z2^7.) As Z2^7 has 2^7 vectors, and H3 has 2^4 codewords, the number of cosets of H3 in Z2^7 is 2^7/2^4 = 2^3. (See ....) Each coset is of the form X + H3 = {X + v : v ∈ H3}. Any two cosets are either identical or disjoint. The vector X is a representative of the coset X + H3. The zero vector is a representative of the coset H3. If X and Y are distinct vectors, each of weight 1, then X + H3 ≠ Y + H3, since X − Y is of weight 2, while the minimum weight of a nonzero codeword of H3 is 3.
(0000000) (1000111) (0100110) (0010101) (0001011) (1100001) (1010010) (1001100) (0110011) (0101101) (0011010) (0111000) (1011001) (1101010) (1110100) (1111111)
(1000000) (0000111) (1100110) (1010101) (1001011) (0100001) (0010010) (0001100) (1110011) (1101101) (1011010) (1111000) (0011001) (0101010) (0110100) (0111111)
(0100000) (1100111) (0000110) (0110101) (0101011) (1000001) (1110010) (1101100) (0010011) (0001101) (0111010) (0011000) (1111001) (1001010) (1010100) (1011111)
(0010000) (1010111) (0110110) (0000101) (0011011) (1110001) (1000010) (1011100) (0100011) (0111101) (0001010) (0101000) (1001001) (1111010) (1100100) (1101111)
(0001000) (1001111) (0101110) (0011101) (0000011) (1101001) (1011010) (1000100) (0111011) (0100101) (0010010) (0110000) (1010001) (1100010) (1111100) (1110111)
(0000100) (1000011) (0100010) (0010001) (0001111) (1100101) (1010110) (1001000) (0110111) (0101001) (0011110) (0111100) (1011101) (1101110) (1110000) (1111011)
(0000010) (1000101) (0100100) (0010111) (0001001) (1100011) (1010000) (1001110) (0110001) (0101111) (0011000) (0111010) (1011011) (1101000) (1110110) (1111101)
Figure 7.3: The standard array of H3; each row is a coset of H3, listed with its coset leader first.
Figure 7.4: The parity-check matrix H shown in two forms, (a) and (b).
As before, let F^n denote the vector space of all ordered n-tuples over F. Recall (Section 7.4) that F^n is a metric space with the Hamming distance between vectors of F^n as the metric.
Definition 7.7.1:
In F^n, the sphere with centre X and radius r is the set
S(X, r) = {Y ∈ F^n : d(X, Y) ≤ r} ⊆ F^n.
Definition 7.7.2:
An r-error-correcting linear code C is perfect if the spheres of radius r with
the words of C as centres are pairwise disjoint and their union is F n .
Theorem 7.7.3:
The Hamming code Hm is a single-error-correcting perfect code.
Proof. Hm is a code of dimension 2^m − 1 − m over Z2 and hence has 2^(2^m − 1 − m) words. Now if v is any codeword of Hm, then S(v, 1) contains v (which is at distance zero from v) and the 2^m − 1 words got from v (which is of length 2^m − 1) by altering one position at a time. Thus S(v, 1) contains 1 + (2^m − 1) = 2^m words. Since Hm has minimum distance 3, these spheres are pairwise disjoint, and so their union, as v varies over Hm, contains 2^(2^m − 1 − m) · 2^m = 2^(2^m − 1) vectors. But this is exactly the number of vectors in F^n, where n = 2^m − 1.
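The counting in this proof is easily checked numerically for small m (an illustrative sketch):

```python
# Sphere-packing count for the Hamming code Hm: the 2^(2^m - 1 - m)
# spheres of radius 1, each containing 2^m words, exactly fill the
# 2^(2^m - 1) vectors of F^n, where n = 2^m - 1.
for m in range(2, 8):
    n = 2**m - 1                 # code length
    num_codewords = 2**(n - m)   # 2^(2^m - 1 - m)
    sphere_size = 1 + n          # 1 + (2^m - 1) = 2^m words per sphere
    assert num_codewords * sphere_size == 2**n
```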
Our next theorem shows that the minimum distance d of a linear code is a good measure of its error-correcting capability.
Theorem 7.7.4:
If C is a linear code of minimum distance d, then C can correct t = ⌊(d − 1)/2⌋ or fewer errors.
Figure 7.5: A received vector z and codewords u and v.
Proof. Suppose a codeword u is transmitted and at most t errors occur, so that the received vector z satisfies d(u, z) ≤ t (see Figure 7.5). If z were also within distance t of another codeword v, the triangle inequality would give
d(u, v) ≤ d(u, z) + d(z, v) ≤ t + t = 2t ≤ d − 1 < d,
contradicting the fact that distinct codewords of C are at distance at least d. Hence u is the unique codeword nearest to z, and z is decoded correctly.
Let C be a binary linear code of length n. We can extend this code by adding an overall parity check at the end: we add a 0 at the end of each word of even weight in C and a 1 at the end of every word of odd weight. This gives an extended code C′ of length n + 1.
To study some of the properties of C′, we need a lemma.
Lemma 7.8.1:
Let w denote the weight function of a binary code C, and for X, Y ∈ C let X ⋆ Y denote the number of positions in which both X and Y have a 1. Then
w(X + Y) = w(X) + w(Y) − 2(X ⋆ Y).
Proof. Let X and Y have common 1’s in the i1, i2, . . . , ip-th positions, so that X ⋆ Y = p. Let X have 1’s in the i1, . . . , ip and j1, . . . , jq-th positions, and Y in the i1, . . . , ip and l1, . . . , lr-th positions. Then w(X) = p + q, w(Y) = p + r and w(X + Y) = q + r = w(X) + w(Y) − 2p. The proof is now clear.
Let C be an [n, k]-linear code over GF(q) = F. The standard array decoding scheme requires storage of the q^n vectors of F^n and also comparisons of a received vector with the coset leaders. The number of such comparisons is at most q^(n−k), the number of distinct cosets in the standard array. Hence any method that makes a sizeable reduction in storage and in the number of comparisons is to be welcomed. One such method is given by the syndrome-decoding scheme.
Definition 7.9.1:
The syndrome of a vector Y ∈ F^n with respect to a linear [n, k]-code over F with parity-check matrix H is the vector HY^t.
Theorem 7.9.2:
Two vectors of F n belong to the same coset in the standard array decompo-
sition of a linear code C iff they have the same syndrome.
Theorem 7.9.2 shows that the syndromes of all the vectors of F^n are determined by the syndromes of the coset leaders of the standard array of C. In case C is an [n, k]-binary linear code, there are 2^(n−k) cosets and therefore the number of distinct syndromes is 2^(n−k). Hence, in contrast to standard-array decoding, it is enough to store 2^(n−k) vectors (instead of 2^n vectors) in syndrome decoding. For instance, if C is a [100, 30]-binary linear code, it is enough to store the 2^70 syndromes instead of the 2^100 vectors of Z2^100, a considerable saving.
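For the Hamming code H3, syndrome decoding is particularly simple: the syndrome of a received vector with a single error equals the column of H in which the error occurred. A minimal sketch (the helper names are ours):

```python
# Single-error syndrome decoding for the [7,4] Hamming code H3,
# using the parity-check matrix H given earlier.
H = [(1, 1, 1, 0, 1, 0, 0),
     (1, 1, 0, 1, 0, 1, 0),
     (1, 0, 1, 1, 0, 0, 1)]

def syndrome(y):
    return tuple(sum(h * b for h, b in zip(row, y)) % 2 for row in H)

def decode(y):
    s = syndrome(y)
    if s == (0, 0, 0):
        return list(y)              # no error detected
    # the syndrome matches the column of H where the error occurred
    cols = [tuple(row[i] for row in H) for i in range(7)]
    corrected = list(y)
    corrected[cols.index(s)] ^= 1
    return corrected

# Flip one bit of the codeword (1000111) and recover it.
sent = [1, 0, 0, 0, 1, 1, 1]
received = sent[:]
received[2] ^= 1
```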
7.10 Exercises
1. Find all the codewords of the binary code with generator matrix
[ 1 0 1 1 1
  1 0 0 1 1 ].
Find a parity-check matrix of the code. Write down the parity-check equations.
3. Decode the received vector (1100011) in H3 using (i) standard array decoding, and (ii) syndrome decoding.
4. How many vectors of Z2^7 are there in S(u, 3), where u ∈ Z2^7?
7. Show that there exists a set of eight binary vectors of length 6 such
that the distance between any two of them is at least 3.
9. Show that the function d(X, Y ) defined in Section 7.4 is indeed a metric.
10. Show that a linear code of minimum distance d can detect at most ⌊d/2⌋ errors.
Chapter 8
Cryptography
8.1 Introduction
Each message unit is converted into a number using modular arithmetic, and the transformations are then carried out on this set of numbers.
An enciphering transformation f converts a plaintext message unit P (given by its corresponding number) into a number that represents the corresponding ciphertext message unit C, while its inverse, the deciphering transformation, does the opposite by taking C back to P. We assume that there is a 1–1 correspondence between the set of all plaintext units and the set of all ciphertext units; hence each plaintext unit gives rise to a unique ciphertext unit and vice versa. Symbolically,
P --f--> C --f^(-1)--> P.
A B C D E F G H I J K L M
0 1 2 3 4 5 6 7 8 9 10 11 12
N O P Q R S T U V W X Y Z
13 14 15 16 17 18 19 20 21 22 23 24 25
Figure 8.1:
Figure 8.1 gives the 1–1 correspondence between the characters A to Z and the numbers 0 to 25. For example, the word “OKAY” corresponds to the number sequence “(14) (10) (0) (24)”, and this gets transformed, by eqn. (8.1), to “(17) (13) (3) (1)”, so the corresponding ciphertext is “RNDB”. The deciphering transformation applied to “RNDB” then gives back the message “OKAY”.
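The “OKAY” to “RNDB” computation can be checked with a short script. From the numbers above, the transformation behaves as the shift C ≡ P + 3 (mod 26); this is our reading, since eqn. (8.1) itself is not reproduced in this excerpt.

```python
# Caesar-style shift cipher on the alphabet A = 0, ..., Z = 25.
def caesar_encrypt(text, shift=3):
    return "".join(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")) for ch in text)

def caesar_decrypt(text, shift=3):
    return caesar_encrypt(text, -shift)

# caesar_encrypt("OKAY") -> "RNDB"; note Y = 24 wraps around to B = 1.
```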
where a and b are in the ring Z27, a ≠ 0 and (a, 27) = 1. In eqn. (8.3), P and C denote a pair of corresponding plaintext and ciphertext units. The Extended Euclidean Algorithm [?] ensures that, as (a, 27) = 1, a has a unique inverse a^(-1) (mod 27). As an example with message units of length 2, consider the affine transformation
C ≡ 4P + 2 (mod 27^2).
Further, as (4, 27^2) = 1, 4 has a unique inverse (mod 27^2); in fact 4^(-1) = −182, since 4 · (−182) ≡ 1 (mod 27^2). This, when substituted in congruence (8.4), gives the deciphering transformation P ≡ −182(C − 2) (mod 27^2).
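The enciphering and deciphering maps of this example can be sketched as follows; note that 4^(-1) = −182 ≡ 547 (mod 729), since 4 · 547 = 2188 = 3 · 729 + 1.

```python
# The digraph affine system C = 4P + 2 (mod 27^2), deciphered with
# 4^(-1) = 547 (mod 729).
MOD = 27**2  # 729 possible message-unit values

def affine_encrypt(p, a=4, b=2):
    return (a * p + b) % MOD

def affine_decrypt(c, a_inv=547, b=2):
    return (a_inv * (c - b)) % MOD
```

Decryption inverts encryption for every one of the 729 message units.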
In the Caesar cryptosystem and the affine cryptosystem, the keys are known to the sender and the receiver in advance. That is to say, whatever information the sender has with regard to his encryption is shared by the receiver. For this reason, these cryptosystems are called private key cryptosystems.
Suppose an intruder I (that is, a person other than the sender A and the receiver B) who has no knowledge of the private keys wants to hack the message, that is, decipher it stealthily. We may suppose that the type of cryptosystem used by A and B, including the unit length, though not the keys, is known to I. Such information may get leaked out over a passage of time, or may even be obtained by spying. How does I go about hacking the message? He does it by a method known as frequency analysis.
Assume for a moment that the message units are of length 1. Look at a long string of the ciphertext and find the most repeated character, the next most repeated character, and so on. Suppose, for the sake of precision, they are U, V, X, . . . . Now in the English language, the most common characters of the alphabet of 27 letters, consisting of the characters A to Z and “space”, are known to be “space” and E. Then “space” and E of the plaintext correspond to U and V of the ciphertext respectively. If the cryptosystem used is the affine system given by the equation
C = aP + b (mod 27),
Subtraction yields
22a ≡ −1 (mod 27) (8.5)
a unique nonnegative integer less than 27^2. Suppose the frequency analysis of the ciphertext reveals that the most commonly occurring ordered pairs are “CB” and “DX”, in decreasing order of frequency. The decryption transformation is of the form
P ≡ a′C + b′ (mod 27^2).
Here a and b are the enciphering keys and a′, b′ are the deciphering keys. Now it is known that in the English language the most frequently occurring ordered pairs, in decreasing order of frequency, are “E(space)” and “S(space)”. Symbolically,
“E(space)” −→ CB, and
“S(space)” −→ DX.
Subtraction gives
As (50, 729) = 1, this congruence has a unique solution by the Extended Euclidean Algorithm [?]. In fact, a′ = ?????????? and therefore b′ = ??????????.
Thus the deciphering keys a′ and b′ have been determined and the cryptosystem has been hacked.
In our case, gcd(50, 729) happened to be 1, and hence we had no problem in determining the deciphering key. If not, we would have to try all the possible solutions for a′ and take the plaintext that is meaningful. Instead, we can also continue with the frequency analysis, compare the next most repeated ordered pairs in the plaintext and ciphertext, obtain a third congruence, and try for a solution in conjunction with one or both of the earlier congruences. If these also fail, we may have to adopt ad hoc techniques to determine a′ and b′.
Assume once again that the message units are ordered pairs in the same alphabet of size 27 of Section 8.2. We can use 2 by 2 matrices over the ring Z27 to set up a private key cryptosystem in this case. In fact, if A is any 2 by 2 matrix with entries from Z27, and (X, Y) is any plaintext unit, we encipher it as B = A [X, Y]^t, where B is again a 2 by 1 matrix and therefore a ciphertext unit of length 2. If B = [X′, Y′]^t, we have the equations
[X′, Y′]^t = A [X, Y]^t and [X, Y]^t = A^(-1) [X′, Y′]^t. (8.8)
The first equation of (8.8) gives the encryption while the second gives the decryption. Notice that A^(-1) must be taken in Z27. For A^(-1) to exist, we must have (det A, 27) = 1. If this is not the case, we may again have to resort to ad hoc methods.
As an example, take A = [2 1; 4 3] (rows separated by semicolons). Then det A = 2, and (det A, 27) = (2, 27) = 1. Hence 2^(-1) (mod 27) exists and 2^(-1) = 14 ∈ Z27. This gives
A^(-1) = 14 [3 −1; −4 2] = [42 −14; −56 28] = [15 13; 25 1] over Z27. (8.9)
Suppose, for instance, we want to encipher “HEAD” using the above matrix transformation. We proceed as follows: “HE” corresponds to the vector [7, 4]^t, and “AD” to the vector [0, 3]^t. Hence the enciphering transformation gives the corresponding ciphertext as
A [7, 4]^t = [18, 40]^t ≡ [18, 13]^t (mod 27), corresponding to “SN”, and
A [0, 3]^t = [3, 9]^t (mod 27), corresponding to “DJ”.
Thus the ciphertext is “SNDJ”. Deciphering proceeds in exactly the same manner by taking A^(-1) in Z27. This gives the plaintext A^(-1) [18, 13]^t, A^(-1) [3, 9]^t, where A^(-1) = [15 13; 25 1], as given by (8.9).
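The whole computation can be reproduced in a few lines (a sketch; the helper apply_mod27 is our own):

```python
# The 2-by-2 matrix cryptosystem over Z27, with A and A^(-1) as in (8.9).
A     = [[2, 1], [4, 3]]
A_inv = [[15, 13], [25, 1]]

def apply_mod27(M, v):
    # multiply the 2-by-2 matrix M by the column vector v over Z27
    return [(M[0][0] * v[0] + M[0][1] * v[1]) % 27,
            (M[1][0] * v[0] + M[1][1] * v[1]) % 27]

# "HE" = (7, 4) and "AD" = (0, 3), as in the worked example above.
cipher = [apply_mod27(A, [7, 4]), apply_mod27(A, [0, 3])]   # [[18, 13], [3, 9]]
plain  = [apply_mod27(A_inv, c) for c in cipher]            # back to [[7, 4], [0, 3]]
```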
8.4 Exercises
1. Find the inverse of A = [17 5; 8 7] in Z27.
2. Find the inverse of A = [12 3; 5 17] in Z29.
3. Solve the system of congruences
x − y ≡ 4 (mod 26)
7x − 4y ≡ 10 (mod 26).
(space) C ? Y C F ! Q, T W I U M H Q V.
Suppose we know by some means that the last four letters of the plain-
text are our adversary’s signature “MIKE”. Determine the full plain-
text.
In this cipher, the plaintext is in the English alphabet. The key consists of an ordered set of d letters for some fixed positive integer d. The plaintext is divided into message units of length d, and the ciphertext is obtained by adding the key to each message unit using modulo 26 addition.
For example, let d = 3 and the key be XYZ. If the message is “ABANDON”, the ciphertext is obtained by taking the numerical equivalents of the plaintext, namely,
(0) (1) (0) (13) (3) (14) (14) (13),
and adding to them, modulo 26, the numerical equivalents (23) (24) (25) of the key, repeated cyclically. This yields
(23) (25) (25) (36) (27) (39) (37) (37) (mod 26) = “X Z Z K B N L L”
as the ciphertext.
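The cipher can be sketched as follows (letter-by-letter key addition with A = 0, . . . , Z = 25, matching the text; the function names are ours):

```python
# Vigenere cipher: add the key, repeated cyclically, modulo 26.
def vigenere_encrypt(plain, key):
    k = [ord(c) - ord("A") for c in key]
    return "".join(chr((ord(c) - ord("A") + k[i % len(k)]) % 26 + ord("A"))
                   for i, c in enumerate(plain))

def vigenere_decrypt(cipher, key):
    k = [ord(c) - ord("A") for c in key]
    return "".join(chr((ord(c) - ord("A") - k[i % len(k)]) % 26 + ord("A"))
                   for i, c in enumerate(cipher))
```

Decryption simply subtracts the same key, so any message round-trips exactly.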
C ≡ M + K (mod 26)
Notwithstanding the fact that the key K is as long as the message M , the
system has its own drawbacks.
(ii) The long private key K must be communicated to the receiver in advance.
All cryptosystems described so far are private key cryptosystems. This means that someone who has enough information to encipher messages also has enough information to decipher messages. As a result, in private key cryptography, any two persons in a group who want to communicate secretly must have exchanged keys in a safe way (for instance, through a trusted courier).
In 1976, the face of cryptography got altered radically with the invention
of public key cryptography by Diffie and Hellman [?]. In this cryptosystem,
the encryption can be done by any one. But the decryption can be done only
by the intended recipient who alone is in possession of a secret key.
At the heart of this cryptography is the concept of a “one-way function”. Roughly speaking, a one-way function is a 1–1 function f such that, whenever k is given, it is possible to compute f(k) “rapidly”, while it is “extremely difficult” to compute the inverse of f in a “reasonable” amount of time. There is no way of asserting once and for all that a given function is a one-way function, since the computations depend on the technology of the day: the hardware and the software. So what passes for a one-way function today may fail to be one a few years hence.
As an example of a one-way function, consider two large primes p and q, each having at least 500 digits. It is “easy” to compute their product n = pq. However, given n, there is no efficient factoring algorithm known to date that would yield p and q in a reasonable amount of time. Forming the product pq with p and q having 100 digits passed for a one-way function in the 1980’s, but is no longer so today.
(PA ◦ SA ) M = M = (SA ◦ PA ) M.
Transmission of Messages
Digital Signature
We now describe two public key cryptosystems. The first is RSA, named after its inventors, Rivest, Shamir and Adleman. Diffie and Hellman, though they invented public key cryptography in 1976, did not give a procedure to implement it; Rivest, Shamir and Adleman did so in 1978, two years later.
Description of RSA
2. Each user A chooses a small positive integer e, 1 < e < φ(n), such that (e, φ(n)) = 1, where (the Euler function) φ(n) = φ(pq) = φ(p)φ(q) = (p − 1)(q − 1). (e is odd, as φ(n) is even.)
3. A computes an integer d such that ed ≡ 1 (mod φ(n)).
4. A (Alice) gives the ordered pair (n, e) as her public key and keeps d as
her private (secret) key.
S(M′) ≡ M′^d (mod n). (8.11)
Thus both P and S (of A) act on the ring Zn. Before we establish the correctness of RSA, we observe that d (which is computed using the Extended Euclidean Algorithm) can be computed in O(log^3 n) time. Further, the powers M^e and M′^d modulo n in eqns. (8.10) and (8.11) can also be computed in O(log^3 n) time [?]. Thus all computations in RSA can be done in polynomial time.
Proof. We have to show that
M^(ed) ≡ M (mod n).
Since
ed ≡ 1 (mod φ(n)),
we can write, for some integer k,
M^(ed) = M^(1+k(p−1)(q−1)) = M · M^(k(p−1)(q−1)).
If p ∤ M, Fermat’s Little Theorem gives M^(p−1) ≡ 1 (mod p), and therefore
M^(ed) ≡ M (mod p); (8.12)
this also holds trivially when p | M. Similarly,
M^(ed) ≡ M (mod q). (8.13)
As p and q are distinct primes, the congruences (8.12) and (8.13) imply that
M^(ed) ≡ M (mod pq) ≡ M (mod n).
The above description shows that if Bob wants to send the message M to Alice, he will send it as M^e (mod n), using the public key of Alice. To decipher, Alice raises this number to the power d and gets M^(ed) ≡ M (mod n), the original message sent by Bob.
The security of RSA rests on the supposition that no one other than Alice can determine her private key d. A person can compute d if he/she knows φ(n) = (p − 1)(q − 1) = n − (p + q) + 1, that is to say, if he/she knows the sum p + q; and for this, he/she should know the factors p and q of n. Thus, in essence, the security of RSA is based on the assumption that factoring a large number n that is a product of two distinct primes is “difficult”. However, to quote Koblitz [?], “no one can say with certainty that breaking RSA requires factoring n. In fact, there is even some indirect evidence that breaking the RSA cryptosystem might not be quite as hard as factoring n. RSA is the public key cryptosystem that has had by far the most commercial success. But, increasingly, it is being challenged by elliptic curve cryptography”.
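A toy instance of RSA with small primes illustrates the key setup and the encryption-decryption round trip; real keys use primes of hundreds of digits, and the values p, q, e below are purely illustrative.

```python
# A toy RSA key pair and one encryption/decryption round trip.
p, q = 61, 53
n = p * q                      # 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, (e, phi) = 1
d = pow(e, -1, phi)            # private exponent with e*d = 1 (mod phi); d = 2753

M = 65                         # a message unit in Z_n
C = pow(M, e, n)               # encryption: C = M^e (mod n)
recovered = pow(C, d, n)       # decryption: C^d = M^(ed) = M (mod n)
```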
We have seen that RSA is based on the premise that factoring a very large integer which is a product of two “large” primes p and q is “difficult” compared to forming their product pq. In other words, given p and q, forming their product is a one-way function. The ElGamal public key cryptosystem uses a different one-way function, namely, one that computes powers of an element of a large finite group G: given G, g ∈ G, g ≠ e, and a positive integer a, the ElGamal cryptosystem is based on the assumption that computing g^a = b ∈ G is “easy”, while, given b ∈ G and g ∈ G, it is “difficult” to recover the exponent a.
Definition 8.6.2:
Let G be a finite group and b ∈ G. If y ∈ G, then the discrete logarithm of y
with respect to base b is any non-negative integer x less than o(G), the order
of G, such that bx = y, and we write logb y = x.
As per the definition, logb y may or may not exist. However, if we take
G = Fq∗ , the group of nonzero elements of a finite field Fq of q elements and
g, a generator of the cyclic group Fq∗ (See [?]), then for any y ∈ Fq∗ , the
discrete logarithm logg y exists.
Example 8.6.3:
5 is a generator of F17*. In F17*, the discrete logarithm of 12 with respect to base 5 is 9; in symbols, log5 12 = 9. In fact, in F17*,
⟨5⟩ = {5^1 = 5, 5^2 = 8, 6, 13, 14, 2, 10, 5^8 = −1, 12, 9, 11, 4, 3, 15, 7, 5^16 = 1}.
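The listing of ⟨5⟩ can be recomputed directly:

```python
# Powers of 5 in F_17^*, recording the exponent of each element reached.
powers = {}
x = 1
for k in range(1, 17):
    x = (x * 5) % 17
    powers[x] = k          # now x = 5^k (mod 17)

dlog_12 = powers[12]       # the discrete logarithm log_5 12
```

Every nonzero residue mod 17 appears exactly once among the powers, confirming that 5 is a generator.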
The ElGamal system works in the following way. All the users in the system agree to work in an already chosen large finite field Fq. A generator g of Fq* is fixed once and for all. Each message unit is then converted into a number in Fq. For instance, if the alphabet is the set of English characters and if each message unit is of length 3, then the message unit BCD will have the numerical equivalent 26^2 · 1 + 26 · 2 + 3 (mod q). It is clear that in order that these numerical equivalents of the message units are all distinct, q should be quite large; in our case, q ≥ 26^3. Now each user A in the system randomly chooses an integer a = aA, 0 < a < q − 1, and keeps it as his or her secret key. A declares g^a ∈ Fq as his public key.
If B wants to send the message unit M to A, he chooses a random positive integer k, k < q − 1, and sends the ordered pair
(g^k, M g^(ak)). (8.14)
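A toy run of the scheme, with a deliberately small field: the text requires q to be very large, so q = 19 with generator g = 2 is purely illustrative.

```python
import random

# A toy ElGamal exchange in F_19^* (2 is a generator of F_19^*).
q, g = 19, 2
rng = random.Random(1)

a = rng.randrange(1, q - 1)                        # Alice's secret key
public = pow(g, a, q)                              # her public key g^a

M = 12                                             # Bob's message unit in F_q^*
k = rng.randrange(1, q - 1)                        # Bob's random k
pair = (pow(g, k, q), (M * pow(public, k, q)) % q) # (g^k, M g^(ak)), as in (8.14)

# Alice recovers M by dividing the second component by (g^k)^a.
recovered = (pair[1] * pow(pow(pair[0], a, q), -1, q)) % q
```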
We have seen that the most commonly applied public key cryptosystem, namely RSA, is built upon very large prime numbers (numbers having, say, 500 digits or more). So there arises the natural question: given a large positive integer, how do we know whether or not it is prime? A “primality test” is a test that tells whether a given number is prime.
For a positive real number x, let π(x) denote the number of primes less than or equal to x. The Prime Number Theorem states that π(x) is asymptotic to x/log x; in symbols, π(x) ≈ x/log x. Here the logarithm is to the base e. Consequently, π(n) ≈ n/log n or, equivalently, π(n)/n ≈ 1/log n. In other words, in order to find a 100-digit prime, one has to examine roughly log_e 10^100 ≈ 230 randomly chosen 100-digit numbers for primality. (This figure may drop by half if we omit even numbers.)
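The estimate quoted above is a one-line computation:

```python
import math

# Expected number of random 100-digit candidates to examine before
# hitting a prime: ln(10^100) = 100 ln 10, roughly 230.
expected_trials = 100 * math.log(10)
half = expected_trials / 2        # skipping even numbers halves the count
```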
Fermat’s Little Theorem (FLT) states that if n is prime, then for each a, 1 ≤ a ≤ n − 1,
a^(n−1) ≡ 1 (mod n). (8.15)
Note that for any given a, a^(n−1) (mod n) can be computed in polynomial time using the repeated squaring method [?]. However, the converse of Fermat’s Little Theorem is not true. This is because of the presence of Carmichael numbers. A Carmichael number is a composite number n satisfying (8.15) for each a prime to n. Carmichael numbers are sparse, but there are infinitely many of them (??????????). The first few are 561, 1105, 1729.
Since we are interested in checking whether a given large number n is prime, n is certainly odd and hence (2, n) = 1. Consequently, if 2^(n−1) ≢ 1 (mod n), we can conclude with certainty, in view of FLT, that n is composite. However, if 2^(n−1) ≡ 1 (mod n), n may or may not be prime; if it is not a prime, then it is a pseudoprime to base 2.
Definition 8.7.1:
n is called a pseudoprime to base a, where (a, n) = 1, if
(i) n is composite, and
(ii) a^(n−1) ≡ 1 (mod n).
But then there is a chance that n is not a prime. How often does this happen? For n < 10,000, there are only 22 pseudoprimes to base 2; they are 341, 561, 645, 1105, . . . . Using better estimates due to Carl Pomerance (see [?]), we can conclude that the chance that a randomly chosen 50-digit (resp. 100-digit) number satisfies (8.15) but fails to be a prime is < 10^(−6) (resp. < 10^(−13)).
More generally, if (a, n) = 1, 1 < a < n, the pseudoprime test with respect to base a checks whether a^(n−1) ≢ 1 (mod n). If so, n is composite; if not, n may be a prime.
If we were to try every a with (a, n) = 1, we might have to examine φ(n) base values in the worst case. Instead, the Miller–Rabin test works as follows:
(i) It tries several randomly chosen base values a instead of just one.
(ii) While computing each modular exponentiation a^(n−1) (mod n), it stops as soon as it notices a nontrivial square root of 1 (mod n) and outputs COMPOSITE.
We now proceed to present the pseudocode for the Miller–Rabin test. The
code uses an auxiliary procedure WITNESS such that WITNESS (a, n) is
TRUE iff a is a “witness” to the compositeness of n. We now present and
justify the construction of “WITNESS”.
WITNESS (a, n)
1  let (bk , bk−1 , . . . , b0 ) be the binary representation of n − 1
2  d ←− 1
3  for i ←− k downto 0
4    do x ←− d
5       d ←− d · d (mod n)
6       if d = 1 and x ≠ 1 and x ≠ n − 1
7         then return TRUE
8       if bi = 1
9         then d ←− d · a (mod n)
10 if d ≠ 1
11   then return TRUE
12 return FALSE
MILLER-RABIN (n, s)
1 for j ←− 1 to s
2   do a ←− RANDOM (1, n − 1)
3      if WITNESS (a, n)
4        then return COMPOSITE ⊲ Definitely
5 return PRIME ⊲ Almost surely
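A direct Python transcription of the two procedures (variable names are ours):

```python
import random

def witness(a, n):
    """Return True iff a proves n composite."""
    d = x = 1
    for bit in bin(n - 1)[2:]:            # bits of n-1, most significant first
        x = d
        d = (d * d) % n
        if d == 1 and x != 1 and x != n - 1:
            return True                   # nontrivial square root of 1 found
        if bit == "1":
            d = (d * a) % n
    return d != 1                         # Fermat test: is a^(n-1) = 1 (mod n)?

def miller_rabin(n, s=20, rng=random.Random(0)):
    if n < 4:
        return n in (2, 3)
    for _ in range(s):
        a = rng.randrange(1, n - 1)
        if witness(a, n):
            return False                  # definitely composite
    return True                           # almost surely prime
```

The Carmichael number 561 defeats the plain Fermat test, but Miller-Rabin detects it via a nontrivial square root of 1.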
8.8.1 Introduction
In 2002, M. Agrawal, N. Kayal and N. Saxena of the Indian Institute of Technology, Kanpur, India made the sensational revelation that they had found a polynomial time algorithm for primality testing. Their algorithm works in Õ(log^7.5 n) time. (Recall from ?????????? that Õ(f(n)) stands for O(f(n) · (polynomial in log f(n))).) It is based on a generalization of Fermat’s Little Theorem to polynomial rings over finite fields. Notably, the correctness proof of their algorithm requires only simple tools of algebra. In the following section, we present the details of the AKS algorithm in ???
The AKS algorithm is based on the following identity for prime numbers, which is a generalization of Fermat’s Little Theorem.
Lemma 8.8.1:
Let a ∈ Z, n ∈ N, n ≥ 2, and (a, n) = 1. Then n is prime if and only if
(X + a)^n ≡ X^n + a (mod n). (8.16)
Proof. We have
(X + a)^n = X^n + Σ_{i=1}^{n−1} C(n, i) X^(n−i) a^i + a^n,
where C(n, i) denotes the binomial coefficient. If n is prime, each C(n, i), 1 ≤ i ≤ n − 1, is divisible by n. Further, as (a, n) = 1, Fermat’s Little Theorem gives a^n ≡ a (mod n). This establishes (8.16).
If n is composite, then n has a prime factor q < n. Let q^k || n (that is, q^k | n but q^(k+1) ∤ n). Now consider the term C(n, q) X^(n−q) a^q in the expansion of (X + a)^n, where C(n, q) is the binomial coefficient
C(n, q) = n(n − 1) · · · (n − q + 1) / (1 · 2 · · · q).
Then q^k ∤ C(n, q): if q^k | C(n, q), then, since the denominator contributes one factor q and q^k || n, the product n(n − 1) · · · (n − q + 1) would have to be divisible by q^(k+1); but none of n − 1, . . . , n − q + 1 is divisible by q, a contradiction. Hence q^k, and therefore n, does not divide the coefficient of the term C(n, q) X^(n−q) a^q. This shows that (X + a)^n − (X^n + a) is not identically zero over Zn.
The above identity suggests a simple test for primality: given input n,
choose an a and test whether the congruence (8.16) is satisfied. However, this takes time Ω(n) because we need to evaluate n coefficients in the LHS
in the worst case. A simple way to reduce the number of coefficients is to
evaluate both sides of (8.16) modulo a polynomial of the form X r − 1 for an
appropriately chosen small r. In other words, test if the following equation
is satisfied:
(X + a)^n = X^n + a (mod X^r − 1, n)    (8.17)
From Lemma 8.8.1, it is immediate that all primes n satisfy eqn. (8.17) for
all values of a and r. The problem now is that some composites n may also
satisfy the eqn. (8.17) for a few values of a and r (and indeed they do).
However, we can almost restore the characterization: we show that for an
appropriately chosen r if the eqn. (8.17) is satisfied for several a’s, then n
must be a prime power. It turns out that the number of such a’s and the
appropriate r are both bounded by a polynomial in log n, and this yields a
deterministic polynomial time algorithm for testing primality.
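The congruence (8.17) can be tested directly by representing polynomials in Z_n[X]/(X^r − 1) as length-r coefficient lists and using repeated squaring; a small Python sketch (function names ours; this is only the congruence test, not the full AKS algorithm):

```python
def polymul(f, g, r, n):
    """Product of two polynomials modulo (X^r - 1, n); f, g are length-r lists."""
    h = [0] * r
    for i, fi in enumerate(f):
        if fi:
            for j, gj in enumerate(g):
                h[(i + j) % r] = (h[(i + j) % r] + fi * gj) % n
    return h

def congruence_8_17(a, n, r):
    """Test whether (X + a)^n = X^n + a (mod X^r - 1, n); assumes r >= 2."""
    base = [0] * r
    base[0], base[1] = a % n, 1          # the polynomial X + a
    lhs = [0] * r
    lhs[0] = 1                           # start from the constant polynomial 1
    e = n
    while e:                             # repeated squaring
        if e & 1:
            lhs = polymul(lhs, base, r, n)
        base = polymul(base, base, r, n)
        e >>= 1
    rhs = [0] * r
    rhs[0] = a % n
    rhs[n % r] = (rhs[n % r] + 1) % n    # X^n + a, reduced mod X^r - 1
    return lhs == rhs
```

For a prime such as n = 7 the congruence holds for every a and r, as Lemma 8.8.1 guarantees; for n = 6, a = 1, r = 5 it fails.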
Fp denotes the finite field with p elements, where p is a prime. Recall that if p
is prime and h(x) is a polynomial of degree d irreducible over Fp , then Fp [X]/
(h(X)) is a finite field of order p^d. We will use the notation f(X) = g(X) (mod h(X), n) to represent the equation f(X) = g(X) in the ring Z_n[X]/(h(X)); that is, if the coefficients of f(X), g(X) and h(X) are reduced modulo n, then h(X) divides f(X) − g(X).
As mentioned earlier, for any function f(n) of n, Õ(f(n)) stands for O(f(n) · (polynomial in log f(n))). For example,
Õ(log^k n) = O(log^k n · poly(log log^k n))
           = O(log^k n · poly(log log n))
           = O(log^{k+ǫ} n) for any ǫ > 0.
Lemma 8.8.2:
Let lcm(m) denote the lcm of the first m natural numbers. Then for m ≥ 7,
lcm(m) ≥ 2^m.
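The bound is easy to spot-check numerically (a sketch, using Python's math.gcd):

```python
from math import gcd

def lcm_upto(m):
    """lcm(m): least common multiple of 1, 2, ..., m."""
    l = 1
    for i in range(2, m + 1):
        l = l * i // gcd(l, i)
    return l

# Lemma 8.8.2 asserts lcm(m) >= 2^m for m >= 7; check a range of values.
for m in range(7, 40):
    assert lcm_upto(m) >= 2 ** m
```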
Theorem 8.8.3:
The AKS algorithm returns PRIME iff n is prime.
Lemma 8.8.4:
If n is prime, then the AKS algorithm returns PRIME.
Proof. If n is prime, we have to show that AKS will not return COMPOSITE
in steps 1, 3 and 5. Certainly, the algorithm will not return COMPOSITE
in step 1. Also, if n is prime, there exists no a such that 1 < (a, n) < n, so
that the algorithm will not return COMPOSITE in step 3. By Lemma 8.8.1,
the for loop in step 5 cannot return COMPOSITE. Hence the algorithm will
identify n as PRIME either in step 4 or in step 6.
We now consider the steps when the algorithm returns PRIME, namely,
steps 4 and 6. Suppose the algorithm returns PRIME in step 4. Then n
must be prime. If n were composite, n = n1 n2 , where 1 < n1 , n2 < n.
Lemma 8.8.5:
There exists an r ≤ 16 log^5 n + 1 such that O_r(n) > 4 log^2 n.
Proof. Let r_1, . . . , r_t be all the numbers such that O_{r_i}(n) ≤ 4 log^2 n for each i; therefore r_i divides α_i = n^{O_{r_i}(n)} − 1 for each i. Now for each i, α_i divides the product

P = \prod_{i=1}^{⌊4 log^2 n⌋} (n^i − 1) < n^{16 log^4 n} = (2^{log n})^{16 log^4 n} = 2^{16 log^5 n}.

(Note that we have used the fact that \prod_{i=1}^{t} (n^i − 1) < n^{t^2}, the proof of which follows readily by induction on t.) As r_i divides α_i and α_i divides P for each i, 1 ≤ i ≤ t, the lcm of the r_i's also divides P. Hence (lcm of the r_i's) < 2^{16 log^5 n}. However, by Lemma 8.8.2,

lcm(1, 2, . . . , ⌈16 log^5 n⌉) ≥ 2^{⌈16 log^5 n⌉}.

Hence there must exist a number r in {1, 2, . . . , ⌈16 log^5 n⌉}, that is, r ≤ 16 log^5 n + 1, such that O_r(n) > 4 log^2 n.
Definition 8.8.6:
For a polynomial f(X) and a number m ∈ N, m is said to be introspective for f(X) if
[f(X)]^m = f(X^m) (mod X^r − 1, p).
It is clear from eqns. (8.18) and (8.19) that both n and p are introspective
for X + a, 1 ≤ a ≤ l. Our next lemma shows that introspective numbers are
closed under multiplication.
Lemma 8.8.7:
If m and m′ are introspective numbers for f (X), then so is mm′ .
Proof. Since m is introspective for f(X),
f(X^m) = [f(X)]^m (mod X^r − 1, p),
and hence [f(X^m)]^{m′} = [f(X)]^{mm′} (mod X^r − 1, p).    (8.20)
Next we show that for a given number m, the set of polynomials for which
m is introspective is closed under multiplication.
Lemma 8.8.8:
If m is introspective for both f (X) and g(X), then it is also introspective for
the product f (X)g(X).
Eqns. (8.18) and (8.19) together imply that both n and p are introspective
for (X + a). Hence by Lemmas 8.8.7 and 8.8.8, every number in the set
I = {n^i p^j : i, j ≥ 0} is introspective for every polynomial in the set

P = { \prod_{a=1}^{l} (X + a)^{e_a} : e_a ≥ 0 }.

We now define two groups based on the sets I and P that will play a crucial role in the proof.
The first group consists of the set G of all residues of numbers in I modulo r. Since both n and p are prime to r, so is any number in I. Hence G ⊂ Z_r^*, the multiplicative group of residues mod r that are relatively prime to r. It is easy to check that G is a group. (The only thing that requires verification is that n^i p^j has a multiplicative inverse in G. Since n^{O_r(n)} ≡ 1 (mod r), there exists i′, 0 ≤ i′ < O_r(n), with n^i ≡ n^{i′} (mod r); hence the inverse of n^i (= n^{i′}) is n^{O_r(n) − i′}. A similar argument applies for p, as p^{O_r(p)} ≡ 1 (mod r); p, being a prime divisor of n, satisfies (p, r) = 1.) Let |G| = the order of the group G = t (say). As G is generated by n and p modulo r and since O_r(n) > 4 log^2 n, t > 4 log^2 n.
To define the second group, we need some basic facts about cyclotomic polynomials over finite fields. Let Q_r(X) be the r-th cyclotomic polynomial over the field F_p. Then Q_r(X) divides X^r − 1 and factors into irreducible factors of the same degree d = O_r(p). Let h(X) be one such irreducible factor of degree d. Then F = F_p[X]/(h(X)) is a field. The second group that we want to consider is the group generated by X + 1, X + 2, . . . , X + l in the multiplicative group F^* of nonzero elements of the field F. Hence it consists simply of the residues of polynomials in P modulo h(X) and p. Denote this group by 𝒢.
We claim that the order of 𝒢 is exponential in either t = |G| or l.
Lemma 8.8.9:
|𝒢| ≥ min(2^l − 1, 2^t).
Proof. First note that h(X) | Qr (X) and Qr (X) | (X r − 1). Hence X may be
taken as a primitive r-th root of unity in F = Fp [X]/(h(X)).
We claim that (*) if f(X) and g(X) are polynomials of degree less than t and if f(X) ≠ g(X) in P, then their images in F (obtained by reducing the coefficients modulo p and then reducing modulo h(X)) are distinct. To see
this, assume that f (X) = g(X) in the field F (that is, the images of f (X)
and g(X) in the field F are the same). Let m ∈ I. Recall that every number
of I is introspective with respect to every polynomial in P . Hence m is
introspective with respect to both f (X) and g(X). This means that
Finally we show that if n is not a prime power, then |𝒢| is bounded above by an exponential function of t = |G|.
Lemma 8.8.10:
If n is not a prime power, |𝒢| ≤ (1/2) n^{2√t}.
two distinct numbers in Î which become equal when reduced modulo r. Let them be m1, m2 with m1 > m2. So we have (since r divides m1 − m2)

X^{m1} = X^{m2} (mod X^r − 1).    (8.22)

Let f(X) ∈ P. Then
[f(X)]^{m1} = f(X^{m1}) (mod X^r − 1, p)
            = f(X^{m2}) (mod X^r − 1, p)    by (8.22)
            = [f(X)]^{m2} (mod X^r − 1, p).
Thus every element of 𝒢 is a root of the polynomial
Q1(Y) = Y^{m1} − Y^{m2} over F.
Thus there are at least |𝒢| distinct roots of Q1(Y) in F. Naturally, |𝒢| ≤ degree of Q1(Y). Now the degree of Q1(Y)
= m1 (as m1 > m2)
Lemma 8.8.9 gives a lower bound for |G| while Lemma 8.8.10 gives an
upper bound for |G|. These bounds enable us to prove the correctness of the
algorithm.
Chapter 9
Finite Automata
9.1 Introduction
and elegant. In the 1970s, with the development of the UNIX operating system, practical applications of finite automata appeared—in lexical analysis (lex), in text searching (grep) and in Unix-like utilities (awk). Extensions to finite automata have been proposed in the literature. For example, a Büchi automaton is one that accepts infinite input sequences.
We begin with the notion of languages, which is freely used in later discussions.
We then introduce regular expressions. Subsequent sections deal with finite
automata and their properties.
9.2 Languages
then,
{a}⋆ = {ǫ, a, aa, aaa, aaaa, . . .}.
Consider the alphabet A comprising both upper and lower case of the 26 letters of the English alphabet as well as the punctuation marks, the blank (written ␣ here) and the question mark. That is,
A = {a, b, . . . , A, B, . . . , ␣, ?}
The sentence
This␣is␣a␣sentence.
is a sequence in A⋆. Similarly, the sentence
sihT␣si␣ynnuf.
is a sequence in A⋆.
The objective in defining a language is to selectively pick certain sequences
of Σ⋆ , that make sense in some way or that satisfy some property.
Definition 9.2.1:
Let Σ be a finite set, the alphabet. A language over Σ is a subset of the set
Σ⋆ .
Note that Σ⋆, φ and Σ each denote a language! It is not difficult to reason that since Σ is finite, Σ⋆ is a countably infinite set (the basic idea is this: for each k ≥ 0, all strings of length k are enumerated before all strings of length k + 1; strings of length exactly k are enumerated lexicographically, once we fix some ordering of the symbols in Σ).
Since languages are sets, they can be combined by union, intersection and
difference. Given a language A we also define the complement in Σ⋆ as
Ā = ∼A = {x ∈ Σ⋆ | x ∉ A}.
In other words, ∼ A is Σ⋆ \ A.
If A and B are languages over Σ, their concatenation is a language A · B
or simply AB defined as,
AB = {xy | x ∈ A and y ∈ B} .
For example, {a, b}^n is the set of strings over {a, b} of length n.
The closure or the Kleene star or the asterate A⋆ of a language A is the
set of all strings obtained by concatenating zero or more strings from A. This
is exactly equivalent to taking the union of all finite powers of A. That is
A⋆ = ⋃_{n≥0} A^n.
Similarly, A^+, the positive closure of A, is defined as A^+ = AA⋆.
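For finite languages these operations can be experimented with directly; a Python sketch (function names ours), with A⋆ approximated by the union of the first few powers:

```python
def concat(A, B):
    """Concatenation AB = {xy | x in A, y in B} of two finite languages."""
    return {x + y for x in A for y in B}

def star_upto(A, n):
    """Finite approximation of the Kleene star: the union of A^0, A^1, ..., A^n."""
    result = {''}          # A^0 = {epsilon}
    power = {''}
    for _ in range(n):
        power = concat(power, A)
        result |= power
    return result
```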
Exercise 9.2.2:
The language L is defined over the alphabet {a, b} recursively as follows:
i) ǫ ∈ L
Show that L is the set of all elements in {a, b}⋆ except those containing the
substring aab.
(ii) how to recognize whether a given string belongs to the given language?
Answers to these questions comprise much of the subject matter of the study
of automata and formal languages. Models of computation, such as finite automata, pushdown automata, Turing machines or the λ-calculus, evolved before modern computers came into existence. In parallel and independently, the formal notions of grammar and language (the Chomsky hierarchy, a hierarchy of language classification) were developed. The equivalence between languages and automata is now well understood.
Our concern here is restricted to what are called regular languages, described precisely by regular expressions (answering question (i) above). We are also concerned with the corresponding model of computation called finite automata or finite state machines (which answers question (ii) above).
Definition 9.3.1:
Regular expressions (and the corresponding languages) over Σ are exactly
those expressions that can be constructed from the following rules:
Example 9.3.2:
We write a + b⋆ c to mean (a + ((b⋆ )c)), a regular expression over {a, b, c}.
Example 9.3.3:
The regular expression (a + b)⋆ cannot be written as a + b⋆ because they
denote different languages.
Example 9.3.4:
To reason about the language represented by the regular expression (a+b)⋆ a,
we note that (a + b)⋆ denotes strings of any length (including 0) formed by
taking a or b. This is then concatenated with the symbol a. Therefore the
language consists of all possible strings from {a, b} ending in a.
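The reasoning of Example 9.3.4 can be checked with Python's re module, where the textbook union + is written | and ⋆ is *:

```python
import re

# (a + b)* a in textbook notation becomes the Python pattern (a|b)*a;
# fullmatch tests whether the whole string is in the language.
pattern = re.compile(r'(a|b)*a')

assert pattern.fullmatch('a')            # ends in a: accepted
assert pattern.fullmatch('babba')        # ends in a: accepted
assert pattern.fullmatch('ab') is None   # ends in b: rejected
assert pattern.fullmatch('') is None     # empty string: rejected
```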
Exercise 9.3.5:
Let Σ = {a, b}. Reason that
Figure 9.1: The automaton M1 (transition diagram over {0, 1} with states q0 and q1)
Figure 9.2: The automaton M2 (transition diagram over {0, 1} with states q0 and q1)
We can easily reason that M2 accepts the empty string ǫ and those that
end in 0.
We now consider the following automaton M3 which “solves” a “practical” problem: testing whether a number, given by its decimal digits, is divisible by 3.
Figure 9.3: Automaton M3, a divisibility-by-3 tester (states s, q0, q1, q2; transitions labelled by the digit classes {0, 3, 6, 9}, {1, 4, 7} and {2, 5, 8})
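M3 can be simulated by tracking the running remainder; a minimal Python sketch (it folds the start state s into q0, since both stand for remainder 0):

```python
def divisible_by_3(digits):
    """Simulate M3 on a decimal string: the state is the remainder mod 3.

    Since 10 ≡ 1 (mod 3), reading digit d takes remainder rem to (rem + d) % 3;
    the string is accepted iff the final state is q0 (remainder 0)."""
    rem = 0                      # start in q0 (the start state s is folded in)
    for ch in digits:
        rem = (rem + int(ch)) % 3
    return rem == 0

assert divisible_by_3('2043')        # 2043 = 3 * 681
assert not divisible_by_3('1000')
```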
Exercise 9.4.1:
By trial and error, design a finite automaton that precisely recognizes the
following language A:
A = {ω | ω is a string that has an equal number of occurrences of 01 and 10
as substrings}.
Definition 9.4.2:
A (deterministic) finite automaton M is a 5-tuple
M = (Q, Σ, δ, s, F ) ,
where
Q is a finite set of states
Σ is a finite alphabet
δ: Q × Σ → Q is a transition function
s ∈ Q is the start state
F ⊆ Q is the set of accept (or final) states.
Note that δ, the transition function, defines the rules for “moving” through the automaton as depicted pictorially. It can equivalently be described by a table. If M is in state q and sees input a, it moves to state δ(q, a). No move on ǫ is allowed; also, δ(q, a) is uniquely specified.
Example 9.4.3:
M4 = (Q, Σ, δ, s, F )
δ :    0    1
a :    a    b
b :    a    c
c :    a    c
δ̂ : Q × Σ⋆ → Q
Thus the state δ̂(q, x) is the state the automaton M will end up in, when
started in state q, fed the input x and allowed transitions according to δ.
Note that δ̂ and δ agree on strings of length one: δ̂(q, a) = δ(q, a) for every q ∈ Q and a ∈ Σ.
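In code, δ̂ is simply δ folded over the symbols of the input string; a sketch using the transition table of M4 from Example 9.4.3:

```python
def delta_hat(delta, q, x):
    """Extended transition function: fold delta over the symbols of x."""
    for a in x:
        q = delta[(q, a)]
    return q

# The transition table of M4 (Example 9.4.3): states a, b, c over {'0', '1'}.
delta = {('a', '0'): 'a', ('a', '1'): 'b',
         ('b', '0'): 'a', ('b', '1'): 'c',
         ('c', '0'): 'a', ('c', '1'): 'c'}

assert delta_hat(delta, 'a', '11') == 'c'
assert delta_hat(delta, 'a', '110') == 'a'
assert delta_hat(delta, 'a', '1') == delta[('a', '1')]   # agreement on length one
```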
Definition 9.4.4:
A language A is said to be regular if A = L(M ) for some DFA M .
M1 = (Q1 , Σ, δ1 , s1 , F1 )
M2 = (Q2 , Σ, δ2 , s2 , F2 )
To show that A∩B is regular, we have to show that there exists an automaton
M3 such that L(M3 ) = A ∩ B. We claim that M3 can be constructed as given
below:
Q3 = Q1 × Q2 = {(p, q) | p ∈ Q1 and q ∈ Q2}
F3 = F1 × F2 = {(p, q) | p ∈ F1 and q ∈ F2}
s3 = (s1, s2)
δ3((p, q), a) = (δ1(p, a), δ2(q, a))
Lemma 9.4.5:
L(M3) = A ∩ B.
Proof. We first show that for all x ∈ Σ⋆,
δ̂3((p, q), x) = (δ̂1(p, x), δ̂2(q, x)).
Now assume that this holds for x ∈ Σ⋆. We will show that it holds for xa also, where a ∈ Σ:
δ̂3((p, q), xa) = δ3(δ̂3((p, q), x), a)                definition of δ̂3
              = δ3((δ̂1(p, x), δ̂2(q, x)), a)           induction hypothesis
              = (δ1(δ̂1(p, x), a), δ2(δ̂2(q, x), a))    definition of δ3
              = (δ̂1(p, xa), δ̂2(q, xa))                definition of δ̂1 and δ̂2
For all x ∈ Σ⋆ ,
x ∈ L(M3 )
A ∪ B = ∼(∼A ∩ ∼B)
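The product construction is short to sketch in Python; the two component DFAs below are toy examples of our own (even number of a's; ends in b), not automata from the text:

```python
def product_dfa(Q1, Q2, Sigma, d1, d2, s1, s2, F1, F2):
    """Build M3 with Q3 = Q1 x Q2 recognizing L(M1) ∩ L(M2).

    d1, d2 are dicts (state, symbol) -> state; F1, F2 are accept-state sets."""
    Q3 = {(p, q) for p in Q1 for q in Q2}
    d3 = {((p, q), a): (d1[(p, a)], d2[(q, a)]) for (p, q) in Q3 for a in Sigma}
    F3 = {(p, q) for p in F1 for q in F2}
    return Q3, d3, (s1, s2), F3

def accepts(delta, s, F, x):
    q = s
    for a in x:
        q = delta[(q, a)]
    return q in F

# M1: even number of a's ('e'/'o');  M2: ends in b ('y'/'n').
d1 = {('e', 'a'): 'o', ('e', 'b'): 'e', ('o', 'a'): 'e', ('o', 'b'): 'o'}
d2 = {('n', 'a'): 'n', ('n', 'b'): 'y', ('y', 'a'): 'n', ('y', 'b'): 'y'}
Q3, d3, s3, F3 = product_dfa({'e', 'o'}, {'n', 'y'}, {'a', 'b'},
                             d1, d2, 'e', 'n', {'e'}, {'y'})

assert accepts(d3, s3, F3, 'aab')       # even number of a's and ends in b
assert not accepts(d3, s3, F3, 'ab')    # odd number of a's
```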
9.5.1 Nondeterminism
accepting states when the end of x is reached. Since there are many choices
for going to the next state, there may be many paths through the NFA in
response to the input x—some may lead to accept states and some may lead
to reject states. The NFA is said to accept x if at least one computation
path on x starting from the start state leads to an accept state. It should be
noted that in the NFA model itself there is no mechanism to determine as to
which transition to make in response to a next symbol from the input. We
illustrate these ideas with an example. Consider the following automaton
N1 :
Figure 9.4: The automaton N1 (q0 loops on 0, 1; q0 → q1 on 1; q1 → q2 and q2 → q3 on either 0 or 1; q3 accepting)
In the automaton N1 , from the state q0 there are two transitions on the
symbol 1. So N1 is indeed nondeterministic. It is easy to reason that N1 accepts all strings in {0, 1}⋆ that contain a 1 in the third position from the end.
On an input, say, 01110100, a computation can be such that we always
stay in state q0 . But the computation where we stay in state q0 till we read
the first five symbols and then move to states q1 , q2 and q3 on reading the
next three symbols accepts the string 01110100. Therefore N1 accepts the
string 01110100.
The definition is very similar to that of a DFA except that we need to describe
the new way of making transitions. In an NFA the input to the transition
function is a state plus an input symbol or the empty string; the transition
is to a set of possible next (legal) states.
Let P(Q) be the power set of Q. Let Σǫ denote Σ ∪ {ǫ}, for any alphabet
Σ. The formal definition is as follows:
Definition 9.5.1:
A nondeterministic finite automaton is a 5-tuple (Q, Σ, δ, s, F ), where
Q is a finite set of states
Σ is a finite alphabet
δ: Q × Σǫ → P(Q) is the transition function
s ∈ Q is the start state and
F ⊆ Q is the set of accept states.
We can now formally state the notion of computation for an NFA. Let
N = (Q, Σ, δ, s, F ) be an NFA and let w be a string over the alphabet Σ.
Then we say that N accepts w if we can write w as w = w1 w2 · · · wk, where each wi, 1 ≤ i ≤ k, is in Σǫ, and there exists a sequence of states q0, q1, . . . , qk such that
(i) q0 = s
(ii) qi+1 ∈ δ(qi, wi+1) for 0 ≤ i ≤ k − 1
(iii) qk ∈ F
Condition (i) states that the machine starts in its start state. Condition (ii)
states that on reading the next symbol in any current state the next state is
any one of the allowed legal states. Condition (iii) states that the machine
exhausts its input to end up in any one of the final states.
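This acceptance condition can be checked mechanically by tracking the set of all states reachable on the input read so far; a sketch (without ǫ-transitions, for brevity), exercised on the automaton N1:

```python
def nfa_accepts(delta, s, F, w):
    """Simulate an NFA by tracking the set of currently possible states.

    delta maps (state, symbol) -> set of states; missing entries mean no move."""
    current = {s}
    for a in w:
        current = {q for p in current for q in delta.get((p, a), set())}
    return bool(current & F)

# N1: accepts strings with a 1 in the third position from the end.
delta = {('q0', '0'): {'q0'}, ('q0', '1'): {'q0', 'q1'},
         ('q1', '0'): {'q2'}, ('q1', '1'): {'q2'},
         ('q2', '0'): {'q3'}, ('q2', '1'): {'q3'}}

assert nfa_accepts(delta, 'q0', {'q3'}, '01110100')
assert not nfa_accepts(delta, 'q0', {'q3'}, '0010')
```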
9.6 Equivalence of DFA and NFA
Definition 9.6.1:
Two automata M and N are said to be equivalent if L(M ) = L(N ).
Figure 9.1: Computing ǫ-CLOSURE
We now give the subset construction procedure. That is, we are given an
NFA N ; we are required to construct a DFA D equivalent to N .
Initially let ǫ-CLOSURE (s) be a state (the start state) of D, where s is
the start state of N . We assume that initially each state of D is “unmarked”.
We now execute the following procedure.
begin
  while (there is an unmarked state q = (r1, r2, . . . , rn) of D) do
  begin
    mark q;
    for (each input symbol a ∈ Σ) do
    begin
      let T be the set of states to which there is
        a transition on a from some state ri in q;
      x = ǫ-CLOSURE (T);
      if (x has not yet been added to the set of states of D)
        then make x an unmarked state of D;
      add a transition from q to x labelled a
        if not already present;
    end
  end
end
Figure 9.2: The subset construction procedure
Theorem 9.6.2:
Every NFA has an equivalent DFA.
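The subset construction sketched in Python, for NFAs without ǫ-transitions (so that ǫ-CLOSURE(T) = T); DFA states are frozensets of NFA states:

```python
def subset_construction(delta, s, F, Sigma):
    """Determinize an NFA without ǫ-transitions.

    delta maps (state, symbol) -> set of NFA states; returns the DFA transition
    table (keyed by frozensets), the start state and the accept states."""
    start = frozenset([s])
    dfa_delta, accept = {}, set()
    seen, unmarked = {start}, [start]
    while unmarked:
        q = unmarked.pop()              # mark q
        if q & F:
            accept.add(q)
        for a in Sigma:
            t = frozenset(r for p in q for r in delta.get((p, a), set()))
            dfa_delta[(q, a)] = t
            if t not in seen:
                seen.add(t)
                unmarked.append(t)
    return dfa_delta, start, accept

# Demo: the NFA of Exercise 9.6.5 below, accepting strings ending in 01.
nfa = {('q0', '0'): {'q0', 'q1'}, ('q0', '1'): {'q0'}, ('q1', '1'): {'q2'}}
dd, start, acc = subset_construction(nfa, 'q0', {'q2'}, {'0', '1'})
```

On this NFA, only three subset-states are reachable, so the resulting DFA has three states.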
Exercise 9.6.3:
The following example illustrates the fact that given an NFA with ǫ-transitions,
we can get another NFA with no ǫ-transitions.
Figure 9.5: An NFA with ǫ-transitions (left) and an equivalent NFA without ǫ-transitions (right), on states q0, q1, q2, q3
Exercise 9.6.4:
Argue that both the following NFAs accept the language (01⋆ + 0⋆ 1).
Figure 9.6: Two NFAs, on states q0, q1 and p0, p1 respectively, each accepting (01⋆ + 0⋆ 1)
Exercise 9.6.5:
Consider the following NFA which accepts all strings ending in 01 (interpreted
as a binary integer, the NFA accepts all integers of the form 4x + 1, x = any
integer).
Figure 9.7: An NFA on states q0, q1, q2 accepting all strings ending in 01 (q0 loops on 0, 1; q0 → q1 on 0; q1 → q2 on 1)
Show that by applying the subset construction procedure we get the following
equivalent DFA.
Figure 9.8: The equivalent DFA obtained by the subset construction
Exercise 9.6.6:
Consider the following NFA which accepts all strings of 0s and 1s such that
the nth symbol from the end is 1.
Figure 9.9: An NFA on states q0, q1, . . . , qn accepting all strings whose nth symbol from the end is 1 (q0 loops on 0, 1 and moves to q1 on 1; thereafter qi → qi+1 on either 0 or 1)
Argue that the above NFA is a bad case for subset construction.
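The badness can be made concrete by counting the subset-states that determinization reaches: all 2^n of them. A self-contained sketch (function name ours):

```python
def reachable_dfa_states(n):
    """Count DFA states produced by subset construction on the NFA whose
    accepted strings have a 1 in the nth position from the end."""
    # NFA states 0..n: state 0 loops on 0/1 and moves to 1 on input 1;
    # state i (0 < i < n) moves to i+1 on either input; state n accepts.
    def step(S, a):
        T = {0}                                # 0 loops on both symbols
        if a == '1':
            T.add(1)
        T |= {i + 1 for i in S if 0 < i < n}
        return frozenset(T)
    start = frozenset([0])
    seen, stack = {start}, [start]
    while stack:                               # breadth-free reachability search
        S = stack.pop()
        for a in '01':
            T = step(S, a)
            if T not in seen:
                seen.add(T)
                stack.append(T)
    return len(seen)

assert reachable_dfa_states(3) == 8            # 2^n reachable subset-states
```

Each reachable subset records exactly which of the last n symbols were 1s, so no smaller DFA is possible either.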
We first show that the class of regular languages is closed under the con-
catenation operation. Given two regular languages A and B, let N1 and N2
be two NFAs such that L(N1 ) = A and L(N2 ) = B. From N1 and N2 we
construct a new NFA N , to recognize AB, as suggested by the following
figure.
Figure 9.10: The NFA N for AB: N1 (start state s1 = s) joined to N2 (start state s2) by ǫ-transitions from the accept states of N1
The key idea is that we make the start state s1 of N1 the start state s of N. From each accept state of N1 we make an ǫ-transition to the start state s2 of N2. The accept states of N are the accept states of N2 only.
Let N1 = (Q1 , Σ, δ1 , s1 , F1 )
N2 = (Q2 , Σ, δ2 , s2 , F2 )
N = (Q, Σ, δ, s1 , F2 )
where
(i) Q = Q1 ∪ Q2
(ii) We define δ so that for any q ∈ Q and a ∈ Σǫ
δ(q, a) =
  δ1(q, a)             if q ∈ Q1 and q ∉ F1
  δ1(q, a)             if q ∈ F1 and a ≠ ǫ
  δ1(q, a) ∪ {s2}      if q ∈ F1 and a = ǫ
  δ2(q, a)             if q ∈ Q2
Finally we show that the class of regular languages is closed under the
star operation.
Given a regular language A, we wish to prove that the language A⋆ is also
regular. Let N be an NFA such that L(N ) = A. We modify N , as suggested
by the following figure, to build an NFA N ′ :
Figure 9.11: The NFA N′ for A⋆: a new start state s′ with an ǫ-transition to s, and ǫ-transitions from the accept states of N back to s
The NFA N ′ accepts any input that can be broken into several pieces
and each piece is accepted by N . In addition, N ′ also accepts ǫ which is a
member of A⋆ .
Let N = (Q, Σ, δ, s, F ) be an NFA that recognizes A. We construct N ′
to recognize A⋆ as follows:
N ′ = (Q′ , Σ, δ ′ , s′ , F ′ )
Q′ = Q ∪ {s′ }
F ′ = F ∪ {s′ }
Theorem 9.7.1:
The class of languages accepted by finite automata is closed under intersection, complementation, union, concatenation and Kleene star.
Lemma 9.8.1:
If a language A is described by a regular expression R, then it is regular;
that is, there is an NFA to recognize A.
Figure 9.12:
Figure 9.13:
Figure 9.14:
i) R1 + R2
ii) R1 · R2
iii) R1⋆
For this case we simply construct the equivalent NFA following the techniques used in proving the closure properties (union, concatenation and Kleene star) of regular languages.
Example 9.8.2:
Consider the regular expression (ab+aab)⋆ . We proceed to construct an NFA
to recognize the corresponding language:
Figure 9.15: NFAs for the single symbols a and b
Step 2 We use the above NFAs and construct NFAs for ab and aab.
Figure 9.16: NFAs for ab and for aab
Step 3 Using the above NFAs for ab and aab, we construct an NFA for
ab + aab.
Figure 9.17: An NFA for ab + aab
Step 4 From the above NFA for ab + aab, we build the required NFA for
(ab + aab)⋆ .
Figure 9.18: An NFA for (ab + aab)⋆
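The constructed machine can be exercised with an ǫ-closure-based simulator; the NFA below is hand-coded in the spirit of Figure 9.18 (the state numbering is ours and the machine is simplified, not a literal transcription of the figure):

```python
def eps_closure(delta, states):
    """All states reachable from `states` via ǫ-moves (transitions on '')."""
    stack, closure = list(states), set(states)
    while stack:
        p = stack.pop()
        for q in delta.get((p, ''), set()):
            if q not in closure:
                closure.add(q)
                stack.append(q)
    return closure

def enfa_accepts(delta, s, F, w):
    current = eps_closure(delta, {s})
    for a in w:
        moved = {q for p in current for q in delta.get((p, a), set())}
        current = eps_closure(delta, moved)
    return bool(current & F)

# A hand-built ǫ-NFA for (ab + aab)*: state 0 is both start and accept;
# the upper branch reads ab, the lower branch reads aab, then loops back.
delta = {
    (0, ''): {1, 4},
    (1, 'a'): {2}, (2, 'b'): {3},                  # branch for ab
    (4, 'a'): {5}, (5, 'a'): {6}, (6, 'b'): {3},   # branch for aab
    (3, ''): {0},                                  # loop back for the star
}

assert enfa_accepts(delta, 0, {0}, '')
assert enfa_accepts(delta, 0, {0}, 'abaab')
assert not enfa_accepts(delta, 0, {0}, 'aba')
```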
Lemma 9.8.3:
If a language is regular (that is, it is recognized by a finite automaton), then
it is described by a regular expression.
More explicitly,
L(p, q, 0) =
  {a ∈ Σ | δ(p, a) = q}            if p ≠ q
  {a ∈ Σ | δ(p, a) = p} ∪ {ǫ}      if p = q.
case, in general, it goes from state p to the state k + 1 (for the first time),
then possibly loops from state k + 1 back to itself (zero or more times), and
then from the state k + 1 to the state q. This means, we can write x as yzw,
where y corresponds to the path from state p to the first visit of state k + 1,
z corresponds to the looping in state k + 1, and w corresponds to the path
from state k + 1 to state q. We note that in each of the two parts y and w, and in each of the loops making up z, the path does not go through any state higher than k. Therefore,

L(p, q, k + 1) = L(p, q, k) + L(p, k + 1, k) · (L(k + 1, k + 1, k))⋆ · L(k + 1, q, k).
The expression on the right hand side above can be described by a regular
expression if the individual languages in it can be described by regular ex-
pressions (this is so, by induction hypothesis). Therefore L(p, q, k + 1) can
be described by a regular expression.
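The inductive definition of L(p, q, k) translates into a short dynamic program over regular expressions represented as strings; in this sketch (ours), '@' stands for ǫ, '+' for union, and states are numbered 1, . . . , m:

```python
import re

def dfa_to_regex(states, delta, s, F, Sigma):
    """Kleene's construction: build L(p, q, k) bottom-up as regex strings.

    states is a list of state names; delta maps (state, symbol) -> state."""
    def union(parts):
        parts = [p for p in parts if p is not None]
        return '+'.join(parts) if parts else None

    # Base case k = 0: single symbols (plus '@' = epsilon when p == q).
    L = {}
    for p in states:
        for q in states:
            syms = [a for a in Sigma if delta[(p, a)] == q]
            if p == q:
                syms.append('@')
            L[(p, q)] = union(syms)

    # Inductive step: paths through intermediate state k split into
    # y (p -> k), loops z (k -> k) and w (k -> q), as in the text.
    for k in states:
        newL = {}
        for p in states:
            for q in states:
                through = None
                if L[(p, k)] is not None and L[(k, q)] is not None:
                    loop = '(' + L[(k, k)] + ')*' if L[(k, k)] else ''
                    through = '(' + L[(p, k)] + ')' + loop + '(' + L[(k, q)] + ')'
                newL[(p, q)] = union([L[(p, q)], through])
        L = newL
    return union([L[(s, q)] for q in F])

# Demo: the two-state DFA over {a} accepting strings with an even number of a's.
delta = {(1, 'a'): 2, (2, 'a'): 1}
regex = dfa_to_regex([1, 2], delta, 1, [1], ['a'])
# Convert to Python re syntax for testing: '+' -> '|', '@' -> empty string.
pyre = regex.replace('+', '|').replace('@', '')
```

The resulting expression is far from minimal, which is typical of this construction; it is correct, not short.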
Theorem 9.8.4:
A language is regular if and only if some regular expression describes it.
Consider the regular expression (a + b)⋆ abb. If we construct the NFA for this and apply the subset construction algorithm, we will get the following DFA:
Figure 9.19: A five-state DFA (states A, B, C, D, E) for (a + b)⋆ abb, obtained by subset construction
The above DFA has five states. The following DFA with only four states
also accepts the language described by (a + b)⋆ abb:
Figure 9.20: A four-state DFA (states A, B, C, D) accepting (a + b)⋆ abb
The above example suggests that a given DFA can possibly be simplified to give an equivalent DFA with fewer states.
We now give an algorithm that provides a general method of reducing the number of states of a given DFA
M = (Q, Σ, δ, s, F).
We assume that from every state there is a transition on every input (if q is a state not conforming to this, then introduce a new “dead state” d, add transitions from q to d on the inputs that are not already present, and also add transitions from d to d on all inputs).
We say that a string w distinguishes a state q1 from a state q2 if δ̂(q1, w) ∈ F and δ̂(q2, w) ∉ F, or vice versa.
The minimization procedure works on M by finding all groups of states
that can be distinguished by some input string. Those groups of states that
cannot be distinguished are then merged to form a single state for the entire
group. The algorithm works by keeping a partition of Q such that each group
of states consists of states which have not yet been distinguished from one
another and such that any pair of states chosen from different groups have
been found distinguishable by some input.
Initially the two groups are F and Q \ F . The fundamental step is to take
a group of states, say A = {q1 , q2 , . . . , qk } and some input symbol a ∈ Σ and
consider δ(qi , a) for every qi ∈ A. If these transitions are to states that fall
into two or more different groups of the current partition, then we must split
A into subsets so that the transitions from the subsets of A are all confined
to a single group of the current partition. For example, let δ(q1 , a) = t1 and
δ(q2 , a) = t2 and let t1 and t2 be in different groups. Then we must split A
into at least two subsets so that one subset contains q1 and another contains
q2 . Note that t1 and t2 are distinguished by some string w and so q1 and q2
are distinguished by the string aw.
The algorithm repeats the process of splitting groups until no more groups
need to be split. The following fact can be formally proved: if there exists a string w such that δ̂(q1, w) ∈ F and δ̂(q2, w) ∉ F, or vice versa, then q1 and q2 cannot be in the same group; if no such w exists, then q1 and q2 can be in the same group.
1. We construct a partition Π of Q. Initially Π consists of F and Q \ F only. We next refine Π to Π_new, a new partition, using the procedure Refine-Π given in Fig. 9.3. That is, Π_new consists of the groups of Π, each split into one or more subgroups. If Π_new ≠ Π, we replace Π by Π_new and repeat the procedure Refine-Π. If Π_new = Π, then we terminate the process of refining Π.
Let G1, G2, . . . , Gk be the final groups of Π.

procedure Refine-Π
begin
  for (each group G of Π) do
  begin
    partition G into subgroups such that two states
      q1 and q2 of G are in the same subgroup iff for
      all a ∈ Σ, δ(q1, a) and δ(q2, a) are in the same group of Π;
    /* in the worst case, a state will be in a
       subgroup by itself */
    place all subgroups so formed in Π_new
  end
end

Figure 9.3: The procedure Refine-Π
2. For each Gi in Π we pick a representative, an arbitrary state in Gi. The representatives will be the states of the DFA M′.
Let qi be the representative of Gi, and for a ∈ Σ let δ(qi, a) ∈ Gj. Note that Gj can be the same as Gi. Let qj be the representative of Gj. Then in M′ we add the transition from qi to qj on a. Let the initial state of M′ be the representative of the group containing the initial state s of M. Also let the final states of M′ be the representatives which are in F.
3. If M′ has a dead state d, then remove d from M′. Also remove any state not reachable from the initial state. Any transitions from other states to d become undefined.
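The refinement loop admits a compact Python sketch; the demo DFA is the classical five-state automaton for (a + b)⋆abb (our assumption about the layout of Figure 9.19), where minimization merges A and C:

```python
def minimize(Q, Sigma, delta, F):
    """Partition refinement: return the final groups G1, ..., Gk as frozensets.

    delta maps (state, symbol) -> state; every state must have a move on
    every symbol (dead states added beforehand, as in the text)."""
    partition = {frozenset(F), frozenset(Q - F)} - {frozenset()}
    while True:
        def group_of(q):
            return next(g for g in partition if q in g)
        new_partition = set()
        for G in partition:
            # States stay together iff, for every symbol, their successors
            # lie in the same group of the current partition.
            subgroups = {}
            for q in G:
                key = tuple(group_of(delta[(q, a)]) for a in sorted(Sigma))
                subgroups.setdefault(key, set()).add(q)
            new_partition |= {frozenset(s) for s in subgroups.values()}
        if new_partition == partition:
            return partition
        partition = new_partition

# Demo: five-state DFA for (a + b)*abb; A and C are indistinguishable.
Q = {'A', 'B', 'C', 'D', 'E'}
delta = {('A', 'a'): 'B', ('A', 'b'): 'C', ('B', 'a'): 'B', ('B', 'b'): 'D',
         ('C', 'a'): 'B', ('C', 'b'): 'C', ('D', 'a'): 'B', ('D', 'b'): 'E',
         ('E', 'a'): 'B', ('E', 'b'): 'C'}
groups = minimize(Q, {'a', 'b'}, delta, {'E'})
```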
Example 9.9.1:
The following DFA recognizes the language described by the regular expres-
sion (0 + 1)⋆ 10.
Figure 9.21: A DFA with states 1, 2, . . . , 7 recognizing the language of (0 + 1)⋆ 10
Applying the above algorithm gives the final partition of states as {1, 2, 4},
{5, 3, 7} and {6}. The minimized DFA is given below:
Figure 9.22: The minimized DFA, with states {1, 2, 4}, {5, 3, 7} and {6}
Section 9.11 will answer the question of the uniqueness of the minimal
DFA.
D1 = (Q1 , Σ, δ1 , s1 , F1 ) and
D2 = (Q2 , Σ, δ2 , s2 , F2 )
The DFAs D1 and D2 are said to be isomorphic if there is a 1–1 onto mapping
f : Q1 → Q2 such that,
(i) f (s1 ) = s2
Conditions (i), (ii) and (iii) imply that D1 and D2 are essentially the same
automaton up to renaming of states. Therefore they accept the same input
set. We can show that the minimal state DFA corresponding to the set it accepts is unique up to isomorphism. This can be done via a beautiful correspondence between DFAs with input alphabet Σ and certain equivalence relations on Σ⋆.
M = (Q, Σ, δ, s, F )
Hence xa ≡M ya.
(b) ≡M refines R:
For any x, y ∈ Σ⋆ , if x ≡M y then (x ∈ R ⇔ y ∈ R). By definition
x ≡M y means δ̂(s, x) = δ̂(s, y), which is either an accept state or a
reject state. Therefore either both x and y are accepted or both x and
y are rejected. Stated another way, every equivalence class induced by
≡M has all its elements in R or none of its elements in R.
Definition 9.10.1:
An equivalence relation ≡M on Σ⋆ is a Myhill-Nerode relation for R, a regular
set, if it satisfies the properties (a), (b) and (c) above; that is, ≡M is a right
congruence of finite index, refining R.
The interesting fact about the definition above is that it characterises ex-
actly the relations on Σ⋆ that are ≡M for some automaton M . That is, we
can construct M from ≡M using only the fact that ≡M is a Myhill-Nerode
relation.
[x] = {y | y ≡ x}
Note that there are infinitely many strings, but there are only finitely many equivalence classes, by property (c) above.
We now define the DFA M≡ = (Q, Σ, δ, s, F) as follows:
Q = {[x] | x ∈ Σ⋆}
s = [ǫ]
F = {[x] | x ∈ R}
δ([x], a) = [xa]
Lemma 9.10.2:
δ̂ ([x], y) = [xy]
Proof. The proof is by induction on |y|. For the basis, we note that,
= [xyz]
Theorem 9.10.3:
L(M≡ ) = R
Now, x ∈ L(M≡)
⇔ [x] ∈ F, by Lemma 9.10.2
⇔ x ∈ R, by definition of F
We will now show that the constructions (i) and (ii) are inverses up to iso-
morphism of automata.
Lemma 9.10.4:
Let ≡ be a Myhill-Nerode relation for R ⊆ Σ⋆ . Let M≡ be the corresponding
DFA. From M≡ if we now define the corresponding Myhill-Nerode relation,
say ≡M≡ , it is identical to ≡.
⇔ δ̂ ([ǫ], x) = δ̂ ([ǫ], y)
⇔ x ≡ y.
Lemma 9.10.5:
Let the DFA for R be M , with no inaccessible states. Let the corresponding
Myhill-Nerode relation be ≡M . From ≡M if we construct the corresponding
DFA say M≡M , it is isomorphic to M .
Proof.
[x] = {y | y ≡M x}
    = {y | δ̂(s, y) = δ̂(s, x)}
Q′ = {[x] | x ∈ Σ⋆}
s′ = [ǫ]
F′ = {[x] | x ∈ R}
δ′([x], a) = [xa]
We now have to show that M≡M and M are isomorphic under the map
f([x]) = δ̂(s, x),
which is well defined by the definition of ≡M. We must verify that
(i) f(s′) = s,
(ii) f(δ′([x], a)) = δ(f([x]), a), and
(iii) [x] ∈ F′ iff f([x]) ∈ F.
For (i),
f(s′) = f([ǫ])
      = δ̂(s, ǫ)         definition of f
      = s.               definition of δ̂
For (ii),
f(δ′([x], a)) = f([xa])          definition of δ′
             = δ̂(s, xa)          definition of f
             = δ(δ̂(s, x), a)     definition of δ̂
             = δ(f([x]), a).     definition of f
For (iii), [x] ∈ F′ iff x ∈ R iff δ̂(s, x) ∈ F iff f([x]) ∈ F, by definition of f.
Theorem 9.10.6:
Let Σ be a finite alphabet. Up to isomorphism of automata, there is a 1–1
correspondence between DFA (with no inaccessible states) over Σ accepting
R ⊆ Σ⋆ and Myhill-Nerode relations for R on Σ⋆ .
Theorem 9.10.6 implies that we can deal with regular sets and finite
automata in terms of a few simple algebraic properties.
Definition 9.11.1:
A relation r1 is said to refine another relation r2 if r1 ⊆ r2, considered as sets of ordered pairs. That is, r1 refines r2 if, for all x, y, whenever x r1 y holds then x r2 y also holds.
For equivalence relations ≡1 and ≡2 , this means that for every x, the
≡1 -class of x is included in the ≡2 -class of x.
For example, the equivalence relation i ≡ j mod 6 on the integers refines
the equivalence relation i ≡ j mod 3.
We will now show that there exists a coarsest Myhill-Nerode relation ≡R
for any given regular set R; that is, any other Myhill-Nerode relation for R
refines ≡R . The relation ≡R corresponds to the unique minimal DFA for R.
Property (b) of the definition of Myhill-Nerode relations says that a Myhill-Nerode relation ≡ for R refines the equivalence relation with equivalence classes R and Σ⋆ \ R. The relation of refinement between equivalence relations is a partial order:
Lemma 9.11.2:
Let R ⊆ Σ⋆ be any set, regular or not. Let the relation ≡R be defined as follows:
For any x, y ∈ Σ⋆, x ≡R y if and only if for all z ∈ Σ⋆ (xz ∈ R ⇔ yz ∈ R).
Then ≡R satisfies properties (a) and (b):
x ≡R y ⇒ ∀a ∈ Σ, ∀w ∈ Σ⋆ (xaw ∈ R ⇔ yaw ∈ R)
       ⇒ ∀a ∈ Σ (xa ≡R ya)
x ≡R y ⇒ (x ∈ R ⇔ y ∈ R).
We now show that ≡R is the coarsest such relation; that is, any other equivalence relation ≡ satisfying properties (a) and (b) refines ≡R:
⇒ x ≡R y, by definition of ≡R .
(i) R is regular.
Proof. We first show (i) ⇒ (ii): Given a DFA M for R (because R is reg-
ular) we can construct ≡R , a Myhill-Nerode relation for R (as shown in
Section 9.11.2).
9.12.1 An Example
The intuitive argument that there exists no DFA M with L(M ) = A, where
A = {aⁿbⁿ | n ≥ 0}, goes thus: if such an M exists, then on crossing the
centre point between the a's and the b's it has to remember how many a's it
has seen. It must do this for arbitrarily long strings aⁿbⁿ (n may be arbi-
trarily large, in particular much larger than the number of states). This is
an unbounded amount of information, and it cannot be remembered with
only a finite memory.
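The finite-memory limitation can be demonstrated concretely: any k-state DFA sends two of the k + 1 prefixes a⁰, a¹, …, aᵏ to the same state, after which it can no longer tell them apart. The sketch below uses an assumed three-state toy DFA to find such a collision:

```python
# Demonstration (with an assumed 3-state toy DFA over {a, b}) that a
# k-state machine must confuse two of the prefixes a^0, ..., a^k: by the
# pigeonhole principle two of them reach the same state, so the machine's
# verdict on a^i b^i and a^j b^i must then agree -- it cannot count the a's.

delta = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 0,
         (0, "b"): 0, (1, "b"): 1, (2, "b"): 2}
start = 0
k = 3  # number of states

def run(w):
    q = start
    for c in w:
        q = delta[(q, c)]
    return q

seen = {}
for i in range(k + 1):          # k + 1 prefixes, only k states available
    q = run("a" * i)
    if q in seen:
        print(f"a^{seen[q]} and a^{i} both reach state {q}")
        break
    seen[q] = i
```

Here a⁰ and a³ collide, so this machine treats b and aaab identically from that point on.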
The formal argument is given below:
Assume that A is regular, so that a DFA M exists with L(M ) = A. Let k be
the number of states of M . Consider the action of M on the input aⁿbⁿ,
where n ≫ k. Let the start state be s and let the machine reach a final
state r after scanning the input string aⁿbⁿ:
Figure 9.23: M starts in state s, scans the block of n a's followed by the
block of n b's, and reaches the final state r.
Since n ≥ k, by the pigeonhole principle there must exist some state, say
p, that M enters more than once while scanning the block of a's. We break
the string aⁿbⁿ into three pieces u, v, w, where v is the string of a's scanned
between the two entries into state p. This is depicted below:
Figure 9.24: the input aⁿbⁿ split as u v w; M is in state s before u, is in
state p both before and after v (a nonempty block of a's), and reaches r
after w.
We now show that the substring v can be deleted and the resulting string
will still be erroneously accepted:
δ̂(s, uw) = δ̂( δ̂(s, u), w ) = δ̂(p, w) = r ∈ F
The acceptance is erroneous because, after deleting v, the number of a's in
the resulting string is strictly less than the number of b's, so uw ∉ A. This
contradicts the assumption that L(M ) = A; hence A is not regular.
(i) xyⁱz ∈ A for all i ≥ 0
(ii) |y| > 0
(iii) |xy| ≤ p
Figure 9.25: M processing w = w1 w2 w3 w4 w5 · · · wn ; the state q10 repeats
along the run and q14 is the accept state.
Let us “pump in” a copy of y into w to get the string xyyz. The extra copy
of y starts in state q10 and ends in q10 , so z still starts in state q10 and leads
to the accept state q14 . Thus xyyz is accepted. Similar reasoning shows
that xyⁱz is accepted for every i > 0, and it is easy to see that xz is also
accepted. Thus condition (i) is satisfied.
Since y is the part of w between the two occurrences of state q10 , we have
|y| > 0, and so condition (ii) is satisfied.
To get condition (iii), we make sure that q10 is the first repetition in the
sequence. By the pigeonhole principle, the first p + 1 states in the sequence
must contain a repetition. Therefore |xy| ≤ p.
We now formalize the above ideas.
As before, let w = w1 w2 w3 · · · wn , n ≥ p. Let r1 (= s), r2 , r3 , . . . , rn+1
be the sequence of states that M enters while processing w, so that ri+1 =
δ(ri , wi ), 1 ≤ i ≤ n. This sequence of states has length n + 1, which is at least
p + 1. By the pigeonhole principle there must be two identical states among
the first p + 1 states in the sequence. Let the first occurrence of the repeated
state be ra and the second occurrence of the same state be rb . Because rb
occurs among the first p + 1 states starting at r1 , we have b ≤ p + 1. Now let,
x = w1 w2 · · · wa−1
y = wa · · · wb−1
z = wb · · · wn
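This construction can be carried out mechanically: record the state sequence r1, …, rn+1, find the first repeated state among the first p + 1 entries, and split w at the two occurrences. The sketch below uses an assumed two-state toy DFA (accepting strings over {a, b} with an even number of a's, so p = 2) and checks conditions (i)–(iii):

```python
# Mechanical version of the x, y, z construction, on an assumed toy DFA
# accepting strings with an even number of a's (p = 2 states).
# Indices are 0-based here: position idx in `states` corresponds to r_{idx+1}.

delta = {(0, "a"): 1, (0, "b"): 0,
         (1, "a"): 0, (1, "b"): 1}
start, accepting, p = 0, {0}, 2

def run(w):
    q = start
    for c in w:
        q = delta[(q, c)]
    return q

def pump_split(w):
    """Return (x, y, z) from the first repeated state among r_1 .. r_{p+1}."""
    states = [start]                      # r_1, r_2, ..., r_{n+1}
    for c in w:
        states.append(delta[(states[-1], c)])
    first = {}
    for idx, q in enumerate(states[:p + 1]):
        if q in first:
            a, b = first[q], idx          # positions of r_a and r_b
            return w[:a], w[a:b], w[b:]
        first[q] = idx
    raise AssertionError("no repetition among the first p + 1 states")

w = "aab"                                 # accepted, and |w| >= p
x, y, z = pump_split(w)
print(repr((x, y, z)))                    # ('', 'aa', 'b')
assert len(y) > 0 and len(x + y) <= p     # conditions (ii) and (iii)
assert all(run(x + y * i + z) in accepting for i in range(4))  # condition (i)
```

Pumping y zero or more times never leaves the language, exactly as the lemma asserts for a string this DFA accepts.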