Discrete Mathematics: Graph Algorithms, Algebraic Structures, Coding Theory, and Cryptography
R. Balakrishnan
Bharathidasan University, Tiruchirappalli, Tamil Nadu, INDIA
Sriraman Sridharan
Laboratoire LAMPS, Département de Mathématiques et
d’Informatique, Université de Perpignan Via Domitia,
Perpignan, FRANCE
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2020 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information stor-
age or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood
Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization
that provides licenses and registration for a variety of users. For organizations that have been granted
a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and
are used only for identification and explanation without intent to infringe.
Preface xix
Acknowledgment xxiii
Authors xxv
1 Graph Algorithms I 1
1.1 Representation of Graphs . . . . . . . . . . . . . . . . . . . . 1
1.2 Minimum Spanning Tree Algorithms . . . . . . . . . . . . . . 9
1.2.1 Prim’s minimum spanning tree algorithm . . . . . . . 11
1.2.2 Kruskal’s minimum spanning tree algorithm . . . . . . 19
1.2.3 Rooted ordered trees and traversal of trees . . . . . . 22
1.3 Shortest Path Algorithms . . . . . . . . . . . . . . . . . . . . 23
1.3.1 Single-source shortest path algorithm . . . . . . . . . 24
1.4 Dijkstra’s Algorithm for Negative Weighted Arcs . . . . . . . 33
1.5 All-Pairs Shortest Path Algorithm . . . . . . . . . . . . . . . 35
1.5.1 An application of Floyd’s algorithm . . . . . . . . . . 43
1.6 Transitive Closure of a Directed Graph . . . . . . . . . . . . 45
1.7 An O(n³) Transitive Closure Algorithm Due to Warshall . . . 47
1.8 Navigation in Graphs . . . . . . . . . . . . . . . . . . . . . . 50
1.9 Applications of Depth-First Search . . . . . . . . . . . . . . . 55
1.9.1 Application 1: Finding connected components . . . . . 55
1.9.2 Application 2: Testing acyclic graph . . . . . . . . . . 56
1.9.3 Application 3: Finding biconnected components of a
connected multigraph . . . . . . . . . . . . . . . . . . 58
1.10 Depth-First Search for Directed Graphs . . . . . . . . . . . . 68
1.11 Applications of Depth-First Search for Directed Graphs . . . 70
1.11.1 Application 1: Finding the roots of a directed
graph . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
1.11.2 Application 2: Testing if a digraph is without
circuits . . . . . . . . . . . . . . . . . . . . . . . . . . 72
1.11.3 Application 3: Topological sort . . . . . . . . . . . . . 72
1.11.3.1 An application of topological sort: PERT . . 76
6 Cryptography 249
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
6.2 Some Classical Cryptosystems . . . . . . . . . . . . . . . . . 249
6.2.1 Caesar cryptosystem . . . . . . . . . . . . . . . . . . . 249
6.2.2 Affine cryptosystem . . . . . . . . . . . . . . . . . . . 250
6.2.3 Private key cryptosystems . . . . . . . . . . . . . . . . 251
6.2.4 Hacking an affine cryptosystem . . . . . . . . . . . . . 252
6.3 Encryption Using Matrices . . . . . . . . . . . . . . . . . . . 253
6.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
6.5 Other Private Key Cryptosystems . . . . . . . . . . . . . . . 255
6.5.1 Vigenere cipher . . . . . . . . . . . . . . . . . . . . . . 255
6.5.2 The one-time pad . . . . . . . . . . . . . . . . . . . . 256
6.6 Public Key Cryptography . . . . . . . . . . . . . . . . . . . . 256
6.6.1 Working of public key cryptosystems . . . . . . . . . . 257
6.6.1.1 Transmission of messages . . . . . . . . . . . 257
6.6.1.2 Digital signature . . . . . . . . . . . . . . . . 257
6.6.2 RSA public key cryptosystem . . . . . . . . . . . . . . 258
6.6.2.1 Description of RSA . . . . . . . . . . . . . . 258
6.6.3 The ElGamal public key cryptosystem . . . . . . . . . 260
6.6.4 Description of ElGamal system . . . . . . . . . . . . . 261
6.7 Primality Testing . . . . . . . . . . . . . . . . . . . . . . . . 261
6.7.1 Non-trivial square roots (mod n) . . . . . . . . . . . . 261
6.7.2 Prime Number Theorem . . . . . . . . . . . . . . . . . 262
6.7.3 Pseudo-primality testing . . . . . . . . . . . . . . . . . 262
6.7.3.1 Base-2 Pseudo-prime test . . . . . . . . . . . 263
6.7.4 Miller-Rabin Algorithm . . . . . . . . . . . . . . . . . 263
6.7.5 Horner’s method to evaluate a polynomial . . . . . . . 263
6.7.6 Modular exponentiation algorithm based on repeated
squaring . . . . . . . . . . . . . . . . . . . . . . . . . . 265
6.8 The Agrawal-Kayal-Saxena (AKS) Primality Testing
Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
6.8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 267
Bibliography 307
Index 311
this system as well as the ElGamal cryptosystem. RSA is built on very large
prime numbers. So, this gives rise to the following natural question: Given a
large positive integer, how do we test if the given number is prime or not?
We briefly discuss the Miller-Rabin primality testing algorithm. This is a ran-
domized probabilistic algorithm to test if a given number is prime or not.
However, the problem of finding a deterministic polynomial-time algorithm to
test if a given number is prime or not remained unsolved until the Agrawal-
Kayal-Saxena (AKS) primality testing algorithm was proposed in 2002. We
present this algorithm, its proof, and some illustrative examples.
Parts of this book have been taught in Indian and French universities. The
authors thank N. Sridharan for going through some of the chapters and offer-
ing constructive suggestions, and Jaikumar Radhakrishnan and Arti Pandey
for their inputs on Chapter 6 on cryptography. They also thank A. Anu-
radha, R. Dhanalakshmi, N. Geetha, and G. Janani Jayalakshmi for their
help in typesetting. We take this opportunity to thank our institutions,
Bharathidasan University, Tamil Nadu, India and Université de Perpignan
Via Domitia, France, for their academic support. Our thanks are also due
to the faculty members in our departments whose encouragement proved vital to
attain our goal. We also thank the four anonymous reviewers for suggest-
ing some changes on the initial version of the book. Last but not least, we
thank Aastha Sharma and Shikha Garg of CRC Press for their kind under-
standing of our problems and for their patience until the completion of our
manuscript.
The second author (S.S.) expresses his deep gratitude to (Late) Professor
K. R. Parthasarathy of the Indian Institute of Technology, Chennai, India,
for introducing him to Graph Theory and guiding his Ph.D. thesis. He is
also indebted to (Late) Professor Claude Berge, one of the greatest pio-
neers in Graph Theory and Combinatorics, who invited him to CAMS (Cen-
tre d’Analyse et de Mathématique Sociale) and guided his doctoral work in
Paris. Claude Berge had been a source of immense inspiration to him. Spe-
cial thanks are also due to Professors R. Balasubramanian (A. M. Jain Col-
lege, TN, India), Philippe Chrétienne (Université de Pierre et Marie Curie),
Robert Cori (Université de Bordeaux I), Alain Fougère (UPVD), Michel
Las Vergnas (CNRS), UFR secretaries Mme. Fabienne Pontramont (UPVD),
Mme. Dominique Bevilis (UPVD), Mircea Sofonea (Directeur de Laboratoire
LAMPS, UPVD), Michel Ventou (UPVD), Annick Truffert (Dean of the Fac-
ulty of Sciences, UPVD). Many thanks are also due to my students of UPVD
for their feedback. We are responsible for all the remaining errors, but still,
we feel that the initial readers of this book could smoke out some more bugs.
Though it is not in the Hindu custom to explicitly thank family members,
he would like to break this tradition and thank his wife, Dr. Usha Sridharan,
and his daughters, Ramapriya and Sripriya, for putting up with unusually
prolonged absence, as well as Dheeraj.
The entire book was composed using the TeX and LaTeX systems developed
by D. E. Knuth and L. Lamport.
R. Balakrishnan
Sriraman Sridharan
Tiruchirappalli, Tamil Nadu, India
Perpignan, France
August 2018
Chapter 1
Graph Algorithms I
The aim of physical sciences is not the provision of pictures, but the
discovery of laws governing the phenomena and the application of
these laws to discover new phenomena. If a picture exists, so much the
better. Whether a picture exists or not is only a matter of secondary
importance.
P. Dirac
$$M = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$
(rows and columns indexed by the vertices 1, 2, 3, 4). Here, 4 is the number
of vertices of the graph. The (i, j) entry of the above matrix M is simply the
number of arcs with its initial vertex at i and the terminal vertex at j. This
matrix is called the adjacency matrix of the graph of the figure.
More generally, for an n-vertex graph G with vertex set X = {1, 2, . . . , n},
the adjacency matrix of G is the n × n matrix $M = (m_{ij})$, where $m_{ij}$ is the
number of arcs with initial vertex i and terminal vertex j.
Memory space for the adjacency matrix: Since an n × n matrix has exactly n²
entries, the memory space necessary for the adjacency matrix representation
of a graph is of order O(n²). The time complexity of initializing a graph by
its adjacency matrix is O(n²). This may preclude algorithms on graphs whose
complexities are of order strictly less than n².
Properties of the adjacency matrix: Let M denote the adjacency matrix of
a graph with vertex set X = {1, 2, . . . , n}. Then, by the definition of the
adjacency matrix, we have the following properties:
1. The sum of the entries of the ith row of M is equal to the out-degree of
the vertex i.
2. The sum of the entries of the jth column of M is equal to the in-degree
of the vertex j.
3. The sum of all the entries of the matrix M is the number of arcs of the
graph.
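These properties are easy to check by machine. As a small illustration, here is a
C fragment, a sketch of ours rather than the book's, that recomputes the three
quantities for the 4-vertex adjacency matrix above (the 1-based vertex labels of
the text become 0-based array indices):

#include <stdio.h>

#define N 4

int main(void)
{
    /* adjacency matrix of the 4-vertex digraph above */
    int M[N][N] = {{1,1,0,1},{0,0,0,0},{0,2,0,0},{0,0,1,0}};
    int i, j, arcs = 0;

    for (i = 0; i < N; i++) {
        int out = 0, in = 0;
        for (j = 0; j < N; j++) {
            out += M[i][j];  /* row sum = out-degree of vertex i+1 */
            in  += M[j][i];  /* column sum = in-degree of vertex i+1 */
        }
        printf("vertex %d: out-degree %d, in-degree %d\n", i + 1, out, in);
        arcs += out;         /* the sum of all entries counts the arcs */
    }
    printf("number of arcs = %d\n", arcs);
    return 0;
}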
$$M = \begin{pmatrix} 1 & 2 & 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & -1 & -1 & 0 \\ 0 & 0 & 0 & 1 & 1 & -1 \\ -1 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$
(rows indexed by the vertices 1, 2, 3, 4 and columns by the arcs u1, u2, . . . , u6).
1. The sum of entries of every column except the ones which represent
loops is 0.
2. The sum of entries of every column representing a loop is 2.
3. The sum of entries of the ith row (not containing the entry 2) is
$d^+(i) - d^-(i)$.
Example 1.2
Consider the graph G of Figure 1.3 by ignoring the orientations of
the arcs and taking uj = ej . Then, the incidence matrix M of G is
$$M = \begin{pmatrix} 1 & 2 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$
(rows indexed by the vertices 1, 2, 3, 4 and columns by the edges e1, e2, . . . , e6).
2. The sum of the entries of the ith row of an incidence matrix is the
degree d(i) of the vertex i.
A vertex j appears k times in the list succ(i) if and only if there are k arcs
with initial vertex i and final vertex j, that is, the multiplicity of the arc (i, j)
is k.
When the sage points his finger at the Moon, the novice looks at the
finger.
Tamil Proverb
(since the sum of the out-degrees of the vertices of a graph is equal to the
number of arcs).
Hence, we see that the adjacency list representation uses only linear space
O(m + n), whereas the adjacency matrix needs a quadratic space O(n2 ).
#include <stdio.h>
#include <stdlib.h> /* for malloc */
#define max_n 20 /* maximum number of vertices */
#define max_m 100 /* maximum number of arcs */
struct node
{int v; struct node *next;};/* v for vertex */
int n, m; /*n = number of vertices. m = number of arcs */
struct node *succ[max_n]; /* graph is represented as an
array of pointers */
/* We work with a global graph. Otherwise the graph should be
declared as a variable parameter */
void adj_list ( )
{
int i, a, b; /* i for the loop. (a,b) is an arc. */
struct node *t; /* t, a temporary pointer */
printf("Enter two integers for n and m\n");
scanf("%d %d", &n, &m); /* read n and m */
/* initialize the graph with n vertices and 0 arcs */
for (i = 1; i <= n; i++) succ[i] = NULL;
/* read the m arcs */
for (i = 1; i <= m; i++)
{
printf("Enter the arc number %d ", i);
scanf("%d %d", &a, &b);/* an arc is an ordered pair of
vertices. b will be in succ(a) */
/* create a node referenced by t */
t = (struct node *) malloc(sizeof *t);
/* insert b at the head of the list succ[a] */
t->v = b;
t->next = succ[a];
succ[a] = t;
}
}
void print_list ( )
{
/* print_list writes the list of successors of
each vertex of the graph G*/
int i; /* i for the loop */
struct node *t; /* t, a temporary pointer */
for (i = 1; i <= n; i++)
{
/* write the list succ[i]*/
t = succ[i];
if (t == NULL)
printf(" No successors of %d\n ", i);
else
{
printf(" The successors of %d are :", i);
/* scan the list succ[i] and write the v fields of nodes */
while (t != NULL)
{
printf("%d ", t->v);
t = t->next; /* move t to next node */
}/*while*/
printf("\n");
}/*else*/
}/*for*/
}/* print_list*/
int main( )
{
adj_list ( ); /* call */
print_list ( ); /* call */
return 0;
}
#include <stdio.h>
#define max_n 20 /* maximum number of vertices */
int i, j, n, m;/* n, the number of vertices. m, the number of edges;
i, j for the loops */
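The companion routine adj_matrix that reads the graph is lost to a page break
in this copy; what follows is a minimal sketch of ours, consistent with the calls
in main below (it reuses the globals n, m, i, j and assumes the matrix is a
global array adj, as print_adj_matrix requires):

int adj[max_n][max_n]; /* the adjacency matrix, assumed global */

void adj_matrix ( )
{
int a, b; /* (a,b) is an arc */
printf("Enter two integers for n and m\n");
scanf("%d %d", &n, &m);
/* initialize the matrix with zeros */
for (i = 1; i <= n; i++)
  for (j = 1; j <= n; j++)
    adj[i][j] = 0;
/* read the m arcs, counting multiplicities */
for (i = 1; i <= m; i++)
{
printf("Enter the arc number %d ", i);
scanf("%d %d", &a, &b);
adj[a][b] = adj[a][b] + 1; /* one more arc from a to b */
}
}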
void print_adj_matrix ( )
{
int i, j;/* i, j for the loops */
for (i = 1; i <= n; i++)
{ for( j =1; j <= n; j++)
printf("%d ", adj[i][j]);
printf("\n");
}
}
int main ( )
{
adj_matrix( );/* call */
print_adj_matrix ( ); /* call */
return 0;
}
In the tree of Figure 1.7 with ten vertices, we observe the following
properties:
1. There is a unique elementary walk, that is, a path between any two
vertices.
2. The number of edges is exactly one less than the number of vertices.
We now prove the above two properties for any general tree.
Proposition 1.1. In any tree, there is a unique path joining any two vertices.
Proposition 1.2. In any tree with n vertices, the number of edges m is exactly
n − 1.
Proof. The proof proceeds by induction on the number of vertices of the tree.
Basis: If the number of vertices n = 1, then the tree consists of only one vertex
and no edges, that is, the number of edges m = 0. Hence the proposition is
trivially satisfied.
Induction hypothesis: Suppose the proposition is true for all trees with n
vertices, where n ≥ 1.
We shall prove the proposition for trees with n + 1 vertices. Consider a tree
T with n + 1 vertices. We claim that the tree T contains a vertex of degree
one. If not, the degree of each vertex is at least two and by Lemma 1.1, T
contains a cycle, a contradiction. Let x be a vertex of degree one in the tree
T. Then, the vertex deleted graph T − x is still a tree. Note that T and T − x
differ by exactly one vertex x and exactly one edge, the edge incident with
x. Now T − x is a tree on n vertices and the number of edges m of the tree
T − x, verifies, by induction hypothesis, the following equation,
m = n − 1.
Adding one to both sides, we get m + 1 = (n + 1) − 1, that is, the number
of edges of T is exactly one less than the number of vertices of T. Thus, the
proof is complete.
Spanning tree of a connected graph:
Consider a connected graph G. A spanning subgraph of G, which is a tree,
is called a spanning tree of G. In a connected graph, a spanning tree always
exists as we shall see below:
Since a tree is acyclic, we intuitively feel that if we remove all the cycles of
the graph G without disconnecting G, we will be left with a spanning tree. In
fact, this is the case.
Consider any elementary cycle of G and delete exactly one of its edges.
This results in a connected spanning subgraph G1 of G, since in an elementary
cycle, between any two vertices of the cycle, there are exactly two different
internally vertex-disjoint paths. If the resulting graph is a tree, then we have
obtained a desired spanning tree. Otherwise, we find an elementary cycle of
G1 and delete one of its edges. This gives us a connected spanning subgraph G2.
If G2 is a tree, then G2 is a desired spanning tree of G. Otherwise, we continue
the procedure as before till a spanning tree of G is found. If a graph G contains
a spanning tree, then clearly it must be connected. Thus, we have proved the
following result.
Theorem 1.1. A graph G has a spanning tree if and only if it is connected.
called networks. The vertices are interpreted as cities and the edges as high-
ways connecting its end vertices. The weight associated with an edge is inter-
preted as the cost of constructing a highway connecting its end vertices (cities).
The total cost of the network is the sum of the costs on each edge of G. For
example, the total cost of the figure G is 55 (see Figure 1.8).
Generate all possible spanning trees of G and find the total cost of each of these
spanning trees. Finally, choose one tree for which the total cost is minimum.
This procedure will work “quickly” if the number of vertices of the input graph
G is sufficiently small. If the number of vertices is not sufficiently small, this
procedure takes an enormous amount of time, that is, exponential time, since
by Cayley's theorem [1] there are $n^{n-2}$ possible non-identical trees on n given
vertices.
Observation 1.1. Trees can be grown by starting with one vertex (initializa-
tion) and adding edges and vertices one-by-one (iteration).
According to Observation 1.1, we start with any vertex, say the vertex 1. The
vertex 1 forms a tree T by itself. The algorithm chooses an edge e1 of the
graph G such that one end of the edge e1 lies in tree T and the other end
lies outside of the tree T (this is to avoid cycle in T ) with the weight of e1
minimum. Then, we add the edge e1 to the tree T. In our figure, the edge
e1 = 16. Then, we select an edge e2 such that one end of the edge e2 lies
in the tree T under construction, and the other end lies outside of the tree
T with the weight of e2 minimum. In our case, the edge e2 = 17 and so on.
The different steps of the tree-growing procedure are illustrated in Table 1.1.
In Table 1.1, the set S denotes the vertices of the tree T under construction.
T denotes the set of edges of the tree which is grown by the algorithm, s and
t denote vertices of the graph G with s ∈ S and t ∈ X \ S, and X represents
the set of vertices of the graph G. The procedure ends as soon as the tree T
under construction contains all of the vertices of the graph G. The edges of
a minimum spanning tree obtained are found in the 6th column of iteration
number 6 in Table 1.1. The minimum spanning tree obtained by Table 1.1 is
shown in Figure 1.9 and its cost is the sum of the costs of its edges which
is 27. Now let us write the algorithm in pseudo-language.
We shall now consider the instruction number (1) of the while loop of Prim.
This statement can be implemented in O(n) time. Hence, each execution of
the while loop demands O(n) time (Recall that when we compute complexity,
the lower-order terms and multiplicative constants can be neglected [6]).
Therefore, the complexity of Prim's algorithm is (n − 1)O(n) = O(n²), which
is a polynomial of degree 2.
edge e = xy, then we are done. Otherwise, the tree T surely will contain an
edge e′ = x′y′ having one end x′ in S and the other end y′ outside of S, for,
if not, the tree T would not be connected. (In fact, more generally, a graph is
connected if and only if for any partition of the vertex set into two proper
subsets, there is an edge having one end in one subset of the partition and the
other end in the other subset of the partition.) Since the weight of the edge
e is a minimum among all edges joining S and its complement, we have
the weight of e′ ≥ the weight of e.
Now consider the spanning subgraph obtained by adding the edge e to
the tree T, that is, consider the graph G′ = T + e. Then, G′ contains exactly
one elementary cycle. This is because of the following argument: By Propo-
sition 1.1, in the spanning tree T, there is a unique path P joining the end
vertices x and y of the edge e. But then, G′ = T + e contains the unique ele-
mentary cycle P + e (a path plus the edge joining the initial and final vertices
of the path is an elementary cycle) (see Figure 1.10). Since the path P starts
at x in S and ends at y outside of S, we may choose the edge e′ = x′y′ above
on the path P.
Now form the graph T′ = G′ − e′. T′ is still connected, because the end
vertices x′ and y′ of the deleted edge e′ are connected by the path P + e − e′
in the graph T′. Moreover, T′ is a tree, because the only cycle P + e of G′ is
destroyed by the removal of the edge e′ of this cycle. Note that the trees T and
T′ differ by only two edges e and e′. Since the weight of the edge e ≤ the weight
of the edge e′, we have the weight of the tree T′ ≤ the weight of the tree T.
Since T is a minimum weight spanning tree, we must have equality of the
weights of T and T′. Thus, we have constructed a minimum spanning tree T′
containing the edge e.
of any edge. We define two arrays dad and cost_min. dad[i] gives the vertex
in S that is currently “closest” (with respect to the cost of edges) to vertex i
in X \ S. cost_min[i] = the cost of the edge (i, dad[i]).
/* initialization of M */
for (i=1;i<=n;i++)
  for (j=1;j<=n;j++)
    M[i][j]=INT_MAX;
/* end of initialization */
for (i=1;i<=n;i++)
{
  for (j=1;j<=n;j++)
    printf ("%5d ",M[i][j]);
  printf ("\n\n\n");
}
}
//begin Prim
void prim()
{ int dad[n_max], cost_min[n_max];
//array dad represents the tree under construction
int i,j,k,min;
//initialization: the tree starts with the single vertex 1
for(i=2;i<=n;i++)
{
  dad[i]=1;
  cost_min[i]=M[1][i];
}
cost_min[1]=-1;//-1 marks a vertex already in the tree
//iteration: add the remaining n-1 vertices one by one
for(i=2;i<=n;i++)
{//find a vertex k outside the tree, of minimum cost_min,
 //to be added to the tree
  k=0;
  min=INT_MAX;
  for(j=2;j<=n;j++)
    if((cost_min[j]>=0)&&(cost_min[j]<min))
    {
      k=j;
      min=cost_min[j];
    }
  //print edge
  printf("%d %d\n",k,dad[k]);
  cost_min[k]=-1;//k is added to tree
  //update arrays cost_min, dad
  for(j=2;j<=n;j++)
    if((cost_min[j]>=0)&&(M[k][j]<cost_min[j]))
    {
      cost_min[j]=M[k][j];
      dad[j]=k;
    }
}
}
//end of prim
int main ()
{
graph () ;
print_graph () ;
prim();
return 0;
}
Kruskal’s Algorithm
procedure Kruskal( var G: Graph; var F: Set of Edges);
(* Kruskal takes as input a connected weighted graph G and
outputs a minimum spanning tree F*)
var PX : Partition of the vertex set X;
(* PX denotes the partition of the vertex set X, induced by
the connected components of the forest F*)
e : Edge; L : List of Edges;
x,y : Vertex;(* x, y are the ends of edge e*)
begin (* Kruskal *)
(* initialization of PX and F *)
PX := empty_set; F := empty_set;
Form a list L of the edges of G in increasing order of cost;
(* Forest consists of isolated vertices and no edges *)
for each vertex x of the graph G do
add {x} to PX;
(* Iteration *)
while | PX | > 1 do
begin (* while *)
(1) choose an edge e = x y, an edge of minimum cost from the
list L;
(2) delete the edge e from the list L;
(3) if x and y are in two different sets S1 and S2 in PX
then begin
(4) replace the two sets S1 and S2 by their union S1
U S2;
add the edge x y to F;
end;
end;(* while *)
end;
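Steps (3) and (4) hinge on maintaining the partition PX efficiently. A
minimal union-find sketch in C of one standard way to do this (the names
make_sets, find_set, union_sets and the array parent are ours, not the
book's):

#define max_n 20

int parent[max_n + 1]; /* parent[x] = x when x represents its own set */

/* each vertex starts in its own singleton set */
void make_sets(int n)
{ int x;
  for (x = 1; x <= n; x++) parent[x] = x;
}

/* climb to the representative of the set containing x */
int find_set(int x)
{ while (parent[x] != x) x = parent[x];
  return x;
}

/* merge the sets of x and y; returns 1 if they were different
   (the edge xy is kept), 0 if equal (xy would close a cycle) */
int union_sets(int x, int y)
{ int rx = find_set(x), ry = find_set(y);
  if (rx == ry) return 0;
  parent[rx] = ry;
  return 1;
}

With this, step (3) becomes find_set(x) != find_set(y) and step (4) a single
call union_sets(x, y); path compression and union by rank would bring the
cost close to linear, but even this naive version keeps Kruskal's loop simple.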
Note that the root r comes first in preorder and last in postorder. In inorder,
the root r comes after the listing of the vertices of the first subtree of r [3].
greedy algorithm. We use “cheapest cost path” as a synonym for “shortest
path.” In this section, “path” always means a “directed path.”
In an arborescence, there is a unique path from the root to all the other ver-
tices. In fact, Dijkstra’s algorithm constructs a spanning arborescence of the
given graph G having its root as the source of the graph G and its unique path
from the root to a vertex x as a cheapest path in the graph G. In the algorithm,
we use the variable S denoting the set of vertices of a cheapest arborescence
under construction. (Note that in Prim’s algorithm also the variable S denotes
the set of vertices of a minimum spanning tree under construction.)
The variable S satisfies the following two properties, which can be consid-
ered as loop invariant:
1. For all vertices s in the set S, a cheapest cost of a path (also called
shortest path) from the source 1 to s is known and this path lies entirely
in the set S, that is, all the vertices and the arcs of a shortest path
from 1 to s lies fully in S. This condition implies that the set S must
contain the source vertex 1. In fact, the assignment S := {1} is used as
initialization of the set S.
2. For a vertex t outside of the set S (if such a vertex exists), a path relative
to the set S is a path from the source vertex 1 to the vertex t such that
all the vertices of this path lie completely in the set S but for the last
vertex t. We shall now state the second property of the set S in terms
of paths relative to the set S. For each vertex t outside of the set S, a
shortest path from 1 to t relative to S lies completely in the set S except
for the final vertex t.
Interpretation of two properties of S when S equals the whole of the vertex set
X of G:
According to the first property of S = X, a shortest path from the origin
vertex 1 to any vertex x of the graph is known, and we have solved the
problem! Note that the second property is not applicable, since X \ S = ∅.
Hence, the terminal condition of the algorithm is S = X.
In Dijkstra's algorithm, the following elementary observation is used:
A subpath of a shortest path is a shortest path!
Note that the above observation will not be valid if the weights of the arcs
are not non-negative numbers. Another property on which Dijkstra’s algo-
rithm works is the following: This is the point where the algorithm becomes
greedy!
Consider a shortest path P among the shortest paths relative to the set
S. Then, the path P is a shortest path in G itself.
Let us write Dijkstra’s algorithm in pseudo-code. The weighted graph is
represented by its weighted adjacency matrix $W = (w_{ij})_{n \times n}$, where
$$w_{ij} = \begin{cases} 0 & \text{if } i = j,\\ \text{weight of the arc } (i,j) & \text{if } (i,j) \text{ is an arc},\\ \infty & \text{otherwise.} \end{cases}$$
Remark 1.1. The vertex y in Dijkstra’s algorithm is called the pivot vertex.
Of course, the pivot changes at each iteration.
Let us now execute Dijkstra’s algorithm on the graph of Figure 1.14. The
following Table 1.4 traces the different steps of the algorithm.
The final result is found in columns 5, 6, 7, 8, 9 of iteration number 5 in
Table 1.4.
1. The vertices are added to the set S in order of their increasing costs
of D.
2. Once a vertex x is added to the set S, the value of D[x] remains
constant till the end of the algorithm. Hence the set S can be interpreted
as the set of stabilized vertices.
1. For each vertex, s in S, the cost D[s] of a cheapest path from the vertex 1
to the vertex s is known, and this path lies entirely within the set S.
2. For all t outside of the set S, the cost D[t] of the cheapest path relative
to S is known.
through the vertex s, then goes directly to the vertex z, then again goes
possibly inside the set S and finally reaching the vertex y. The directed path
from the vertex 1 to the vertex y relative to the set S goes through the vertex t
and then directly reaches the vertex y from the vertex t through an arc.
Let z be the first vertex of path P outside of S. Then, the subpath P (1, z)
of P from 1 to z is a path relative to S and its cost D[z] is clearly strictly less
than the length of the path P which is strictly less than the number D[y], by
induction hypothesis. This is impossible since D[y] was a minimum among all
vertices outside of the set S during the (k + 1)-th iteration (statement number
(1) of the procedure Dijkstra). Note that we use here the fact that the costs
of the arcs are non-negative numbers. Otherwise, our argument will not be
valid. Hence the first property of the set S after the iteration number k + 1.
The second property of S remains true even after adding the pivot ver-
tex, y, because of statement number (4) of the procedure Dijkstra of Section
1.3.3, which adjusts the cost of each D[x], x ∈ X \ S to take into account the
arrival of the pivot vertex into the set S.
Recovering paths in Dijkstra’s algorithm:
Dijkstra’s algorithm finds the cost of a cheapest path from the source vertex 1
to all vertices of a weighted digraph. We are now interested in finding paths
realizing the minimum costs.
An arborescence can be conveniently represented by an array “dad” where
dad[i] gives the father of the vertex i. Stated differently, dad[i] = j if and only
if (j, i) is an arc of the arborescence. Initially, the paths relative to S = {1}
are the arcs (1, i). Hence, we add the following loop in the initialization part:
for i := 2 to n do
dad[i] := 1; (* The vertex 1 is the root of the arborescence
and hence has no dad*)
for each x in X - S do
if D[y] + w(y, x) < D[x]
then
begin
D[x] := D[y] + w(y, x);
dad[x] := y;(* update dad of x*)
end;
Once we have recorded the dads of the different vertices, a shortest path from
the vertex 1 to any vertex i can be printed by “climbing” up the arborescence
from the vertex i till the root 1. The following procedure prints the vertices
of a shortest path from the vertex 1 to the vertex i. The vertices encountered
while climbing up the tree are pushed onto a stack and finally popped to get
the right order of vertices on the path from the vertex 1 to the vertex i.
Dijkstra’s algorithm in C:
#include<stdio.h>
#include<stdlib.h>
#include<limits.h>
#define nmax 10//maximum vertices
int M[nmax][nmax];//cost matrix
int n,m;//number of vertices and arcs
void dijkstra(){
int D[n+1],S[n+1],dad[n+1],stack[n+1];
//S[i]=1 if i in S, otherwise S[i]=0.
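/* The body of dijkstra() spans a page break in the source; what follows
   is a reconstruction sketched from the pseudocode of Section 1.3.1 and
   the loop invariant above, not the authors' verbatim listing. */
int i,j,y,min,head=0;
/* initialization: S = {1}, D[i] = M[1][i], dad[i] = 1 */
S[1]=1; D[1]=0; dad[1]=1;
for(i=2;i<=n;i++)
{ S[i]=0; D[i]=M[1][i]; dad[i]=1; }
/* iteration: add the remaining n-1 vertices to S one by one */
for(i=2;i<=n;i++)
{ /* (1) find a pivot y outside S with D[y] minimum */
  min=INT_MAX; y=0;
  for(j=2;j<=n;j++)
    if(S[j]==0&&D[j]<min){ min=D[j]; y=j; }
  if(y==0) break; /* the remaining vertices are unreachable */
  S[y]=1; /* (2) the pivot y is added to S */
  /* (3)-(4) adjust D[x] and dad[x] for each x outside S */
  for(j=2;j<=n;j++)
    if(S[j]==0&&M[y][j]<INT_MAX&&D[y]+M[y][j]<D[j])
    { D[j]=D[y]+M[y][j]; dad[j]=y; }
}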
/* print the costs D[2..n] */
for (i=2;i<=n;i++)
  printf("%5d ",D[i]);
printf("\n");
/* print a shortest path from 1 to each vertex by climbing the dads */
for(j=2;j<=n;j++)
{
  i=j; head=0;
  while(i!=1)
  { stack[++head]=i;
    i=dad[i];
  }
  stack[++head]=i;//i=1
  //print the path from 1 to j
  printf("the path is ");
  for(i=head;i>0;i--)
    printf("%d ",stack[i]);
  printf("\n");
}
}
void graph(){
int i,j,x,y,c;
printf("enter n",\n);
scanf("%d",&n);
printf("\n");
printf("enter m",\n);
scanf("%d",&m);
for(i=1;i<=n;i++)
for(j=1;j<=n;j++)
{
M[i][j]=INT_MAX;
if(i==j)
M[i][j]=0;
}
for(i=1;i<=m;i++)
{
printf("enter the arc %d",i) ;
scanf("%d %d",&x,&y) ;
printf("enter the cost %d",i) ;
scanf("%d",&c) ;
M[x][y]=c;
}}
void print_graph()
{
int i,j;
for(i=1;i<=n;i++)
{
for(j=1;j<=n;j++)
printf("%5d ",M[i][j]);
printf("\n\n\n");
}
}
int main()
{
graph();
print_graph();
dijkstra();
return 0;
}
Fact 1.1. In the case of graphs with negative-weight arcs, the “greedy” state-
ment that the shortest path among the shortest paths relative to S is a shortest
path in G is no longer valid!
The following algorithm computes the cost of a shortest path from the
source vertex 1 to all other vertices of the graph. As in Dijkstra’s algorithm
for graphs with non-negative weights, we use the variable S, the array D,
the vertex variables x and y, and the integer variable i.
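The listing of this algorithm does not survive in this copy. For orientation,
here is a sketch in C of the classical Bellman-Ford relaxation scheme, which
solves the same single-source problem when arcs may carry negative weights
but the graph has no circuit of negative total cost; this is a substitute
illustration of ours, not the authors' modified Dijkstra (the names M, n,
nmax follow the code of the previous section):

#include <limits.h>
#define nmax 10

int M[nmax][nmax]; /* cost matrix; INT_MAX stands for "no arc" */
int n;             /* number of vertices */

/* D[j] ends up as the cheapest cost of a dipath 1 -> j, provided
   the graph contains no circuit of negative total weight */
void bellman_ford(int D[])
{
    int i, x, y;
    for (i = 1; i <= n; i++) D[i] = INT_MAX;
    D[1] = 0;
    /* an elementary path has at most n-1 arcs: relax n-1 times */
    for (i = 1; i <= n - 1; i++)
        for (x = 1; x <= n; x++)
            for (y = 1; y <= n; y++)
                if (D[x] != INT_MAX && M[x][y] != INT_MAX
                        && D[x] + M[x][y] < D[y])
                    D[y] = D[x] + M[x][y];
}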
Algorithm 1.3 (Floyd’s Algorithm). For a directed path from the vertex
i to the vertex j, the set of vertices of the path other than the initial vertex of
the path i and the final vertex of the path j are called its intermediate vertices.
The algorithm constructs a sequence of n + 1 matrices $M_0, M_1, \ldots, M_n$,
where $M_k = (m_{ij}^{(k)})_{n \times n}$ and $m_{ij}^{(k)}$ = the cost of a shortest
directed path from the vertex i to the vertex j not passing through any of the
intermediate vertices k + 1, k + 2, . . . , n, that is, the cost of a cheapest dipath
from i to j where the set of its intermediate vertices is a subset of the set
{1, 2, . . . , k}.
Interpretation of $M_0$:
$M_0 = (m_{ij}^{(0)})_{n \times n}$, where $m_{ij}^{(0)}$ is the weight of a shortest dipath from the ver-
tex i to the vertex j not passing through any of the intermediate vertices
1, 2, . . . , n. Since the vertex set of the graph is X = {1, 2, . . . , n}, this means
that $m_{ij}^{(0)}$ is the weight of a shortest dipath not using any intermediate vertices
at all, that is, $m_{ij}^{(0)}$ is the weight of the arc (i, j), if (i, j) is an arc. Hence,
$M_0 = W$, the weighted adjacency matrix. This assignment is used as the
initialization in Floyd's algorithm.
Interpretation of $M_n$:
By definition, $M_n = (m_{ij}^{(n)})$, where $m_{ij}^{(n)}$ is the weight of a shortest path from
the vertex i to the vertex j not passing through any intermediate vertices of
the set {n + 1, n + 2, . . .}. But this set is the empty set ∅, because the vertex set
of the graph is X = {1, 2, . . . , n}. Hence, $m_{ij}^{(n)}$ is simply the cost of a shortest
dipath whose internal vertices are drawn from the vertex set {1, 2, . . . , n} of the
graph G. This means that $m_{ij}^{(n)}$ is the cost of a shortest dipath in the graph G.
Therefore, the matrix $M_n$ is the desired result of the algorithm.
We shall now see how to find the matrix $M_k = (m_{ij}^{(k)})$, given the matrix
$M_{k-1} = (m_{ij}^{(k-1)})$, that is, how to move from the (k − 1)-th iteration to the
k-th iteration.
To do this, we again exploit the definition of the matrices $M_{k-1}$ and $M_k$.
Consider a shortest cost path $P_{k-1}(i, k)$ from i to k not passing through any
intermediate vertices k, . . . , n and a shortest cost path $P_{k-1}(k, j)$ from the
vertex k to the vertex j not passing through any intermediate vertices k, . . . , n.
For the path $P_{k-1}(i, k)$, k is the final vertex, and for the path $P_{k-1}(k, j)$ the
vertex k is the initial vertex. The concatenation or juxtaposition of the dipaths
$P_{k-1}(i, k)$ and $P_{k-1}(k, j)$, taken in this order, gives us either a directed
elementary path from i to j with k as an intermediate vertex (see Figure 1.18)
or a directed walk from i to j containing a circuit. This leads to the recurrence
$$m_{ij}^{(k)} = \min\left(m_{ij}^{(k-1)},\; m_{ik}^{(k-1)} + m_{kj}^{(k-1)}\right), \quad 1 \le i, j \le n.$$
Example 1.3
Execution of Floyd’s algorithm:
Consider the graph of Figure 1.19.
The graph is represented by the weighted adjacency matrix W where
$$W = \begin{pmatrix} 0 & 7 & 10 & \infty & \infty \\ \infty & 0 & 2 & \infty & 4 \\ \infty & \infty & 0 & \infty & 8 \\ \infty & \infty & 8 & 0 & \infty \\ 3 & \infty & \infty & 3 & 0 \end{pmatrix}$$
(rows and columns indexed by the vertices 1, 2, . . . , 5).
The indices in the entry represent the last iteration number that changed
the entry. These indices are useful in finding the paths realizing the min-
imum costs. Initially, M0 = W. Can we use the vertex 1 as an interme-
diate vertex to reduce the entries of the matrix M1 ?
Iteration 1:
$$M_1 = \begin{pmatrix} 0 & 7 & 10 & \infty & \infty \\ \infty & 0 & 2 & \infty & 4 \\ \infty & \infty & 0 & \infty & 8 \\ \infty & \infty & 8 & 0 & \infty \\ 3 & 10_1 & 13_1 & 3 & 0 \end{pmatrix}.$$
Can we use some vertices of the set {1, 2} as intermediate vertices to
reduce the entries of the matrix M1 ?
Iteration 2:
$$M_2 = \begin{pmatrix} 0 & 7 & 9_2 & \infty & 11_2 \\ \infty & 0 & 2 & \infty & 4 \\ \infty & \infty & 0 & \infty & 8 \\ \infty & \infty & 8 & 0 & \infty \\ 3 & 10_1 & 12_2 & 3 & 0 \end{pmatrix}.$$
With this definition, the matrix INTER of the graph Figure 1.19 is given by
$$\mathrm{INTER} = \begin{pmatrix} 0 & 0 & 2 & 5 & 2 \\ 5 & 0 & 0 & 5 & 0 \\ 5 & 5 & 0 & 5 & 0 \\ 5 & 5 & 0 & 0 & 3 \\ 0 & 1 & 4 & 0 & 0 \end{pmatrix}.$$
This matrix INTER will be used to recover paths realizing the cheapest costs.
Note that if an (i, j) entry of the matrix INTER is 0, then the path realizing
the minimum cost from the vertex i to the vertex j is just the arc (i, j), with
no intermediate vertices.
For example, the cost of a cheapest path from the vertex 4 to the vertex
2 is the (4,2) entry of the matrix M5 , which is 26. What are the intermediate
vertices of a path from 4 to 2 realizing this cost 26? To find the intermediate
vertices, we read first the (4,2) entry of the matrix INTER which is 5. This
means that 5 is an intermediate vertex between 4 and 2. Now we read the
entry (4,5) of the matrix INTER to find an intermediate vertex between 4
and 5 which is 3. Hence, 3 is an intermediate vertex between 4 and 5. Next,
we read the entry (4,3), which is 0. This means that there is no intermediate
vertex between 4 and 3. We read the entry (3,5) which is again 0. We read the
entry (5,2) which is 1. This means that 1 is an intermediate vertex between
5 and 2. Now the entries (5,1) and (1,2) are 0 meaning that no intermediate
vertices are found between 5,1 and 1,2.
Hence, the intermediate vertices of a cheapest path from 4 to 2 are 3, 5, 1
and the cheapest cost path is (4, 3, 5, 1, 2).
Let us now write an algorithm to find the matrix INTER. To do this, we
have only to initialize the matrix INTER and update the entries of INTER
at the appropriate point in Floyd’s algorithm.
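That algorithm falls on a page break here; a sketch in C of Floyd's triple
loop with the INTER bookkeeping just described (the names M, INTER and
n follow the text; representing ∞ by INT_MAX is our convention):

#include <limits.h>
#define nmax 10

int M[nmax][nmax];     /* on entry M = W; on exit M = Mn */
int INTER[nmax][nmax]; /* INTER[i][j] = an intermediate vertex on a
                          cheapest path from i to j, 0 if none */
int n;

void floyd(void)
{
    int i, j, k;
    /* initialization: no intermediate vertices found yet */
    for (i = 1; i <= n; i++)
        for (j = 1; j <= n; j++)
            INTER[i][j] = 0;
    for (k = 1; k <= n; k++)            /* iteration k builds Mk */
        for (i = 1; i <= n; i++)
            for (j = 1; j <= n; j++)
                if (M[i][k] != INT_MAX && M[k][j] != INT_MAX
                        && M[i][k] + M[k][j] < M[i][j])
                {
                    M[i][j] = M[i][k] + M[k][j];
                    INTER[i][j] = k;    /* k improved the (i,j) entry */
                }
}

On the graph of Figure 1.19 this update rule reproduces the INTER matrix
displayed above, for instance INTER[4][2] = 5 and INTER[4][5] = 3.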
Example 1.4
Hand simulation of the procedure call “interpath(4, 2)”:
(see the tree of Figure 1.20).
The call interpath(4, 2) results in two calls: interpath(4, 5) and inter-
path(5, 2).
The call interpath(4, 5) leads to two calls interpath(4, 3) and inter-
path(3, 5). The calls interpath(4, 3) and interpath(3, 5) lead to no calls,
since INTER[4, 3] = INTER[3, 5] = 0. This results in printing of the
vertices : 3, 5
Now interpath(5, 2) calls interpath(5, 1) and interpath(1, 2).
The calls interpath(5, 1) and interpath(1, 2) lead to no calls, since
INTER[5,1] = INTER[1, 2] = 0. This leads to printing of the vertex: 1.
Hence, the intermediate vertices of a shortest path from the vertex
4 to the vertex 2 are 3, 5, 1.
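The recursive procedure interpath traced above is likewise not reproduced in
this copy; a minimal C sketch with the behavior described (it prints the
intermediate vertices of a cheapest path from i to j, endpoints excluded, and
assumes the INTER matrix of the previous sketch and <stdio.h>):

/* prints the intermediate vertices of a cheapest path from i to j */
void interpath(int i, int j)
{
    int k = INTER[i][j];
    if (k == 0) return;  /* (i, j) is an arc: no intermediate vertex */
    interpath(i, k);     /* first the intermediate vertices between i and k */
    printf("%d ", k);    /* then k itself */
    interpath(k, j);     /* finally those between k and j */
}

The call interpath(4, 2) then prints 3 5 1, matching the hand simulation
above.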
The eccentricities of the different vertices of the graph of Figure 1.21 are
given in Table 1.7:

vertex  1 2 3 4 5
e       5 9 7 4 3
Example 1.5:
Radius, Diameter and Center of graph of Figure 1.21.
The graph is represented by its weighted adjacency matrix W where
$$W = \begin{pmatrix} 0 & 2 & \infty & 3 & \infty \\ \infty & 0 & 3 & \infty & \infty \\ \infty & \infty & 0 & 3 & \infty \\ \infty & \infty & \infty & 0 & 2 \\ 1 & 2 & 2 & 3 & 0 \end{pmatrix}$$
(rows and columns indexed by the vertices 1, 2, . . . , 5).
Let us now write an algorithm to find the radius, diameter, and center of a
graph.
(* for each vertex i, find its eccentricity e[i] *)
for i := 1 to n do
begin
(* find e[i] *)
e[i] := M[i, 1];(* initialization of e[i] *)
(* iteration *)
for j := 2 to n do
if e[i] < M[i,j] then
e[i] := M[i,j];
end;
(* We have found the eccentricity table e *)
(* Initializing r and d *)
r := e[1]; d := e[1];
(* iteration *)
for i:= 2 to n do
begin
if r > e[i] then
r := e[i];
if d < e[i] then
d := e[i];
end;
(* finding the center C*)
(* initialization of C*)
for i: = 1 to n do
C[i] := 0; (* C is empty *)
(* iteration *)
for i := 1 to n do
if r = e[i] then
C[i] := 1;
(* print center *)
(*C[i] = 1 if i in C, 0 otherwise *)
for i := 1 to n do
if C[i] =1 then write (i, ’ ’);
Example 1.6
Transitive closure of a graph: Consider the graph G of Figure 1.22.
This graph is equivalently represented by its adjacency matrix M
where
$$M = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \end{pmatrix}$$
(rows and columns indexed by the vertices 1, 2, . . . , 5).
Interpretation of $M_n = (m_{ij}^{(n)})$:
By definition, $m_{ij}^{(n)}$ is equal to 1 if there is a directed path from i to j not
going through any intermediate vertex > n, and is equal to 0 otherwise. Since
no vertex of the graph is > n, this means that $m_{ij}^{(n)}$ is equal to 1 if there is a
directed path from i to j in G, and is equal to 0 otherwise.
This means that $M_n$ is the desired output matrix TM. The subscript k in
$M_k$ represents the iteration number.
Induction leap:
How do we find the matrix $M_k$, given the matrix $M_{k-1}$?
We are given the $n^2$ entries $m_{ij}^{(k-1)}$ of the matrix $M_{k-1}$. We have to express
$m_{ij}^{(k)}$ in terms of $m_{ik}^{(k-1)}$ and $m_{kj}^{(k-1)}$. This is because during the construction
of $M_k$ we have the right to use the vertex k as an intermediate vertex, which
is not the case with the matrix $M_{k-1}$.
We must not disturb the entries “one” of the matrix Mk−1 , since a directed
path from i to j not going through any intermediate vertex > k − 1 is also a
directed path not passing through any intermediate vertex > k. We are only
interested in a possible reassignment of zero coefficients of the matrix Mk−1
into 1. Since the concatenation of a path from i to k not going through any
intermediate vertex > k − 1 and a path from k to j not passing through any
intermediate vertex > k − 1 is a path from the vertex i to the vertex j not
passing through any intermediate vertex > k, we have the following formula
for $m_{ij}^{(k)}$:
$$m_{ij}^{(k)} = \max\left(m_{ij}^{(k-1)},\; m_{ik}^{(k-1)} \times m_{kj}^{(k-1)}\right), \quad \text{for } 1 \le i, j \le n.$$
By setting j = k in the above formula, we have $m_{ik}^{(k)} = \max(m_{ik}^{(k-1)},
m_{ik}^{(k-1)} \times m_{kk}^{(k-1)}) = m_{ik}^{(k-1)}$, and similarly by setting i = k we get
$m_{kj}^{(k)} = m_{kj}^{(k-1)}$.
Hence, we can carry out our computation with only one copy of the
matrix M . Let us describe the above formula in words:
There is a directed path from the vertex i to the vertex j not going through
any intermediate vertex > k if:
1. There is already a path from the vertex i to j not going through any
intermediate vertex > k − 1 (this is equivalent to saying: we must not
modify the entries “one” of the matrix Mk−1 ) or
2. There is a path from i to k not passing through any intermediate ver-
tex > k − 1 and a path from k to j not going through any intermediate
vertex > k − 1.
We shall now write Warshall’s algorithm. We use the type “Matrix” to repre-
sent n × n matrices of 0 and 1.
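The listing itself falls on a page break here; a compact C sketch of Warshall's
algorithm as just described, assuming the 0/1 matrix is stored in a global
array M that is overwritten in place with TM (the text shows above why
overwriting in place is safe):

#define nmax 10
int M[nmax][nmax]; /* 0/1 adjacency matrix; becomes TM on exit */
int n;

void warshall(void)
{
    int i, j, k;
    for (k = 1; k <= n; k++)       /* allow k as an intermediate vertex */
        for (i = 1; i <= n; i++)
            for (j = 1; j <= n; j++)
                /* m_ij^(k) = max(m_ij^(k-1), m_ik^(k-1) * m_kj^(k-1)) */
                if (M[i][j] == 0 && M[i][k] == 1 && M[k][j] == 1)
                    M[i][j] = 1;
}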
Example 1.7
Execution of Warshall’s algorithm: Consider the graph which is an
elementary circuit of length 4 with vertex set {1, 2, 3, 4} and the arc set
{(1, 2), (2, 3), (3, 4), (4, 1)}. The adjacency matrix of the above graph is
$$M = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}$$
(rows and columns indexed by the vertices 1, 2, 3, 4).
After iteration 2, that is, with intermediate vertices drawn from {1, 2}:
$$TM = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}$$
After iteration 3, with intermediate vertices drawn from {1, 2, 3}:
$$TM = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}$$
After iteration 4, every vertex may serve as an intermediate vertex, and
we obtain
$$TM = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}$$
Depth-first search:
The depth-first search of a graph can be viewed as a generalization of preorder
traversal of a tree. If the graph does not possess a cycle, then the depth-first
search (dfs) of a graph coincides with the preorder traversal of a tree.
Let us first explain the dfs in an intuitive manner. Consider a connected
multigraph G. We start with a vertex, called the starting vertex, which will
become the root of a spanning tree to be generated by dfs. From the starting
Example 1.8
dfs: Consider the following simple graph G of Figure 1.24 consisting
of two connected components.
The graph G is represented by the adjacency lists. In each adjacency
list, the vertices are listed in increasing order. We use an array “mark” to
record the vertices processed by the dfs. Initially, for each vertex i, we have
mark[i] = 0. We set mark[i] ← 1 as soon as the vertex i is reached
during the dfs. (0 means “unprocessed” or “unvisited” and 1 means
“processed” or “visited.”)
L(1) = (4, 6); L(2) = (5); L(3) = (6, 7); L(4) = (1, 5, 6); L(5) =
(2, 4, 6); L(6) = (1, 3, 4, 5, 7); L(7) = (3, 6); L(8) = (9, 10); L(9) = (8, 10); L(10) =
(8, 9). Now let us take a vertex, say, the vertex 1, as the starting vertex
and set mark[1] ← 1. The adjacency list of 1, L(1) is scanned to find
a vertex marked 0. The very first vertex 4 in L(1) is chosen and we set
mark[4] ← 1. The traversing of the list L(1) is temporarily suspended
and we start scanning the list L(4) to find a vertex “unprocessed.” The
vertex 5 is found and we set mark[5] ← 1. Then, again the scanning
of the list L(4) is temporarily suspended and we start scanning the list
L(5) and we find the vertex 2 and we assign mark[2] ← 1. We now scan
the list L(2) in search of an unprocessed vertex. No new vertices are found
by scanning L(2) and we climb up the tree to find the vertex 5 which is
the “dad” of vertex 2, that is, dad[2] ← 5. We now restart the scanning
of the list L(5), which was suspended previously. We find the vertex 6,
then the vertex 3 and the vertex 7. Now all the vertices of the connected
component containing the vertex 1 have been “processed.” We now con-
sider the second connected component and “visit” each vertex according
to dfs. (See the forest F of Figure 1.25. The forest F is defined by the
solid edges.)
The edges of the forest F are drawn as continuous edges, and the
edges of the graph G not in the forest are depicted as dotted edges.
The children/sons/daughters of a vertex are drawn from left to right. A
vertex j is called a descendant of a vertex i, if there is a downward path
from the vertex i to the vertex j in a tree of the forest generated by dfs.
In this case, we also say that the vertex i is an ancestor of the vertex
j. We consider a vertex x as an ancestor and descendant of itself. A
descendant or ancestor of a vertex x other than itself is called a proper
descendant or proper ancestor of x.
We associate to each vertex i of the graph, called the depth-first
search number(dfsn), where df sn[i] = j if and only if i is the jth vertex
visited during the dfs. Table 1.8 gives the dfsn of each vertex of the
above forest F.
vertex  1 2 3 4 5 6 7 8 9 10
dfsn    1 4 6 2 3 5 7 8 9 10
dfs algorithm:
(* initialization *)
for i := 1 to n do
mark[i] := 0;
F := empty;(* F, the forest under construction *)
counter := 1; (* counter is used to assign the dfs number *)
(* end of initialization *)
procedure dfs( i : integer);
(* dfs visits each vertex of a connected component of G
containing i using depth-first search *)
var j : integer;(* vertex j scans L(i) *)
begin
mark[i] := 1;
dfsn[i] := counter;
counter := counter + 1;
(* scan L(i) *)
for each vertex j in L(i) do
if mark[j] = 0 then
begin
add the edge ij to F;
dfs(j); (* recursive call *)
end;
end;(* dfs *)
(* iteration *)
for i := 1 to n do
if mark[i] = 0 then
dfs(i); (* invoking dfs *)
The edges of the graph G which are not in the spanning forest induced
by dfs are the back edges. If the graph is connected, then the dfs induces a
spanning tree, called the dfs spanning tree.
Let xy be an edge of the dfs spanning forest, that is, forest induced by
dfs, with y, a son of x. Then, dfs(x) called dfs(y) directly during the dfs. In
other words, the vertex y is marked zero, when we scan the list L(x), the list
of vertices adjacent to x.
We distinguish two types of calls:
1. Direct calls
2. Indirect calls
dfs(x) called directly dfs(y) if and only if the edge xy is an edge of the spanning
forest constructed by dfs, that is, x is the father/mother of y in the spanning
forest.
Note also that when we are at vertex x, scanning the list L(x), and z
in L(x), that is, xz is an edge (and hence, x is also in the list L(z)), with
mark[z] = 1, we cannot simply put the edge xz in the set of back edges,
because the vertex z can be the father/parent of the vertex x.
If xy is a back edge, then neither dfs(x) nor dfs(y) called the other directly,
but one called the other indirectly, that is, dfs(y) called dfs(z), which in turn
called dfs(x), and hence y is an ancestor of x.
Remark 1.2. The dfs spanning forest need not be unique. A dfs forest gen-
erated by the search procedure dfs depends on the starting vertices, that is, the
roots of each tree in the spanning forest and the order of the vertices in each
list L(i).
Remark 1.3. The order in which the different calls terminate in the dfs is
the postorder traversal of the dfs tree.
electrical circuit, the question of whether the circuit is connected is not at all
obvious “visually.”
Our problem is to instruct a computer so that it writes different connected
components of a multigraph. The algorithm can be easily written using the
dfs. We use an integer variable “nc” to calculate the number of connected
components of G. Then, we add the following statement in the dfs algorithm.
After the assignment mark[i] := 1 at the beginning of the “procedure dfs,”
add the following print statement: write(’ ’,i); which means that as soon as we
“touch” or “visit” a vertex i, we immediately print the corresponding vertex.
Add the statement nc := 0; in the initialization part of the algorithm.
Finally we rewrite the “iteration part” of the dfs algorithm as follows:
for i := 1 to n do
if mark[i] = 0 then
begin
nc := nc + 1;(* update nc *)
dfs(i); (* call to dfs. i is a root of a sub-tree
in dfs spanning forest which is under
construction *)
writeln;(* print a new line character to separate
components*)
end;
writeln(nc);(* print the number of components *)
Justification of the algorithm: Each call dfs(i) in the “iteration part” corre-
sponds to the root i of a subtree in the dfs spanning forest. Once the call dfs(i)
is initiated, all vertices connected to the vertex i by a “downward” path in
dfs forest are marked with 1. Hence, the validity of the algorithm.
time in the course of the dfs search if we are at vertex i and df sn[i] ≥ df sn[j].
Of course, the same back edge ij = ji is traversed the second time when we
are again at vertex j. A loop is considered as a back edge. Let us now write
the algorithm. We shall use an array “dad” where dad[j] = i if the vertex i is
the father/parent of the vertex j in the spanning dfs forest.
In the following graph of Figure 1.26, the vertices 3, 4, and 6 are cut ver-
tices and the bridges are edges 47 and 36. Note that an end vertex of a bridge
is a cut vertex if its degree is strictly more than one.
3. 36, 37, 67
4. 59, 58, 89
The above two properties are in general true for any connected graph with
at least two vertices. Intuitively, this means that the different cut vertices of
a graph are the “frontiers” between the biconnected components.
We shall now describe the biconnected components algorithm.
Input: A connected multigraph G = (X, E) with X = {1, 2, . . . , n}.
Output: A list of edges of each biconnected component of the graph G.
Algorithm: We shall first illustrate the algorithm on the graph of Figure 1.27.
Step 1: The graph is represented by its adjacency lists. Let us assume that
the vertices are listed in the increasing order in each list L(i), that is, L(1) =
(2, 4, 9), L(2) = (1, 3, 4, 9), etc., till L(9) = (1, 2, 4, 5, 8). We choose a starting
vertex, say, the vertex 1.
Step 2: The graph is searched according to the algorithm dfs. This search
gives us a partition of the edge set into T and B, where T is the set of edges
of the spanning tree induced by the dfs and B is the set of back edges which
is equal to E \ T. During the search, each vertex is assigned a number called
the dfs number. Recall that dfsn[i] = j if and only if the vertex i is the jth
vertex “visited” in the course of the dfs (see the graph of Figure 1.28 below
obtained by dfs).
The following Table 1.9 gives us the dfsn of different vertices.
vertex  1 2 3 4 5 6 7 8 9
dfsn    1 2 3 6 8 4 5 9 7
LOW[2] = min(df sn[2], df sn[1], df sn[1], df sn[2]) = 1 because of the back edges
41, 91, 92.
Note that a vertex is considered as a descendant of itself. The following table
gives the LOW function of different vertices of our example graph (Table 1.10).
As we have already remarked, the different cut vertices of the graph G serve
as “frontiers” between biconnected components. So we are first interested in
characterizing the cut vertices of the graph G. Let us again refer to the graph
of Figure 1.28. The vertex 3 is a cut vertex, whereas the vertex 6 is not. Why
is vertex 6 not a cut vertex? This is because of the following fact: There is a
back edge, namely, 73, from the son of 6, that is, from a proper descendant
of 6, to a proper ancestor of 6. This back edge gives us a second path from
3 to 7 and the first path from 3 to 7 is obtained via the tree edges. On the
other hand, there is no back edge from a proper descendant of the vertex 3
to a proper ancestor of 3. Hence, there are no two vertex-disjoint paths from
a proper descendant of 3 to a proper ancestor of 3. This makes the vertex 3,
a cut vertex of the graph G. In a similar manner we can conclude that the
vertices 2 and 9 are cut vertices.
We now state the following theorem whose proof is intuitively clear from
the above discussion. For a formal proof, the reader can consult the book [3].
Theorem 1.3. Consider a connected multigraph G and consider a partition
of the edge set of the graph G into T , a set of tree edges and B, a set of back
edges induced by a search dfs. Then, a vertex i is a cut vertex of G if and only
if it satisfies exactly one of the following properties:
1. i is the root of the tree T and the root i has at least two sons in the
spanning dfs tree T.
2. i is not the root and i possesses at least one son s in the spanning dfs
tree with LOW[s] ≥ df sn[i].
vertex  1 2 3 4 5 6 7 8 9
LOW     1 1 3 1 7 3 3 7 1
In our example, the stack will look as below when we are at vertex 7 before
backtracking from 7 to the father of 7, which is 6 (the stack grows from left
to right).
S = (12, 23).
We are at vertex 2 and no new vertices are found from the vertex 2. Hence
we climb up the tree from 2 to its father 1. The test LOW[2] ≥ dfsn[1] turns
out to be true and we pop up the stack till the edge 23.
The second biconnected component found is
B2 = 23
(* initialization *)
for i := 1 to n do
mark[i] := 0;
counter := 1;
S := empty;(* empty stack *)
dfs_biconnected(1); (* procedure call *)
begin
write(t^.info,’ ’);
t := t^.next;
end;
writeln;
end;
end;(* output_graph *)
(* the arrays dfsn, LOW, dad are global, so that LOW[j] computed
by a recursive call remains visible to the caller *)
var dfsn : array[1..maxn] of integer;
LOW : array[1..maxn] of integer;
dad : array[1..maxn] of integer;
procedure dfs_biconnected( i : integer);
var t : pointer; j : integer;
begin
mark[i] := 1;
dfsn[i] := counter; counter := counter + 1;
(* initialization of LOW: part 1 *)
LOW[i] := dfsn[i];
(* scan L[i] *)
t := L[i];
while t <> nil do
begin
j := t^.info;
if mark[j] = 0
then
begin
dad[t^.info] := i;(* add (i,j) to dfs tree *)
(* push (i, j) into the stack *)
S[top+1] := i;
S[top+2] := j;
top := top +2; (* update top *)
dfs_biconnected(j);(* recursive call *)
(* at this point we have found LOW[j] *)
(* update LOW[i]: part 2 *)
if LOW[i] > LOW[j] then
LOW[i] := LOW[j];
if LOW[j] >= dfsn[i] then
begin
(* a biconnected component is found *)
(* pop S till the edge ij *)
repeat
write(’(’,S[top],’,’,S[top-1],’)’);
top := top - 2;
until S[top+1] = i;
writeln;
end;
end(* then*)
else
(* test if ij is a back edge *)
if (dad[i] <> j) and (dfsn[i] > dfsn[j])
then
begin
S[top + 1] := i;
S[top + 2] := j;
top := top + 2;
(* update LOW[i]: part 3 *)
if LOW[i] > dfsn[j] then
LOW[i] := dfsn[j];
end;
t := t^.next;
end;(* while*)
end;(* dfs_biconnected *)
vertex i will not be in the list succ[j]. In the case of an undirected graph, if
ij is an edge, then we include the vertex j in the list L[i] and also include the
vertex i in the list L[j].
The search dfs for a directed graph is illustrated by the following example.
Consider the graph of Figure 1.29.
As in the case of an undirected graph, the graph is represented by list of
successors of each vertex.
succ[1] = (2, 5, 6); succ[2] = (3, 4); succ[3] = (1); succ[4] = (3); succ[5] =
(6, 7); succ[6] = (); succ[7] = (6); succ[8] = (5, 9, 10); succ[9] = (); succ[10] =
(8, 9).
Let us now perform the dfs on the graph of Figure 1.29. The first call dfs(1)
“visits” the vertices 1, 2, . . . , 7 and gives us a tree with the root 1. Then, the
second call dfs(8) “visits” the remaining vertices 8, 9, 10 and gives us a second
tree of the dfs spanning forest. The original graph is restructured as in the
following graph Figure 1.30. In this figure, the solid arcs define a spanning
arborescence.
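The dfs procedure for digraphs is the same as for undirected graphs, except
that only the lists succ are scanned; a minimal C sketch of ours, assuming
the list representation of Section 1.1 (the arrays mark and dfsn and the
counter follow the text):

#include <stdio.h> /* for NULL */

struct node { int v; struct node *next; };
struct node *succ[20]; /* lists of successors, as in Section 1.1 */
int mark[20], dfsn[20], counter = 1;

void dfs(int i) /* depth-first search from the vertex i in a digraph */
{
    struct node *t;
    mark[i] = 1;
    dfsn[i] = counter; counter = counter + 1;
    for (t = succ[i]; t != NULL; t = t->next)
        if (mark[t->v] == 0)
            dfs(t->v); /* (i, t->v) becomes a tree arc of the dfs forest */
}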
We observe the following properties:
1. The tree arcs, which are drawn as continuous arcs: if (i, j) is an arc
in the dfs spanning forest induced by the algorithm, then dfs(i) called
dfs(j) directly during the search (e.g., (2, 4)).
4. Dotted arcs like (4, 3) and (8, 5) which join two vertices that are not in
the relation ancestor-descendant or descendant-ancestor. Such arcs are
called dotted cross arcs.
We shall now observe and prove the following key property of a dotted
cross arc.
Property 1.4. A dotted cross arc always goes from the right to the left. (We
assume that the children of each vertex are drawn from left to right and the
different trees of the dfs spanning forest are also drawn in the left to right
fashion.) In other words, if (i, j) is a dotted cross arc, then df sn[i] > df sn[j].
Proof. Consider a dotted cross arc (i, j). This means that the vertex j is
“visited” before the vertex i is “visited” during the search. Otherwise, suppose
the vertex i is “visited” before the vertex j. That is, when we reach the vertex i
for the first time, the vertex j is still marked “unvisited.” Since (i, j) is an edge
of the graph, we have j in the list of succ[i]. But then the “visit” of i would
not be complete unless we touch the vertex j. Hence, j must be a descendant
of i in the dfs spanning forest. Since (i, j) is a dotted edge, we conclude that
(i, j) must be a descending dotted edge, a contradiction.
Hence j is “visited” before the vertex i. This means that when we reach i for
the first time, j is marked “visited.” Hence, df sn[i] > df sn[j]. (Geometrically,
the vertex i is to the right of the vertex j.)
(* initialization *)
for i := 1 to n do
mark[i] := 0;
counter := 1;
F := empty; (* empty forest *)
(* iteration *)
for i := 1 to n do
begin (* test if i is a root *)
dfs(i); (* call to the procedure dfs *)
if counter = n+1
then write(i,' is a root');
(* reinitialize: mark all vertices ``unvisited'' *)
for j := 1 to n do
mark[j] := 0;
counter := 1;
end;
Example 1.10
Consider the following graph of Figure 1.32 without circuits.
Let us assign the integer 1 to the vertex e, 2 to a, 3 to d, 4 to b and
finally 5 to the vertex c. This assignment defines a topological sorting
of the vertices of the graph. The same graph is redrawn below in Figure
1.33 to reflect the property of “topological sorting.”
Theorem 1.4. A directed graph G admits a topological sort if and only if the
graph is without circuits.
Proof. One part of the theorem is obvious. If the graph admits a topological
sort, then by definition we can assign the integers 1, 2, . . . , n as new labels to
its vertices in such a way that if (i, j) is an arc of the graph, then i < j. If
the graph possesses a circuit (v1 , v2 , . . . , vs , v1 ), then because of the property
of the new labeling of the vertices, we have v1 < v2 < · · · < vs . But then,
(vs , v1 ) is an arc of the circuit with v1 < vs , a contradiction.
Now for the second part, consider a longest directed path (x0 , x1 , . . . , xd )
in the given graph. Its length is d. We shall first prove that the initial vertex
x0 of our longest path satisfies d− (x0 ) = 0. If not, there is an arc (x, x0 ) in
the graph. We shall distinguish two cases.
Case 1. x ≠ xi for every i, 1 ≤ i ≤ d.
We then have a directed path (x, x0 , x1 , . . . , xd ) of length d + 1, a contradiction.
Case 2. x = xi for some i, 1 ≤ i ≤ d.
Then, the arc (x, x0 ) = (xi , x0 ) together with the directed path (x0 , x1 , . . . , xi )
forms a circuit in G, again a contradiction. Hence d− (x0 ) = 0.
Topological sort 1:
(* initialization of counter *)
counter := 0;
while counter < n do
begin
(* scan the array id to find vertices of in-degree zero *)
for i := 1 to n do
if id[i] = 0 then
begin
write(i,' ');
counter := counter + 1;
id[i] := -1; (* vertex i is processed *)
(* scan succ[i] to reduce the in-degree by one *)
for each j in succ[i] do
id[j] := id[j] - 1;
end;
end;
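A direct C transcription of this method is sketched below. It is not taken from the text: it assumes, for illustration, an array-based representation of the successor lists (adj[i][0..deg[i]-1] with deg[i] successors) instead of the linked lists used elsewhere in this chapter.

#include <stdio.h>
#define MAXN 100

int n;                        /* number of vertices */
int id[MAXN+1];               /* id[i] = current in-degree of vertex i */
int deg[MAXN+1];              /* deg[i] = number of successors of i */
int adj[MAXN+1][MAXN+1];      /* adj[i][0..deg[i]-1] = successors of i */

/* Prints the vertices in topological order.  The graph is assumed
   to be without circuits, so every sweep finds at least one vertex
   of in-degree zero and the while loop terminates. */
void toposort1(void)
{
    int counter = 0, i, k;
    while (counter < n) {
        for (i = 1; i <= n; i++)
            if (id[i] == 0) {
                printf("%d ", i);
                counter = counter + 1;
                id[i] = -1;                 /* vertex i is processed */
                for (k = 0; k < deg[i]; k++)
                    id[adj[i][k]] = id[adj[i][k]] - 1;
            }
    }
    printf("\n");
}

int main(void)
{   /* a small hypothetical example: arcs 1->2, 1->3, 2->4, 3->4 */
    n = 4;
    deg[1] = 2; adj[1][0] = 2; adj[1][1] = 3;
    deg[2] = 1; adj[2][0] = 4;
    deg[3] = 1; adj[3][0] = 4;
    id[2] = 1; id[3] = 1; id[4] = 2;      /* in-degrees */
    toposort1();                          /* prints 1 2 3 4 */
    return 0;
}

A queue of the vertices of in-degree zero would avoid the repeated scans of the array id; the plain double scan above simply mirrors the pseudo-code.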
Algorithm 1.5. This algorithm uses dfs. It is based on the following property:
A directed graph is without circuits if and only if there is no dotted mounting
arc during the dfs. A simple write statement at the end of the dfs procedure
gives us a reverse topological sort. We use a stack S of vertices. We will give
the procedure below:
(* iteration *)
for i := 1 to n do
if mark[i] = 0 then
dfs(i); (* invoking toposort *)
(* print the vertices in topological sort *)
for i := n downto 1 do
write(S[i],' ');
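The procedure dfs invoked above pushes a vertex onto the stack S only when its call terminates. A minimal C sketch of this idea, under the same array-based graph representation as the previous example (adj, deg, n) and with circuit detection omitted, could be:

int mark[MAXN+1];
int S[MAXN+1];
int top = 0;

/* Push vertex i onto S after all its successors have been pushed;
   S[n], S[n-1], ..., S[1] is then a topological sort. */
void dfs(int i)
{
    int k;
    mark[i] = 1;
    for (k = 0; k < deg[i]; k++)
        if (mark[adj[i][k]] == 0)
            dfs(adj[i][k]);
    top = top + 1;
    S[top] = i;            /* the call dfs(i) terminates here */
}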
completed first before the foundation work starts, and plumbing and electricity
may be taken up later, once the walls and ceilings have been completed,
etc. These subprojects are called the activities or tasks; that is, a project is
constituted of activities. These activities are heavily interrelated by constraints
like: a certain activity may not start until other activities have been
completed.
Let us illustrate this by an example.
Example 1.11
The vertices of the graph (see Figure 1.34) without circuits repre-
sent different activities and the graph itself represents a project. An arc
from the vertex i to the vertex j means the activity i must have been
completed before initiating the activity j. The weight w(i, j) associated
with an arc (i, j) represents the time needed (e.g., measured in weeks) to
finish the task i. In our example, the task 1 has no predecessor and the
task 9 has no successor. Such vertices always exist in a directed graph
without circuits. For example, the first vertex in a topological sort has
no predecessor, and the last vertex of the sort has no successor. We
introduce two new vertices s and e to the activity digraph. The vertex
s corresponds to the starting of the project and the vertex e to the end
of the entire project. The vertex s is joined to all vertices of the graph
for which the in-degree is 0 and all vertices of the graph for which the
out-degree is 0 are all joined to the vertex e. Finally, we associate the
weight to all arcs of the form (s, i) and all arcs of the form (j, e). In our
example, there is only one vertex of in-degree 0, namely the vertex 1
and the only vertex of out-degree 0 is 9.
We are interested in finding the minimum number of weeks needed to
finish the entire project. This problem is equivalent to finding a directed
path of maximum weight in a graph without circuits. In our example, a
directed path of maximum weight is (1, 2, 3, 4, 7, 9) and its weight is 38.
Note that task 2 can start only after completing task 1, and 3 can be
initiated only after finishing 2, etc. This means that to complete the entire
project we need a minimum of 38 weeks. Why? The time required to
complete the entire project cannot be less than 38 weeks, since otherwise
not all of the tasks of the graph could be completed.
To see this, let us write the vertices of the activity digraph in topolog-
ical order. The desired order is (1, 2, 3, 4, 5, 6, 7, 8, 9). We first start and
finish the task 1; then we start and finish the task 2 and then 3. Now we
can start simultaneously the tasks 4, 5 and 6 since these three activities
are not related by the constraints of precedence (technically these tasks
4, 5, and 6 form a stable set, that is, there are no arcs joining these three
vertices in the graph). After finishing 4, we can initiate 7. Note that after
finishing 5, we cannot immediately start the task 7, since we have to wait
for the completion of the task 4 which takes longer than the task 5. (The
task 7 has two predecessors: 4 and 5.) After finishing 6, we start 8. The
task 7 starts before the task 8 but is completed only after the task 8,
because the weight of a longest path to 7 is 28, whereas the one to 8 is 31.
That is, 7 starts three weeks earlier than 8 but is completed only later.
A directed path of maximum weight is called a critical path. The
different tasks on a longest weighted path are called critical tasks. In
our example, the critical tasks are 1, 2, 3, 4, 7, 9. If the completion of one
of the tasks on a critical path is delayed, say, by one week, then the entire
project will be delayed by one week. Techniques referred to as CPM and
PERT use weighted directed graphs without circuits as models.
for i := 1 to n do
begin
t[i] := 0;
for each vertex j in the list pred(i) do
if t[j] + w(j,i) > t[i] then
t[i] := t[j] + w(j,i);
end;
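In C, and under the assumption that the vertices are already numbered 1, 2, . . . , n in topological order (so that every predecessor j of i satisfies j < i), this computation might be sketched as follows; pred, npred, w and t are illustrative array names, not taken from the text.

#define MAXN 100

int n;
int npred[MAXN+1];            /* number of predecessors of i */
int pred[MAXN+1][MAXN+1];     /* pred[i][0..npred[i]-1] = predecessors */
int w[MAXN+1][MAXN+1];        /* w[j][i] = weight of the arc (j,i) */
int t[MAXN+1];                /* t[i] = weight of a heaviest path ending at i */

void critical_times(void)
{
    int i, k, j;
    for (i = 1; i <= n; i++) {
        t[i] = 0;
        for (k = 0; k < npred[i]; k++) {
            j = pred[i][k];
            if (t[j] + w[j][i] > t[i])
                t[i] = t[j] + w[j][i];   /* a heavier path ending at i */
        }
    }
}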
Example 1.13
Consider the graph of Figure 1.35.
The graph of Figure 1.35 is not strongly connected because there is
no directed path from the vertex 1 to the vertex 6 even though there is a
directed path from the vertex 6 to the vertex 1. The graph possesses three
strongly connected components. These strongly connected components
are induced by the vertex sets {1, 2, 3, 4, 5} and {6, 7, 8} and {9}.
Note that every vertex of the graph G lies in exactly one strongly
connected component, but there are edges like (6, 5) and (7, 9) which are
not in any strongly connected components. In other words, the different
strongly connected components of a graph define a partition of the vertex
set of X of G into a union of mutually disjoint subsets of X.
a function LOWLINK using dfs spanning forest, dfsn, dotted mounting arcs
and dotted cross arcs.
In the biconnected components algorithm, we have pushed and popped the
edges of the undirected connected graph, because the different biconnected
components form a partition of the edge set. In the strongly connected com-
ponents algorithm, we will be pushing and popping the vertices of the graph
because the different strong components of a graph form a partition of the
vertex set.
In short, the strong components algorithm and the biconnected compo-
nents are “dual” algorithms.
vertex i : 1 2 3 4 5 6 7 8 9 10
dfsn[i]  : 1 2 3 4 5 6 7 8 9 10
LOWLINK[i] is the smallest dfsn ≤ dfsn[i] that one can reach, either from the
vertex i or from one of the descendants of i in the dfs spanning forest, using
only one dotted mounting or cross arc. In the case of utilization of a dotted
cross arc in the calculation of LOWLINK, this dotted cross arc must lie in
some circuit of the graph. If such dotted arcs lead only to vertices whose
depth-first search number is > dfsn[i], or if no such dotted arcs exist from i or
from one of its descendants, then we define LOWLINK[i] = dfsn[i].
Mathematically, we write LOWLINK[i] = min({dfsn[i]} ∪ B), where
B = { dfsn[j] | ∃ an arc ∈ M ∪ C lying in a circuit from a descendant of i to j },
where M is the set of mounting arcs and C is the set of cross arcs. Let
us calculate the function LOWLINK in our example. In the algorithm, the
different LOWLINKS are computed according to the postorder traversal of
the vertices of the spanning dfs forest. The postorder of our dfs forest is
(3, 4, 2, 1, 7, 8, 6, 10, 9, 5). Hence the first vertex for which LOWLINK is com-
puted is the vertex 3.
LOWLINK[3] is min(dfsn[3], dfsn[2]) = min(3, 2) = 2, because of
the mounting dotted edge (3, 2) from the vertex 3. LOWLINK[4] is
min(dfsn[4], dfsn[3]) = min(4, 3) = 3 because of the dotted cross edge (4, 3)
from the vertex 4, and this arc lies in a circuit (4, 3, 2, 4). LOWLINK[2] =
min(dfsn[2], dfsn[2]) = 2 because of the dotted mounting arc from the son 3
of the vertex 2. LOWLINK[1] = 1. LOWLINK[7] is min(dfsn[7], dfsn[5]) =
min(7, 5) = 5. Note that the dotted cross arc (7, 4) from the vertex 7 is not
taken into account for the calculation of LOWLINK[7] because this cross arc
does not lie in any circuit of the given graph. Similarly, while computing
LOWLINK[5], the dotted cross arc (5, 1) will not be considered because of
the absence of a circuit passing through this arc. On the other hand, while
computing LOWLINK[8], we have to take into consideration the dotted cross
arc (8, 7) because this arc lies in a circuit (8, 7, 5, 6, 8) of the given graph.
vertex i   : 1 2 3 4 5 6 7 8 9 10
LOWLINK[i] : 1 2 2 3 5 5 5 7 9 9
S = (1, 2, 3).
Now we test if the vertex 3 is a root of a strong component. The answer is
“no.” So no pop-up occurs. From the vertex 2, we find a new vertex 4 and
hence we push into the stack the vertex 4. No new vertices are found from the
vertex 4 and we are ready to backtrack from 4 to its father 2. At this point
the stack will look as
S = (1, 2, 3, 4).
We perform the test: Is 4 a root of a strong component? Since this is not the
case, no pop-up occurs. We climb up to the vertex 2. From 2, no new vertices
are found and we perform the test if 2 is a root of a strong component. Since
the answer to the test is “yes,” we pop up the stack S till the vertex 2. This
is the vertex set of the first strong component found by the algorithm.
The first strong component is induced by the set {4, 3, 2}.
We go back to the vertex 1. Since no new vertices are found from 1, we
perform the test if 1 is a root of a strong component. Since 1 is a root, we pop
the stack till the vertex 1.
The second strong component found is induced by the set {1}.
At this point the stack is again empty. The dfs takes a new vertex 5 as a
root of a tree in the spanning forest under construction and initiates the dfs
search from this vertex 5. From 5 we find 6 and from 6 we find the vertex 7. No
new vertices are found from 7. So we backtrack. At this point, the stack will be
S = (5, 6, 7).
Test fails for 7 and we move to 6. From 6, we find 8. No new vertices are
“visited” from 8. We prepare to climb up the tree from 8 to the father of 8
which is 6. At this point, the stack will be S = (5, 6, 7, 8). 8 is not the root.
From 6 no new vertices are “visited.” Since 6 is not a root, we reach the
vertex 5. From 5, we find the new vertex 9, from 9 we find the vertex 10. At
this point, the stack is S = (5, 6, 7, 8, 9, 10). 10 is not a root and we reach back
to 9. No new vertices are found from 9. 9 is a root and we pop up the stack
till the vertex 9.
Hence the third strong component is induced by the set {10, 9}.
The stack is now S = (5, 6, 7, 8). Finally, we climb up to the vertex 5. No
new vertices are found from 5, and the test shows that 5 is a root. Hence we
pop up the stack till the vertex 5.
Hence the fourth and final strong component is induced by the set
{8, 7, 6, 5}.
Note that the order in which the strong components are found is the same
as that of the termination of their dfs calls. In our example, this order is
2, 1, 9, 5. The call dfs(2) ends before the call dfs(1) ends. The call dfs(1) ends
before the call dfs(9) ends. Finally, the call dfs(9) ends before the call dfs(5)
ends.
It remains to recognize the different vertices which are the roots of strong
components. From the example, we observe the following property of the roots
of strong components.
where
LOWLINK[i] = min({dfsn[i]} ∪ {LOWLINK[s] | s is a son of i in the dfs
spanning forest} ∪ B) (parts 1, 2 and 3 of the definition, respectively), with
B = { dfsn[j] | ∃ an arc ∈ M ∪ C lying in a circuit from a descendant of i to j },
where M is the set of mounting arcs and C is the set of cross arcs. We are
now ready to describe the algorithm in pseudo-code.
We use the following variables: The array “mark” (mark[i] = 0 if the ver-
tex i is “unvisited” and mark[i] = 1 if the vertex i has been “visited during
the dfs). The array “dfsn” where dfsn[i] is the depth-first search number of
the vertex i and the array “LOWLINK” to compute the LOWLINK of each
vertex. The variable “counter” is used to assign the dfsn. The stack S elements
are the vertices of the graph.
else(* j is "visited" *)
(* test if arc (i,j) is a mounting or cross arc *)
if (dfsn[i] > dfsn[j]) and ((i,j) lies in a circuit)
then
(* update LOWLINK: part 3 of LOWLINK definition *)
LOWLINK[i] := min(LOWLINK[i],dfsn[j]);
(* test if the vertex i is a root of a strong component *)
(* if so pop up the stack till the vertex i *)
if LOWLINK[i] = dfsn[i]
then
repeat
write(S[top],' ');
top := top - 1;
until S[top+1] = i;
writeln;(*new line to separate strong components *)
end;
(* initialization *)
for i := 1 to n do
mark[i] := 0;
top := 0; (* empty stack *)
counter := 1; (* to assign dfs number *)
for i := 1 to n do
if mark[i] = 0
then dfs_strong_comp(i); (* call to procedure dfs_strong_comp *)
On the other hand, the cross arc (5, 1) is not taken into consideration in
the computation of LOWLINK[5], since the end vertex 1 of this cross arc does
not belong to the stack S. Note that the strong component containing the
vertex 1 has already been emitted by the algorithm when we are at vertex 5
examining the cross arc (5, 1). For a similar reason, we discard the cross arc
(7, 4) in the computation of LOWLINK[7].
var t : pointer;
i : integer;
begin
for i := 1 to n do
begin
t := succ[i];
if t = nil
then
write('no successor to ', i)
else
begin
write('the successors of ',i,' are: ');
(* scan the list succ[i] *)
while t <> nil do
begin
write(t^.info,' ');
t := t^.next;
end;
writeln;
end;
end;(* for *)
end;(* output_graph *)
(* the arrays below must be declared globally, since dfs_strong_comp
   is recursive and their values are shared across calls:
   dfsn : array[1..maxn] of integer;
   LOWLINK : array[1..maxn] of integer;
   instack : array[1..maxn] of boolean;
   instack[i] is true if i is in stack, false otherwise *)
procedure dfs_strong_comp( i : integer);
var t : pointer; j : integer;
begin
mark[i] := 1;
dfsn[i] := counter; counter := counter + 1;
(* push i into stack *)
top := top + 1; S[top] := i;
instack[i] := true;
(* initialization of LOWLINK[i] using part 1 *)
LOWLINK[i] := dfsn[i];
(* scan the list succ[i] *)
t := succ[i];
while t <> nil do
begin
j := t^.info;
if mark[j] = 0
then (* j is a new vertex *)
begin
dfs_strong_comp(j); (* recursive call *)
(* at this point we have computed LOWLINK[j]*)
(* update LOWLINK[i] using part 2*)
if LOWLINK[i] > LOWLINK[j]
then LOWLINK[i] := LOWLINK[j];
end
else
if (instack[j] = true) and (dfsn[i] > dfsn[j])
then (* (i,j) is a mounting or cross arc usable for LOWLINK *)
(* update LOWLINK[i]: part 3 *)
if LOWLINK[i] > dfsn[j]
then LOWLINK[i] := dfsn[j];
t := t^.next;
end;
(* test if i is a root of a strong component *)
if LOWLINK[i] = dfsn[i]
then(* i is a root of a strong component *)
begin
repeat
write(S[top],' ');
instack[S[top]] := false;
top := top - 1;
until S[top+1] = i;
writeln;(* new line *)
end;
end;
begin (* strongcomp *)
(* initialization *)
for i := 1 to n do
begin
mark[i] := 0;
instack[i] := false;
end;
top := 0; (* empty stack *)
counter := 1;
(* end of initialization *)
input_graph;
output_graph;
for i := 1 to n do
if mark[i] = 0
then dfs_strong_comp(i); (* call *)
end.
Remark 1.6. We have already said that the biconnected components algo-
rithm and the strongly connected components algorithm are dual algorithms.
This duality actually comes from the following observation:
Output: Find a Hamiltonian cycle of the complete graph Kn with the sum of
the weights of the edges of the cycle as a minimum, that is, find an elementary
cycle passing through each vertex exactly once with the sum of the weights of
the edges as a minimum.
Let us illustrate this problem with an example (see graph of Figure 1.38).
In the graph of Figure 1.38, the different tours and their corresponding costs
are given in Table 1.13. In Table 1.13, there are three tours and the minimum
tour is the tour number 2 whose cost is 19. Note that the tours (1, 2, 3, 4, 1)
and the reverse tour (1, 4, 3, 2, 1) give the same cost. Hence such duplicated
tours are not listed in the table. The following lemma gives the number of
Hamiltonian tours in a complete graph Kn .
Lemma 1.2. The number of Hamiltonian tours in Kn is (n − 1)!/2 (by iden-
tifying the reverse tours, that is, for n = 3, (1, 2, 3, 1) = (1, 3, 2, 1), because
they have the same cost).
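For instance, for the complete graph of Figure 1.38 on n = 4 vertices, the lemma gives (4 − 1)!/2 = 3 distinct tours, which is exactly the number of tours listed in Table 1.13.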
For a proof, see [6]. The following program implements the
brute-force algorithm. This program does not check for equality between a
tour and its reverse tour. There are four cities 1, 2, 3 and 4. 0 is not a city.
The reader is requested to see the recursive permutation generation program
in [6] and appreciate the similarity with TSP brute-force program.
void tsp(int k)
{
 int i, temp, s;  /* local, so recursive calls do not clobber them */
 if (k == 1)      /* a complete permutation v[1..N] is ready */
 {
  s = 0;
  for (i = 1; i < N; i++)
   s = s + c[v[i]][v[i+1]];
  s = s + c[v[N]][v[1]];      /* close the tour */
  if (s < mc)
  {
   mc = s;
   //save the Hamiltonian cycle
   for (i = 1; i <= N; i++)
    t[i] = v[i];
  }
 }
 else // recurrence
 {
  tsp(k-1); //recursive call
  for (i = 1; i < k; i++)
  {
   //swap v[i] and v[k]
   temp = v[i]; v[i] = v[k]; v[k] = temp;
   tsp(k-1); //recursive call with a new element in position k
   //swap back v[i] and v[k]
   temp = v[i]; v[i] = v[k]; v[k] = temp;
  }
 }
}
int main()
{
 int i;
 tsp(N); //call with k = N
 //print the result
 printf("the minimum cost tour is %d \n", mc);
 //print the tour
 for (i = 1; i <= N; i++)
  printf("%3d", t[i]);
 printf("%3d", t[1]);
 system("pause");
 return 0;
}
The complete tour obtained is (1, 3, 2, 4, 1), whose cost is 19, which is exact
in this case. We shall now present a C program to implement the NN algorithm
for TSP.
void tsp_nn()
{
 //start the tour at city 1; i is the current city
 for (j = 1; j <= N; j++)
  mark[j] = 0;
 i = 1; mark[1] = 1; t[1] = 1; counter = 1;
while (counter<N)
{
min=INT_MAX; //initialisation
for ( j=1;j<=N;j++)
if ((mark[j]==0)&&(i!=j)&&(c[i][j]<min))
{
min=c[i][j];
k=j;
}
mark[k]=1; //visit city k
counter=counter+1;
i=k;
t[counter]=k;
}//end while
} //end tsp_nn
int main()
{
 int i, mc = 0; //mc, cost of the tour found by the heuristic
 tsp_nn(); //function call
 for (i = 1; i < N; i++)
  mc = mc + c[t[i]][t[i+1]];
 mc = mc + c[t[N]][t[1]];
Step 2. Doubling the edges of the graph of the spanning tree Figure 1.40 gives
us the graph of Figure 1.41.
MSTEULER(I)/E(I) < 2
Here, C denotes an optimal tour and e an edge of C; w(T ) is the sum of
the weights of the edges of T , and w(C \ e) is the sum of the weights of the
edges of C \ e. By doubling each edge of T , we obtain the graph ET , with
w(ET ) = 2w(T ). Since an Eulerian cycle traverses each edge of ET exactly
once, its total weight is also w(ET ). Taking shortcuts will not increase the
cost w(ET ), by the triangle inequality (c being the cost function associated
with each edge).
Hence the theorem.
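In the notation of the proof (T a minimum spanning tree, ET the graph obtained by doubling its edges, C an optimal tour and e any edge of C), the whole argument can be summarized by the chain

$$w(\text{tour obtained by shortcuts}) \;\le\; w(E_T) \;=\; 2\,w(T) \;\le\; 2\,w(C \setminus e) \;\le\; 2\,w(C),$$

where the first inequality is the shortcutting step (triangle inequality), and the second holds because C \ e is a spanning tree of the graph while T is a spanning tree of minimum weight.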
1.13 Exercises
1. Consider a directed graph G represented by its adjacency matrix. Write
a program in C to find the vertex-deleted subgraph G − k where k is
any vertex of the graph G. Find the complexity of your program.
10. Apply Dijkstra’s algorithm on the graph of Figure 1.43 by taking the
vertex 1 as the source vertex. The weights of arcs are indicated on each
directed edge. Give the array “dad.” Draw the spanning arborescence
obtained by the algorithm.
12. Execute Floyd's algorithm on the graph of Figure 1.43. Give the matrix
INTER5×5 obtained at the end of the execution. Execute the call
interpath(1,4) and write the vertices printed by the call.
13. Prove that in a directed graph, the number of directed walks of length k
from a vertex i to a vertex j is the (i, j) entry of the matrix M k where M
is the adjacency matrix of the graph. Using this result, design an O(n4 )
algorithm for transitive closure of a graph.
17. Execute Dijkstra’s negative cost algorithm on the following graph of five
vertices. The graph is represented by its 5 × 5 costs matrix W.
$$W = \begin{pmatrix}
0 & 7 & 8 & 9 & 10 \\
0 & 0 & 8 & 9 & 10 \\
0 & -2 & 0 & 9 & 10 \\
0 & -4 & -3 & 0 & 10 \\
0 & -7 & -6 & -5 & 0
\end{pmatrix}$$
Can you guess the complexity of Dijkstra’s negative cost algorithm from
this example?
18. Execute the topological sort algorithm on the graph of Figure 1.44 after
assigning the following orientations to the edges of the graph:
Orient the edges as follows: (1, 2), (1, 3), (1, 5), (2, 3), (2, 5), (3, 5),(4, 3),
(5, 6), (6, 10), (10, 9), (5, 9), (3, 7), (7, 8), (3, 8).
Observation 2.1. A dotted edge, which is not a loop, is always a cross edge,
that is, a dotted edge joins two vertices x, y where neither is an ancestor or
descendant of the other. This property is just the opposite of the property of
the dotted edges of the dfs of an undirected graph.
Every cross edge defines an elementary cycle as in the dfs. More precisely,
consider a cross edge xy, and let the vertex z be the closest common ancestor
of the vertices x and y in the bfs tree. Then, the unique path from z to x in
the tree and the unique path from z to y in the tree together with the cross
edge gives an elementary cycle of the graph.
For example, in Figure 2.1, the cross edge hi defines the cycle
(a, c, h, i, d, a), the common ancestor being the root vertex a.
Having seen an example, let us now describe the bfs algorithm in
pseudo-code.
Breadth-first or level-order search algorithm:
Input: An undirected graph G = (X, E) with X = { 1, 2, . . . , n }. The graph
is represented by the n lists L(i), i = 1, 2, . . . , n.
Output: A partition of the edge set E into the set of tree edges of a spanning
forest F of G, with F being a vertex-disjoint union of trees, and the set D of
dotted edges.
Algorithm: An array mark[1..n] is used where mark[i] = 0 if and only if the
vertex i is not yet “visited” during the search. Initially, mark[i] := 0 for each
vertex i. We use a “queue of vertices” Q. Initially, the queue is empty. The bfs
tree is represented by an array of “dad,” that is, dad[j] = i if the vertex i is
the father of the vertex j in the tree, that is, ij is an edge of the bfs tree under
construction. Initially, the set of edges of the bfs tree T is empty. We use an
array bf sn[1..n] where bfsn stands for the breadth-first search number. More
precisely, bf sn[i] = k if i is the kth vertex “visited” during the bfs. Thus, the
bfsn of the root vertex of the first tree in F is 1. The integer variable counter
is initialized to 1. The algorithm is given below in pseudo-code.
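A compact C sketch of the procedure just described (using mark, the queue Q, dad, bfsn and counter) could read as follows; it is not the book's pseudo-code itself, and the lists L(i) are assumed here to be stored in arrays adj and deg.

#define MAXN 100

int n;
int adj[MAXN+1][MAXN+1], deg[MAXN+1];
int mark[MAXN+1], dad[MAXN+1], bfsn[MAXN+1];
int counter = 1;

/* Breadth-first search from the root vertex a: vertices are
   dequeued in the order in which they were discovered, so the
   graph is explored level by level. */
void bfs(int a)
{
    int Q[MAXN+1], head = 0, tail = 0;
    int y, k, z;
    mark[a] = 1;
    bfsn[a] = counter; counter = counter + 1;
    Q[tail++] = a;
    while (head < tail) {
        y = Q[head++];                /* dequeue y */
        for (k = 0; k < deg[y]; k++) {
            z = adj[y][k];
            if (mark[z] == 0) {       /* yz becomes a tree edge */
                mark[z] = 1;
                bfsn[z] = counter; counter = counter + 1;
                dad[z] = y;           /* add the edge yz to T */
                Q[tail++] = z;
            }                         /* otherwise yz is a dotted edge */
        }
    }
}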
Note that if the graph is not connected, bfs will be called from the main
program on a vertex of each connected component, that is, there will be as
many calls to bfs as the number of connected components of the graph. Let
us now discuss the complexity of bfs.
Property 2.1 (Level sets: Partition of the vertex set induced by bfs).
Bfs induces a partition of the vertex set of a connected graph with the following
properties:
The vertex set
X = L0 (a) ∪ L1 (a) ∪ · · · ∪ Lk (a)
where a is the root vertex, k is the eccentricity of the vertex a and the set
Li (a) consists of all vertices at distance i from the vertex a. Symbolically,
Li (a) = { x | d(a, x) = i }.
The sets Li (a) are called level sets with respect to the vertex a. By the definition
of level sets, there cannot be an edge of the graph between vertices of the level
sets Li (a) and the level sets Lj (a) if |i − j| > 1.
This partition can be easily obtained from the bfs algorithm as follows:
An array variable L[1..n] of integer is declared to assign levels to vertices.
In the beginning of the bfs procedure, we add the instruction L[a] := 0, and
just after the instruction mark[z] := 1 we write the statement L[z] := L[y] + 1,
where y is the father of the newly found vertex z (this is precisely what is
done in the procedure “bfs geodetic” below).
Once the array L is available, to find the vertices at level i from the root
vertex a, we simply scan the L array to print the vertices x for which L[x] = i.
Geodetic graph:
An undirected graph is geodetic if between any two vertices there is a unique
elementary path of shortest length. Let us recall that the length of an elemen-
tary path is the number of edges in the path.
By the definition, a geodetic graph must be a simple connected graph. For
example, a tree is a geodetic graph. Other examples are the famous Petersen
graph of Figure 2.3 and any elementary cycle of odd length.
vertex x, a vertex y of the ith level set Li (x) is joined by an edge to exactly
one vertex z in the (i − 1)th level set Li−1 (x).
Proof. Let G be a geodetic graph. Then, each vertex y in Li (x) is adjacent to
exactly one vertex z of Li−1 (x). For, if there is a vertex y ∈ Li (x) adjacent to
two vertices z1 , z2 belonging to Li−1 , then by the definition of level sets, there
are two paths of shortest length i from the root vertex x to the vertex y, one
going through z1 and the other passing via z2 , a contradiction.
For the converse part, suppose that every vertex of the ith level set is joined
to exactly one vertex of the (i − 1)th level set, for all root vertex x ∈ X. If G is
not geodetic, then there are vertices x and y such that there are two distinct
shortest paths P1 and P2 from x to y. Walking from y back towards x, let z
be the first vertex at which the two paths arrive by different edges (possibly
z = y); clearly z ≠ x. Now, if the distance between x and z is i, then in the
bfs with x as the root vertex, the vertex z in the ith level set has two adjacent
vertices in the (i − 1)th level set, a contradiction.
We are now ready to describe the algorithm:
Input: A connected simple graph G.
Output: “1” if G is geodetic, “0” if not.
Algorithm: By the above characterization of geodetic graphs, G is geodetic if
and only if there is no dotted edge between two consecutive level sets, for any
root vertex x ∈ X.
The algorithm is given below in pseudo-code: We declare an integer vari-
able “geo” which is initialized to 1. If there is a dotted edge between two
consecutive level sets, then the variable “geo” is set to 0 and the procedure
“bfs” terminates, thanks to the statement “return” and the control falls back
to the statement which activated the procedure “bfs.” L[s] = L[f ] + 1 if and
only if the vertex f is the father of the vertex s in the bfs tree. Note that the
square brackets in L[x] denote the level number of the vertex x, whereas the
parenthesis in L(x) denotes the list of vertices adjacent to the vertex x.
then begin
mark[z] :=1;
enqueue z in the queue Q;
dad[z]:=y;(* add the edge yz to the T*)
L[z]:= L[y]+1;(*assign level number to z*)
end
else (*z already visited*)
if L[y]=L[z]+1
then
if dad[y] is not z
(* yz is a dotted edge joining consecutive level sets*)
then begin geo:=0; return; end;
end;(*while*)
end;(*bfs*)
Remark 2.1. During the bfs, each edge is traversed twice. A dotted edge
joining two vertices of the same level is first traversed from left to right and a
dotted edge xy joining two vertices of consecutive levels with x in level i and y
in level i + 1 is first traversed from x to y, that is, geometrically, assuming
that the sons of a vertex are drawn in the left to right manner, the vertex x
is to the right of the vertex y. Hence in the above algorithm, we can simply
replace the “else” part of the “for loop” in the procedure “bfsgeodetic” by the
following statement:
if L[y] = L[z] − 1 then geo:= 0;
Property 2.2. A graph is bipartite if and only if there are no dotted edges
joining two vertices of the same level sets.
vertices xp+1 and xp+2 are in the level set Lp (x1 ). But then, the vertices xp+1
and xp+2 are joined by a dotted edge (since they are consecutive vertices of
the cycle C).
Hence the property.
Having seen Properties 2.2, 2.3 and Example 2.2, we can now write the
algorithm to test if a given graph is bipartite.
We perform a level order search from an arbitrary vertex. We use a Boolean
variable “bip” which is initialized to 1 (means true). If we find a dotted
edge joining two vertices in the same level set, then the variable “bip” is set
to 0 (means false) and the procedure “bfs bipartite” terminates thanks to the
instruction “return.” If the graph is bipartite, the algorithm writes the bipar-
tition by scanning through the array L[1..n] printing the evenly subscripted
vertices in one line and oddly subscripted vertices in the next line (n is the
number of vertices of the graph).
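A C sketch of such a procedure, reusing the array-based declarations of the earlier bfs sketch (adj, deg, mark) and returning the value of “bip” for the component of the root, might be:

int L[MAXN+1];                 /* L[x] = level of x from the root */

/* Returns 1 if the component of the root a is bipartite, and 0
   if a dotted edge joins two vertices of the same level set
   (such an edge closes a cycle of odd length). */
int bfs_bipartite(int a)
{
    int Q[MAXN+1], head = 0, tail = 0;
    int y, k, z;
    mark[a] = 1; L[a] = 0;
    Q[tail++] = a;
    while (head < tail) {
        y = Q[head++];
        for (k = 0; k < deg[y]; k++) {
            z = adj[y][k];
            if (mark[z] == 0) {
                mark[z] = 1;
                L[z] = L[y] + 1;
                Q[tail++] = z;
            }
            else if (L[z] == L[y])   /* same level: not bipartite */
                return 0;            /* bip := 0 */
        }
    }
    return 1;   /* scan L to print the bipartition, level by parity */
}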
Complexity of the algorithm “bfs bipartite”:
Clearly the complexity is the same as that of the bfs which is O(max(m, n)).
Example 2.3
M = { f a, gb, hc, id, je } is a matching of graph G of Figure 2.6.
Another matching is M = { ab, cd, f j, gh }.
Example 2.4
We shall see below a few real-world situations in which matchings
arise naturally.
We would like to distribute 64 square chocolates among 32 children.
Let us imagine that the 64 pieces are arranged in the form of an 8 × 8
square, like a Chess board. The first child gets two square pieces which
are at the diagonally opposite corners of the 8 × 8 square.
Question: Is it possible to distribute the remaining 62 square pieces
(corresponding to the truncated Chess board) among the remaining 31
children in such a way that each child gets a 1 × 2 rectangular piece?
The problem is equivalent to finding the existence of a perfect match-
ing in the following graph: The vertices correspond to the 62 squares of
the truncated Chess board and two vertices are joined by an edge if the
squares corresponding to vertices share a common side, not merely a
single point. We can demonstrate that the graph thus defined does not
possess a perfect matching. We will use the following parity argument
(see Tables 2.1 and 2.2).
Color the squares of an 8 × 8 Chess board alternately black and
white, that is, if a square s is colored black, then a square sharing a
common side with the square s should be colored white and vice versa.
Note that the two diagonally opposite corner squares are colored with
the same color. Hence the truncated Chess board has either 32 black
Proof. By the definition of an augmenting path, the initial and final vertices
of P are distinct and are unsaturated by the matching. Hence the path starts
with an initial edge not in M and ends with an edge also not in M . Hence the
number of edges in the path P must be odd.
Let us recall the operation of symmetric difference of two sets: For any two
sets A and B, the symmetric difference of A and B denoted by AΔB is the
set (A ∪ B) \ (A ∩ B) which is also equal to (A \ B) ∪ (B \ A).
connected components of the graph H can only be one of the following three
types:
Type 1: An isolated vertex x, that is, the degree of the vertex x in the
graph H is zero.
Proof. Let us first observe that the degree of each vertex of the spanning sub-
graph (X, M1 ) is either 0 or 1, since M1 is a matching. Similar observation
holds for the spanning subgraph (X, M2 ). Hence the degree of each vertex x in
the spanning subgraph H = (X, M1 ΔM2 ) is at most 2, because x can be incident
with at most one edge of M1 and at most one edge of M2 . This means that
the only possible components of the spanning subgraph are isolated vertices,
an elementary path, and an elementary cycle.
(See Figure 2.10. The spanning subgraph induced by the symmetric dif-
ference of matchings M1 ΔM2 has five connected components of which three
are isolated vertices, one is an elementary path of length 2 and the other is
an elementary cycle of length 4.)
By the definition of M1 ΔM2 = (M1 ∪ M2 ) \ (M1 ∩ M2 ), the edges alternate
in M1 and M2 in each of the component containing at least one edge of H.
Hence the length of the cycle component must be even, otherwise either M1
or M2 will not be a matching.
Finally, for a path component, if the initial edge belongs to the matching
M1 , then the initial vertex of the path component cannot be saturated by
M2 , since otherwise either the degree of the initial vertex will be 2 in H or
the initial edge belongs to both M1 and M2 , a contradiction to the symmetric
difference M1 ΔM2 . A similar argument holds for the final vertex of any path
component of H.
Let P = (x1 , x2 , . . . , x2p+2 ) be an M -augmenting path, where the edges
xi xi+1 with i even, 2 ≤ i ≤ 2p, are in M , the other edges of the path are in
E \ M , and x1 and x2p+2 are unsaturated by M . Now define a new matching
M ′ which consists of the edges of the symmetric difference
(M ∪ P ) \ (M ∩ P ) = (M ∪ { x1 x2 , x3 x4 , . . . , x2p+1 x2p+2 }) \ { x2 x3 , x4 x5 , . . . , x2p x2p+1 }.
(Note that P represents the set of edges of the path P .) Then, M ′ is a matching
and |M ′ | = |M | + 1, since we have removed p alternate edges of P from M
and added p + 1 alternate edges of P to M to obtain the new matching M ′ .
Hence M is not a maximum matching.
For the converse part, consider a matching M which is not maximum. We
shall prove the existence of an M -augmenting path. Let M ′ be a maximum
matching in G. Then, |M ′ | > |M |.
Consider the spanning subgraph H = (X, M ΔM ′ ) induced by the edge
set (M \ M ′ ) ∪ (M ′ \ M ). By Lemma 2.1, the connected components of H are
isolated vertices or elementary cycles of even length with edges alternately in
M and M ′ or an elementary path with edges alternately in M and M ′ with
initial and final vertices saturated by exactly one of the matchings M and M ′ .
Step 1: Apply Floyd's algorithm to the matrix M = (mij )n×n (see Chapter 1).
The (i, j) entry of M is then the length of a shortest path between the
vertices i and j in G.
In Step 2, we find a peripheral vertex of G.
Step 2:
for i:=1 to n do
begin
set e[i] = max(mij |1 ≤ j ≤ n)
end
The ith entry of array e gives the eccentricity of vertex i
find a vertex p such that e[p] = max(e[i]|1 ≤ i ≤ n)
Step 3:
Construct the subgraph Gp = G \ p \ Γ(p)
find a perfect matching M of Gp
return G \ { p ∪ M }
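Step 2 admits a direct C transcription; in the sketch below, m is the distance matrix produced by Floyd's algorithm and the function returns a peripheral vertex p (the names m, e and peripheral_vertex are illustrative).

#define MAXN 100

int n;
int m[MAXN+1][MAXN+1];         /* m[i][j] = distance from i to j */
int e[MAXN+1];                 /* e[i] = eccentricity of vertex i */

int peripheral_vertex(void)
{
    int i, j, p = 1;
    for (i = 1; i <= n; i++) {
        e[i] = 0;
        for (j = 1; j <= n; j++)
            if (m[i][j] > e[i])
                e[i] = m[i][j];   /* e[i] = max distance from i */
        if (e[i] > e[p])
            p = i;                /* keep a vertex of max eccentricity */
    }
    return p;
}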
N (S), the neighbor set of S, denotes the set of vertices which are collectively
adjacent to the vertices of S.
Proof. We first prove the easy part of the theorem. Suppose there is a set
S of X1 satisfying the inequality |N (S)| < |S|. Then, we will prove that
there is no matching in G saturating each vertex of the set X1 . If M is a
matching saturating each vertex of X1 , then since S is a subset of X1 , there
is a subset M ′ of M such that M ′ saturates each vertex of S and |M ′ | = |S|.
M ′ is also a matching in G (since a subset of a matching is a matching) and
the vertices in S are matched under M ′ with distinct vertices of the set N (S).
This implies that |N (S)| ≥ |S|, a contradiction.
Now consider the converse part. Suppose the graph G possesses no matching
saturating all the vertices of X1 . We shall show that there is a subset S
of X1 satisfying the inequality |N (S)| < |S|. Let M be a maximum matching
of G. By our assumption, M does not saturate all the vertices of the set X1 .
Hence there must exist a vertex x1 ∈ X1
unsaturated by the maximum matching M . Let R be the set of all vertices of
the graph G reachable by an M -alternating path from the initial vertex x1 .
Note that the path of length zero (x1 ) is vacuously an M -alternating path
and hence the set R is non-empty. Since the matching M is maximum, by
Theorem 2.2, x1 is the only unsaturated vertex in the set R.
Let S be the set of vertices of R belonging to the set X1 and let T be
the set of vertices of R belonging to the set X2 , that is, S = R ∩ X1 and
T = R ∩ X2 . Then, we shall show that
N (S) = T.
The following example illustrates the fact that Theorem 2.3 is a good
characterization of a bipartite graph G = (X1 , Y1 ; E) possessing no matching
saturating all the vertices of the set X1 .
We shall now prove that the dancing problem (see Example 2.7) has always
an affirmative answer:
This implies that |S| ≤ |N (S)| for every subset S of X1 . By the König-Hall
Theorem 2.3, G contains a matching M saturating every vertex of X1 . Since
|X1 | = |X2 |, the matching M must be a perfect matching.
The following corollary asserts that the edge set of a regular bipartite
graph containing at least one edge can be partitioned into edge disjoint union
of perfect matchings.
Corollary 2.2. The edge set of a k-regular bipartite graph (k > 0) can be
partitioned into k edge-disjoint perfect matchings.
$$E = M_1 \cup M_2 \cup \cdots \cup M_k$$
with Mi ∩ Mj = ∅ for all i, j with 1 ≤ i < j ≤ k.
$$E = M_1 \cup M_2 \cup \cdots \cup M_k.$$
Example 2.11
The matrix corresponding to the bipartite 2-regular graph of Figure
2.16 is
$$\begin{array}{c|ccc} & y_1 & y_2 & y_3 \\ \hline x_1 & 0 & 1 & 1 \\ x_2 & 1 & 0 & 1 \\ x_3 & 1 & 1 & 0 \end{array}$$
The perfect matching M = { x1 y2 , x2 y3 , x3 y1 } of the graph of Figure
2.16 corresponds to the permutation matrix P where
$$P = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$$
$$E = M_1 \cup M_2 \cup \cdots \cup M_k.$$
$$\sum_{j=1}^{n}\sum_{i=1}^{m} m_{ij} \;=\; \sum_{j=1}^{n}\Bigl(\sum_{i=1}^{m} m_{ij}\Bigr) \;=\; \sum_{j=1}^{n} 1 \;=\; n.$$
The same sum is also equal to the sum of the m row sums of M (as each row
sum is 1). Symbolically,
$$\sum_{i=1}^{m}\sum_{j=1}^{n} m_{ij} \;=\; \sum_{i=1}^{m}\Bigl(\sum_{j=1}^{n} m_{ij}\Bigr) \;=\; \sum_{i=1}^{m} 1 \;=\; m.$$
Hence m = n.
M is bi-stochastic.
Determinants:
Recall that for an n × n square matrix M = (mij ), with mij reals, the
determinant of M , denoted by det M , is defined as a signed sum of products
of the entries of M , taking exactly one entry from each row and each column
of the matrix M . Symbolically,
$$\det M \;=\; \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, m_{1\sigma(1)} m_{2\sigma(2)} \cdots m_{n\sigma(n)},$$
where the sum is taken over the set Sn of all permutations σ of the set
{ 1, 2, . . . , n }, and sgn(σ) is +1 or −1 according as σ is even or odd.
Remark 2.2. Of course, this definition does not give an efficient way to cal-
culate the determinants because there are n! terms in the det M . Surprisingly,
determinants of order n can be calculated in polynomial time, in fact in O(n3 )
time by row reducing the matrix M to triangular form.
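As an illustration of the remark, here is a sketch in C of the O(n³) row-reduction method, with partial pivoting; the fixed dimension bound 10 is an arbitrary choice for the sketch, and the input matrix is overwritten.

#include <math.h>

/* Reduces the matrix to upper triangular form; the determinant is
   the product of the pivots, with a sign change for each row
   exchange. */
double det(double a[][10], int n)
{
    double d = 1.0, tmp;
    int i, j, k, piv;
    for (k = 0; k < n; k++) {
        piv = k;                          /* choose the largest pivot */
        for (i = k + 1; i < n; i++)
            if (fabs(a[i][k]) > fabs(a[piv][k]))
                piv = i;
        if (a[piv][k] == 0.0)
            return 0.0;                   /* singular matrix */
        if (piv != k) {
            for (j = 0; j < n; j++) {
                tmp = a[k][j]; a[k][j] = a[piv][j]; a[piv][j] = tmp;
            }
            d = -d;                       /* a row exchange flips the sign */
        }
        d = d * a[k][k];
        for (i = k + 1; i < n; i++) {     /* eliminate below the pivot */
            tmp = a[i][k] / a[k][k];
            for (j = k; j < n; j++)
                a[i][j] = a[i][j] - tmp * a[k][j];
        }
    }
    return d;
}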
Lemma 2.2. Consider an n × n square matrix M = (mij ). In the expansion
of the determinant of the matrix M , each term is zero if and only if there
is a rectangular submatrix (a matrix which can be obtained from M by removing
certain rows and columns of M ) of order k × (n − k + 1), (k ≤ n), whose
entries are all zeros.
Proof. We will apply the König-Hall Theorem 2.3 to prove the lemma.
Let us associate a bipartite graph G = (X1 , Y1 ; E) with X1 =
{ x1 , x2 , . . . , xn } and Y1 = { y1 , y2 , . . . , yn }, with the edge set E = { xi yj |
mij ≠ 0 }.
Each term of the determinant is of the form ±m1i1 m2i2 · · · mnin . But then
this term is non-zero if and only if mpip ≠ 0 for all p with 1 ≤ p ≤ n.
This is true if and only if the set of edges { x1 yi1 , x2 yi2 , . . . , xn yin } forms a
perfect matching of the corresponding bipartite graph G (see Example 2.11),
that is, if and only if G has a matching saturating each vertex of X1 .
Therefore, every term in the expansion of the det M is zero if and only if
the corresponding bipartite graph contains no matching saturating each vertex
of X1 . By the König-Hall theorem, this is so if and only if there is a subset S
of the set X1 satisfying the inequality |N (S)| ≤ |S| − 1. Set |S| = k. Then,
$$\sum_{i=k+1}^{n}\Bigl(\sum_{j} t_{ij}\Bigr) \;=\; (n-k) \;-\; (\text{sum of all the entries of } N) \;=\; (n-k)-(n-k+1) \;=\; -1,$$
a contradiction.
$$v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n,$$
where each $\alpha_i \ge 0$ and $\sum_{i=1}^{n} \alpha_i = 1$.
Before presenting the theorem, let us see an example.
$$P_1 = \begin{pmatrix} 0&1&0&0 \\ 1&0&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{pmatrix} \qquad P_2 = \begin{pmatrix} 0&0&0&1 \\ 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \end{pmatrix}$$
$$P_3 = \begin{pmatrix} 1&0&0&0 \\ 0&0&0&1 \\ 0&1&0&0 \\ 0&0&1&0 \end{pmatrix} \qquad P_4 = \begin{pmatrix} 0&0&1&0 \\ 0&1&0&0 \\ 0&0&0&1 \\ 1&0&0&0 \end{pmatrix}$$
R2 = R1 − α2 P2 = M − α1 P1 − α2 P2 .
As before, the matrix R2 has more zero entries than the matrix R1 . If every
term in the expansion of the det R2 is zero, then the theorem is proved, because
this implies that the matrix R2 is the zero matrix. Otherwise, we continue the
same argument.
Sooner or later we must have the equation
Rs = M − α1 P1 − α2 P2 − · · · − αs Ps (2.2)
with every term in the expansion of Rs equal to zero. Then, we claim that
the matrix Rs is the zero matrix. Otherwise, by Equation (2.2), the sum
M = α1 P1 + α2 P2 + · · · + αs Ps
FIGURE 2.19: Algorithm to find a matching that saturates all the vertices
of the set X1 or else a succinct certificate.
We are now ready to present the Kuhn-Munkres algorithm for the weighted
matching problem in a bipartite graph.
$$W \;=\; \begin{array}{c|cccc} & y_1 & y_2 & y_3 & y_4 \\ \hline x_1 & 4 & 3 & 2 & 2 \\ x_2 & 2 & 3 & 5 & 1 \\ x_3 & 3 & 0 & 2 & 0 \\ x_4 & 3 & 4 & 0 & 3 \end{array}$$
A good labeling function l is given as follows: l(x1 ) = 4, l(x2 ) = 5,
l(x3 ) = 3, l(x4 ) = 4, and l(y1 ) = l(y2 ) = l(y3 ) = l(y4 ) = 0. In fact,
l(xi ) is defined as the maximum value of the entry in the ith row and
l(yj ) is defined as zero for all j.
There are n2 edges in the complete bipartite graph Kn,n . The following
lemma on which the algorithm is based says “many” of the n2 edges can
be eliminated from the graph in order to find a maximum weighted perfect
matching.
In other words, the spanning subgraph Gl consists of all edges of G such that
the weight of the edge is equal to the sum of the labels of its end vertices. If
the graph Gl contains a perfect matching M, then M is a maximum weighted
perfect matching of the graph G.
Proof. Let M ⊂ El be a perfect matching in the graph Gl . Then, we have to
show that M is a maximum weighted perfect matching of the graph G. Since
El ⊂ E, M is also a perfect matching of the graph G. Let M ′ be a perfect
matching of G. We have to show that w(M ′ ) ≤ w(M ).
Now,
$$w(M) \;=\; \sum_{x_i y_j \in M \subset E_l} w_{ij} \;=\; \sum_{x_i y_j \in M} \bigl(l(x_i) + l(y_j)\bigr) \;=\; \sum_{i=1}^{n} l(x_i) + \sum_{j=1}^{n} l(y_j),$$
since the perfect matching M covers each vertex exactly once. But then,
$$w(M') \;=\; \sum_{x_i y_j \in M' \subset E} w_{ij} \;\le\; \sum_{x_i y_j \in M'} \bigl(l(x_i) + l(y_j)\bigr) \;=\; \sum_{i=1}^{n} l(x_i) + \sum_{j=1}^{n} l(y_j) \;=\; w(M).$$
Definition 2.4 (New labeling from the old one). First of all, define the
number dl as
$$d_l \;=\; \min_{x_i \in S_l,\; y_j \in Y_1 - T_l} \bigl( l(x_i) + l(y_j) - w(x_i y_j) \bigr). \tag{2.5}$$
Here, the set Tl denotes the neighbor set N (Sl ) in the graph Gl . Note that
by the definition, the number dl satisfies the inequality dl > 0. Now the new
labeling l1 is defined by l1 (x) = l(x) − dl for x ∈ Sl , l1 (y) = l(y) + dl for
y ∈ Tl , and l1 (v) = l(v) for all other vertices v.
Now we construct the new spanning subgraph Gl1 of the graph G. Because
of the definition of the labeling function l1 , in the new graph Gl1 all the edges
between the sets Sl and Tl in the old graph Gl will also belong to the new
graph Gl1 . In addition, at least one new edge (e.g., the edge x1 y2 in the graph
Gl1 of Example 2.19) between the set Sl and the set Y1 − Tl will appear in
the new graph Gl1 . This new edge enables us to grow the M -alternating tree
still further in the graph Gl1 . An edge of the old graph Gl having one end in
X1 − Sl and the other end in the set Tl may disappear in the new graph Gl1 .
All other edges between the sets X1 − Sl and Y1 − Tl of the graph Gl will be
conserved in the new graph Gl1 .
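Under the update rule of Definition 2.4 (decrease the labels on Sl by dl, increase those on Tl by dl), this step is easy to write out in C; in the sketch below, lx, ly, w, inS and inT are illustrative names for the labels on X1 and Y1, the weights, and the characteristic vectors of Sl and Tl.

#include <limits.h>
#define MAXN 100

int n;
int w[MAXN+1][MAXN+1];         /* w[i][j] = weight of the edge x_i y_j */
int lx[MAXN+1], ly[MAXN+1];    /* labels of x_i and y_j */
int inS[MAXN+1], inT[MAXN+1];  /* characteristic vectors of S_l and T_l */

void update_labels(void)
{
    int i, j, d = INT_MAX;
    for (i = 1; i <= n; i++)           /* d = min over S_l x (Y1 - T_l) */
        if (inS[i])
            for (j = 1; j <= n; j++)
                if (!inT[j] && lx[i] + ly[j] - w[i][j] < d)
                    d = lx[i] + ly[j] - w[i][j];
    for (i = 1; i <= n; i++)
        if (inS[i]) lx[i] = lx[i] - d; /* decrease labels on S_l */
    for (j = 1; j <= n; j++)
        if (inT[j]) ly[j] = ly[j] + d; /* increase labels on T_l */
}

Edges between Sl and Tl keep the equality l(x) + l(y) = w(xy) after the update, which is exactly why they survive in the new graph Gl1.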
If the new graph Gl1 contains a perfect matching M , then by virtue
of Lemma 2.4, M is a maximum weighted perfect matching in the original
graph G and the algorithm STOPS.
If not, we find a maximum matching (but not perfect) and the corre-
sponding succinct certificate in the graph Gl1 and we continue the algorithm
as before. This continuation is explained in the example that follows:
2.5 Exercises
1. Write a complete program in C for the breadth-first search of a simple
graph. (The graph is represented by its adjacency lists.)
2. Can we use the breadth-first search procedure to find the minimum cost
in an undirected connected graph, from a given source vertex, to all
other vertices, with weights associated with each edge of the graph? If
the answer is “yes,” write a pseudo-code to find the minimum costs of
all paths from the source. The weight of each edge is > 0.
11. Prove that the Petersen graph is not a Hamiltonian graph; that is, it
does not contain an elementary cycle passing through each vertex exactly
once.
$$W \;=\; \begin{array}{c|cccc} & y_1 & y_2 & y_3 & y_4 \\ \hline x_1 & 4 & 4 & 3 & 2 \\ x_2 & 3 & 5 & 5 & 2 \\ x_3 & 3 & 2 & 2 & 4 \\ x_4 & 3 & 4 & 5 & 3 \end{array}$$
14. Write the following bi-stochastic matrix A as a convex combination of
permutation matrices:
$$A \;=\; \begin{pmatrix} 1/4 & 1/2 & 0 & 1/4 \\ 1/2 & 1/4 & 1/4 & 0 \\ 0 & 1/4 & 1/2 & 1/4 \\ 1/4 & 0 & 1/4 & 1/2 \end{pmatrix}$$
15. A diagonal of a real square matrix n×n is a set of n entries of the matrix
no two of which are in the same row or in the same column. The weight
of a diagonal is the sum of the elements of the diagonal. Find a diagonal
of maximum weight in the following 4 × 4 matrix A where
$$A \;=\; \begin{pmatrix} 5 & 3 & 2 & 2 \\ 2 & 3 & 5 & 3 \\ 3 & 4 & 2 & 3 \\ 3 & 4 & 2 & 5 \end{pmatrix}$$
16. Using the König-Hall theorem for perfect matching in a bipartite graph
proved in this chapter, prove the following theorem due to the eminent
graph theorist Tutte.
Let G = (X, E) be a simple graph. For a subset S ⊂ X, denote by
co (G − S) the number of connected components with an odd number of
vertices of the subgraph induced by X − S.
Tutte's theorem: A graph G has a perfect matching if and only if
co (G − S) ≤ |S| for all subsets S ⊂ X.
(For a proof, see [11].)
Chapter 3
Algebraic Structures I (Matrices,
Groups, Rings, and Fields)
3.1 Introduction
In this chapter, the properties of the fundamental algebraic structures,
namely, matrices, groups, rings, vector spaces, and fields are presented. In
addition, the properties of finite fields which are so basic to finite geometry,
coding theory, and cryptography are also discussed.
3.2 Matrices
A complex matrix A of type (m, n) or an m by n complex matrix is a
rectangular arrangement of mn complex numbers in m rows and n columns
in the form:
$$A \;=\; \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{pmatrix}.$$
A(B + C) = AB + AC,
(A + B)C = AC + BC, and
(AB)C = A(BC)
where B11 and B21 have two rows each. Hence B12 and B22 have also two
rows each. This forces B11 and B21 to have two columns each while B12 and
and we find that the matrix on the right, on simplification, does indeed yield
the product AB.
Note that it is not enough if we simply partition the matrices A and B.
What is important is that the partitioned matrices should be conformable
for multiplication. This means, in the above example, that the 8 products
A11 B11 , A12 B21 , . . . A22 B22 are all defined.
The general case is similar.
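In the 2 × 2 block case, for instance, the product takes the familiar form below (a worked restatement of the rule above; it is valid whenever all eight block products are defined):

$$AB \;=\; \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} \;=\; \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{pmatrix}.$$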
i. (At )t = A, and
ii. (AB)t = B t At , whenever the product AB is defined
$$a_{i1}A_{j1} + a_{i2}A_{j2} + \cdots + a_{in}A_{jn} \;=\; a_{1j}A_{1i} + a_{2j}A_{2i} + \cdots + a_{nj}A_{ni} \;=\; \begin{cases} \det A & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}$$
Corollary 3.1. Let A be a non-singular matrix, that is, det A ≠ 0. Set A−1 =
(1/det A)(adj A). Then, AA−1 = A−1 A = In , where n is the order of A.
The matrix A−1 , as defined in Corollary 3.1, is called the inverse of the
(non-singular) matrix A. If A, B are square matrices of the same order with
AB = I, then B = A−1 and A = B −1 . These are easily seen by premultiplying
the equation AB = I by A−1 and postmultiplying it by B −1 . Note that A−1
and B −1 exist since, taking determinants of both sides of AB = I, we get
(det A)(det B) = 1, so that det A ≠ 0 and det B ≠ 0.
3.3.7 Exercises
1. If $A = \begin{pmatrix} 3 & -4 \\ 1 & -1 \end{pmatrix}$, prove by induction that $A^{k} = \begin{pmatrix} 1+2k & -4k \\ k & 1-2k \end{pmatrix}$ for any positive integer k.
2. If $M = \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix}$, prove that $M^{n} = \begin{pmatrix} \cos(n\alpha) & \sin(n\alpha) \\ -\sin(n\alpha) & \cos(n\alpha) \end{pmatrix}$, n ∈ N.
3. Compute the transpose, adjoint and inverse of the matrix $\begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 1 & 0 & 1 \end{pmatrix}$.
4. If $A = \begin{pmatrix} 1 & 3 \\ -2 & 2 \end{pmatrix}$, show that $A^{2} - 3A + 8I = 0$. Hence compute $A^{-1}$.
10. Show that every complex square matrix is the unique sum of a Hermitian
and a skew-Hermitian matrix.
3.4 Groups
Groups constitute an important basic algebraic structure that occurs very
naturally not only in mathematics, but also in many other fields such as
physics and chemistry. In this section, we present the basic properties of
groups. In particular, we discuss Abelian and non-Abelian groups, cyclic
groups, permutation groups and homomorphisms and isomorphisms of groups.
We establish Lagrange’s theorem for finite groups and the basic isomorphism
theorem for groups.
Examples
1. (N, ·) is a semigroup, where · denotes the usual multiplication.
Definition 3.3. A group is a binary system (G, ·) such that the following
axioms are satisfied:
b=b·e
= b · (a · c) (as c is an inverse of a)
= (b · a) · c by the associativity of · in G
= e · c (as b is an inverse of a)
= c.
Thus, henceforth, we can talk of “The identity element e” of the group (G, ·),
and “The inverse element a−1 of a” in (G, ·).
Lemma 3.1. In a group, both the cancellation laws are valid, that is, if a, b, c
are elements of a group G with ab = ac, then b = c (left cancellation law), and
if ba = ca, then b = c (right cancellation law).
Examples Continued
1. [Klein’s 4-group K4 ] This is a group of order 4. If its elements are e, a, b, c,
the group table of K4 is given by Table 3.1.
This gives
Thus, f r leaves B fixed and flips A and C in ABC. There are six con-
gruent transformations of an equilateral triangle and they form a group as per
group Table 3.2
For instance, r2 f r and rf are obtained as follows:
Thus, r2 f r = rf , and similarly the other products can be verified. The result-
ing group is known as the dihedral group D3 . It is of order 2 · 3 = 6.
$r^{4} = e = f^{2} = (rf)^{2}$.
3.7 Subgroups
Definition 3.6. A subset H of a group (G, ·) is a subgroup of (G, ·) if (H, ·)
is a group.
Proof. By definition, ⟨S⟩ is contained in every subgroup H of G containing S.
Since ⟨S⟩ is itself a subgroup of G containing S, it is the smallest subgroup of
G containing S.
By Corollary 3.2, ⟨a⟩ is the smallest subgroup of G containing a.
2. The group of n-th roots of unity, n ≥ 1. Let G be the set of n-th roots
of unity so that
$$G = \{\, \omega, \omega^{2}, \ldots, \omega^{n} = 1 \,\}, \qquad \omega = \cos\frac{2\pi}{n} + i \sin\frac{2\pi}{n}.$$
Then, G is a cyclic group of order n generated by ω, that is, G =< ω >.
In fact, $\omega^{k}$, 1 ≤ k ≤ n, generates G iff (k, n) = 1. Hence, the number
of generators of G is φ(n), where φ is the Euler's totient function.
If G = ⟨a⟩ = $\{a^{n} : n \in \mathbb{Z}\}$, then since for any two integers n and m,
$a^{n} a^{m} = a^{n+m} = a^{m} a^{n}$, G is Abelian. In other words, every cyclic group
is Abelian. However, the converse is not true. K4 , the Klein's 4-group (see
Table 3.1 of Section 3.4), is Abelian but not cyclic since K4 has no element of
order 4.
Proof. Let G = ⟨a⟩ be a cyclic group, and H, a subgroup of G. If H = {e},
then H is trivially cyclic. So assume that H ≠ {e}. As the elements of G are
powers of a, $a^{n} \in H$ for some non-zero integer n. Then, its inverse $a^{-n}$ also
belongs to H, and of n and −n at least one is a positive integer. Let
s be the least positive integer such that $a^{s} \in H$ (recall that H ≠ {e} as per
our assumption). We claim that H = ⟨$a^{s}$⟩, the cyclic subgroup of G generated
by $a^{s}$. To prove this, we have to show that each element of H is a power of $a^{s}$.
Let g be any element of H. As g ∈ G, $g = a^{m}$ for some integer m. By division
algorithm,
m = qs + r, 0 ≤ r < s.
is also a permutation on S.
Let B denote the set of all bijections on S. Then, it is easy to verify that
(B, ·), where · is the composition map, is a group. The identity element of this
group is the identity function e on S.
What is the order of the group P? Clearly σ(1) has n choices, namely, any
one of 1, 2, . . . , n. Having chosen σ(1), σ(2) has n − 1 choices (as σ is 1 − 1,
σ(1) = σ(2)). For a similar reason, σ(3) has n−2 choices and so on, and finally
σ(n) has just one left out choice. Thus, the total number of permutations on
S is
n · (n − 1) · (n − 2) · · · 2 · 1 = n!.
Example
Let S = {1, 2, 3, 4, 5}, and let σ and τ ∈ S5 be given by
$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 5 & 4 & 1 \end{pmatrix}, \qquad \tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 2 & 4 & 1 & 3 \end{pmatrix}.$$
Then,
$$\sigma \cdot \tau = \sigma\tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 3 & 4 & 2 & 5 \end{pmatrix}, \quad \text{and} \quad \sigma^{2} = \sigma \cdot \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 5 & 1 & 4 & 2 \end{pmatrix}.$$
Example 3.1
Let $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 4 & 5 & 1 & 2 & 3 & 7 & 9 & 8 & 6 \end{pmatrix}$.
Then, σ = (14253)(679)(8) = (13)(15)(12)(14)(69)(67), a product of an
even number of transpositions.
Lemma 3.3. Any two left cosets of a subgroup H of a group G are equipotent
(that is, they have the same cardinality). Moreover, they are equipotent to H.
φ : aH → bH, defined by φ(ah) = bh for h ∈ H, is a bijection.
Example 3.2
It is not necessary that aH = Ha for all a ∈ G. For example, consider
S3 , the symmetric group of degree 3. The 3! = 6 permutations of S3 are
given by
$$S_3 = \left\{ e = \begin{pmatrix} 1&2&3 \\ 1&2&3 \end{pmatrix},\; \begin{pmatrix} 1&2&3 \\ 1&3&2 \end{pmatrix} = (23),\; \begin{pmatrix} 1&2&3 \\ 3&2&1 \end{pmatrix} = (13),\; \begin{pmatrix} 1&2&3 \\ 2&1&3 \end{pmatrix} = (12),\; \begin{pmatrix} 1&2&3 \\ 2&3&1 \end{pmatrix} = (123),\; \begin{pmatrix} 1&2&3 \\ 3&1&2 \end{pmatrix} = (132) \right\}.$$
For instance, take H = {e, (12)} and a = (123). Then aH = {(123), (123)(12)} =
{(123), (13)}, while Ha = {(123), (12)(123)} = {(123), (23)}, so that aH ≠ Ha.
Proof. Suppose aH and bH are two left cosets of the subgroup H of a group G,
where a, b ∈ G. If aH and bH are disjoint, there is nothing to prove. Otherwise,
aH ∩ bH ≠ ∅, and therefore, there exist h1 , h2 ∈ H with ah1 = bh2 . This
however means that $a^{-1}b = h_1 h_2^{-1} \in H$. So by Proposition 3.2, aH = bH.
Example 3.3
For the subgroup H of Example 3.2, we have seen that (123)H =
{(123), (13)}. Now (12)H = {(12)e, (12)(12)} = {(12), e} = H, and
hence (123)H ∩ (12)H = φ. Also (23)H = {(23)e, (23)(12)} = {(23) ,
(132)}, and (13)H = (13){e, (12)} = {(13), (123)} = (123)H. Note that
(13)−1 (123) = (13)(123) = (12) ∈ H (refer to Proposition 3.2).
ω i ←→ i, 0 ≤ i ≤ 5,
f (a + b) = f (a)f (b)
and so on.
We remark that the homomorphism in the last list of examples is not onto
while those in Examples 1, 2 and 4 are onto. The homomorphisms in Examples
1 and 2 are isomorphisms. The isomorphism in Example 1 is an isomorphism
Property 3.1. f (e) = e′ , that is, the image of the identity element e of G
under f is the identity element e′ of G′ .
Proof.
i. Let f (a), f (b) ∈ f (G), where a, b ∈ G. Then, f (a)f (b) = f (ab) ∈ f (G),
as ab ∈ G.
ii. The associative law is valid in f (G). As f (G) ⊂ G′ and G′ is a group,
f (G) satisfies the associative law.
iii. By Property 3.1, the element f (e) ∈ f (G) acts as the identity element
of f (G).
Proof.
$$f(a) = f(b) \;\Leftrightarrow\; f(a)\,(f(b))^{-1} = e' \;\Leftrightarrow\; f(a)f(b^{-1}) = e' \;\Leftrightarrow\; f(ab^{-1}) = e' \;\Leftrightarrow\; ab^{-1} \in K,$$
where e′ is the identity element of G′ (the second step by Property 3.2).
Example 3.4
Let G = ⟨ω⟩ = { 1, ω, ω 2 } be the group of cube roots of unity, where
ω = cos(2π/3) + i sin(2π/3). Let f : G → G be defined by f (ω) = ω 2 .
To make f a group homomorphism, we have to set f (ω 2 ) = f (ω · ω) =
f (ω)f (ω) = ω 2 ·ω 2 = ω, and f (1) = f (ω 3 ) = (f (ω))3 = (ω 2 )3 = (ω 3 )2 =
13 = 1. In other words, the homomorphism f : G → G is uniquely
defined on G once we set f (ω) = ω 2 . Clearly, f is onto. Further, only
1 is mapped to 1 by f , while the other two elements ω and ω 2 are
interchanged by f . Thus, Ker f = {1}. So by Property 3.7, f is an
isomorphism of G onto G, that is, an automorphism of G.
Our next theorem shows that there is a natural way of generating at least
one set of automorphisms of a group.
The conditions (3.3) and (3.4) give the following equivalent definition of a
normal subgroup.
Definition 3.20. A subgroup N of a group G is normal in G iff aN a−1 = N
(equivalently, aN = N a) for every a ∈ G.
(aH)(a−1 H) = (aa−1 )H = eH = H,
Example 3.5
We now present an example of a quotient group. Let G = (R2 , +), the
additive group of points of the plane R2 . (If (x1 , y1 ) and (x2 , y2 ) are two
points of R2 , their sum (x1 , y1 ) + (x2 , y2 ) is defined as (x1 + x2 , y1 + y2 ).
The identity element of this group is (0, 0) and the inverse of (x, y) is
(−x, −y).) Let H be the subgroup {(x, 0) : x ∈ R}, the X-axis. If (a, b) is
any point of R2 , then
(a, b) + H = { (a + x, b) : x ∈ R }
= the line through (a, b) parallel to the X-axis. Clearly, if (a, b) + H =
(a′ , b′ ) + H, then (a′ − a, b′ − b) ∈ H = X-axis, and therefore the Y-coordinate
b′ − b = 0, so b′ = b. In other words, the line through (a, b) and the
line through (a′ , b′ ), both parallel to the X-axis, are the same iff b = b′ ,
as is expected (see Figure 3.1). For this reason, this line may be taken
as (0, b) + H. Thus, the cosets of H in R2 are the lines parallel to the
X-axis, and therefore the elements of the quotient group R2 /H are the
lines parallel to the X-axis. If (a, b) + H and (a′ , b′ ) + H are two elements
of R2 /H, we define their sum to be (a + a′ , b + b′ ) + H = (0, b + b′ ) + H,
the line through (0, b + b′ ) parallel to the X-axis. Note that (R2 , +)
is an Abelian group and so H is a normal subgroup of R2 . Hence the
above sum is well defined. The above addition defines a group structure
on the set of lines parallel to the X-axis, that is, on the elements of R2 /H.
The identity element of the quotient group is the X-axis = H, and the
inverse of (0, b) + H is (0, −b) + H.
element of G′ . But $f(g_1)(f(g_2))^{-1} = f(g_1)f(g_2^{-1}) = f(g_1 g_2^{-1})$. Hence
$f(g_1 g_2^{-1}) = e'$, and so $g_1 g_2^{-1} \in K$, and consequently, g1 K = g2 K (by
Property 3.4). Thus, φ is 1 − 1.
3.16 Exercises
1. Let G = GL(n, C) be the set of all invertible complex matrices A of
order n. If the operation · denotes matrix multiplication, show that G is
a group under ·.
2. Let G denote the set of all real matrices of the form $\begin{pmatrix} a & 0 \\ b & 1 \end{pmatrix}$ with a ≠ 0.
Show that G is a group under matrix multiplication.
3. Which of the following semigroups are groups?
i. (Q, ·)
ii. (R∗ , ·)
iii. (Q, +)
iv. (R∗ , ·)
v. The set of all 2 by 2 real matrices under matrix multiplication.
vi. The set of all 2 by 2 real matrices of the form $\begin{pmatrix} a & 0 \\ b & 1 \end{pmatrix}$.
4. Prove that a finite semigroup in which the right and left cancellation laws
are valid is a group, that is, if H is a finite semigroup in which both the
cancellation laws are valid (that is, ax = ay implies that x = y, and
xa = ya implies that x = y, where a, x, y ∈ H), is a group.
5. Prove that a semigroup G in which the equations ax = b and yc = d
(where a, b, c, d ∈ G) are solvable in G, is a group.
8. Prove that any group of even order has an element of order 2. (Hint:
a ≠ e and o(a) = 2 iff a = a⁻¹. Pair off the elements (a, a⁻¹) with a ≠ a⁻¹.)
9. Give an example of a non-cyclic group each of whose proper subgroups
is cyclic.
10. Show that no group can be the set union of two of its proper subgroups.
21. Show that any infinite cyclic group is isomorphic to (Z, +).
22. Show that the set {e^{in} : n ∈ Z} forms a multiplicative group. Show that
this group is isomorphic to (Z, +). Is this group cyclic?
23. Find a homomorphism of the additive group of integers to itself that is
not onto.
28. Give the group table of the group S3 . From the table, find the center of
S3 .
29. Show that if a subgroup H of a group G is generated by a subset S of
G, then H is a normal subgroup of G iff aSa⁻¹ ⊆ S for each a ∈ G.
30. Show that a group G is Abelian iff the center C(G) of G is G.
33. Let G be the set of all roots of unity, that is, G = {ω ∈ C : ω n = 1 for
some n ∈ N}. Prove that G is an Abelian group that is not cyclic.
34. If A and B are normal subgroups of a group G such that A ∩ B = {e},
show that ab = ba for all a ∈ A and b ∈ B.
35. If H is the only subgroup of a given finite order in a group G, show that
H is normal in G.
3.17 Rings
The study of commutative rings arose as a natural abstraction of the alge-
braic properties of the set of integers, while that of fields arose out of the sets
of rational, real and complex numbers.
We begin with the definition of a ring and then proceed to establish some
of its basic properties.
R3 : For all a, b, c ∈ A,
a · (b + c) = a · b + a · c (left distributive law)
(a + b) · c = a · c + b · c (right distributive law).
It is customary to write ab instead of a · b.
Examples of Rings
1. A = Z, the set of all integers with the usual addition + and the usual
multiplication taken as ·.
2. A = 2Z, the set of even integers with the usual addition and multiplica-
tion.
Example 3.6
Let A = M₂(Z), the set of all 2 by 2 matrices with integers as entries.
A is a ring with the usual matrix addition + and the usual matrix
multiplication ·. It is a non-commutative ring, since M = ( 1 0 ; 1 0 )
and N = ( 0 1 ; 0 1 ) are in A, but MN ≠ NM. The ring A also has
zero divisors; for instance,

( 1 0 ; 1 0 ) ( 0 0 ; 0 1 ) = ( 0 0 ; 0 0 ).
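Both claims are easy to confirm numerically. Here is a minimal Python sketch (using a small hand-rolled 2 × 2 product rather than any library) checking non-commutativity and the zero-divisor identity above.

    def matmul(P, Q):      # product of two 2x2 integer matrices
        return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    M = [[1, 0], [1, 0]]
    N = [[0, 1], [0, 1]]
    print(matmul(M, N) != matmul(N, M))   # True: MN != NM

    Z = [[0, 0], [0, 1]]
    print(matmul(M, Z))                   # [[0, 0], [0, 0]]: zero divisors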
Theorem 3.12. The following statements are true for any ring A.
1. a · 0 = 0 · a = 0 for any a ∈ A.
Proof. Exercise.
3.19 Exercises
1. Prove that Zn , n ≥ 2, is an integral domain iff n is a prime.
8. Prove that any ring A with identity element and cardinality p, where p is
a prime, is commutative. (Hint: Verify that the elements 1, 1 + 1, . . . , 1 +
1 + · · · + 1 (p times) are all distinct elements of A.)
3.20 Ideals
One of the special classes of subrings of a ring is the class of ideals. Con-
sider, for example, the set S of all multiples of 3 (positive multiples, negative
multiples, and zero multiple) in the ring Z of integers. Then, it is easy to see
that S is a subring of Z. More than this, if n ∈ Z and a ∈ S, then na ∈ S
as na is also a multiple of 3. We then call S an ideal in Z. We now present
the formal definition of an ideal in a general ring (that is, not necessarily
commutative) A.
i. S is a subring of A, and
ii. for every a ∈ A and s ∈ S, both as ∈ S and sa ∈ S.
Example 3.7
Let A be the ring of all 2 × 2 matrices over Z. (Note: A is a non-
commutative ring.) Let

S = { ( a 0 ; b 0 ) : a, b ∈ Z }.

Then, as

( a 0 ; b 0 ) ( c 0 ; d 0 ) = ( ac 0 ; bc 0 ),

it is easy to see that S is a subring of A. Moreover, if

( a 0 ; b 0 ) ∈ S, and ( x y ; z t ) ∈ A,

we have

( x y ; z t ) ( a 0 ; b 0 ) = ( xa + yb 0 ; za + tb 0 ) ∈ S.

Hence S is a left ideal of A. However, S is not a right ideal of A. For
instance,

( 5 0 ; 6 0 ) ∈ S, but ( 5 0 ; 6 0 ) ( 1 2 ; 3 4 ) = ( 5 10 ; 6 12 ) ∉ S.
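The left/right asymmetry can be checked numerically. The following Python sketch tests membership in S directly (the helper names are illustrative only).

    def matmul(P, Q):
        return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    def in_S(M):                      # second column must be zero
        return M[0][1] == 0 and M[1][1] == 0

    s = [[5, 0], [6, 0]]              # an element of S
    m = [[1, 2], [3, 4]]              # an arbitrary element of A

    print(matmul(m, s), in_S(matmul(m, s)))   # [[17, 0], [39, 0]] True
    print(matmul(s, m), in_S(matmul(s, m)))   # [[5, 10], [6, 12]] False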
Example 3.8
Let A = Z[x], the ring of all polynomials in x with integer coefficients.
Let S = ⟨2, x⟩, the ideal generated by 2 and x in A = the smallest ideal
containing 2 and x in A = the set of all integer polynomials in x with even
constant terms. (Note that the integer 0 is also an integer polynomial.
Also remember that A is a commutative ring.)
Example 3.9
Let A = Z and B = Z₅, consisting of the integers modulo 5. Hence Z₅ =
{0, 1, 2, 3, 4}, where addition and multiplication are taken modulo 5. (For
example, 2 + 3 = 0 and 2 · 3 = 1 in Z₅.) Clearly, Z₅ is a ring. Now
consider the map f : Z → Z₅ defined by f(a) = a₀, where a₀ ∈ Z₅ and
a ≡ a₀ (mod 5). Then, f is a ring homomorphism.
In the above example, what are all the elements that are mapped to 3?
They are all the numbers n ≡ 3 (mod 5) in Z. We denote this set by [3], where
[3] = {. . . , −12, −7, −2, 3, 8, 13, . . . }. We call this set the residue class
modulo 5 defined by 3 in Z. Note that [3] = [8], etc. Hence Z₅ = {[0], [1], [2], [3], [4]},
where [m] = [n] iff m ≡ n(mod 5) for any integers m and n iff m − n ∈ (5),
the ideal generated by 5 in Z. The ring Z5 is often referred to as the residue
class ring or quotient ring modulo the ideal (5).
More generally, let S be an ideal in a commutative ring A, and let A/S
denote the set of residue classes modulo the ideal S, that is, the set of the
residue classes [a] = {a + s : s ∈ S} defined by the elements a ∈ A. Then A/S is the
residue class ring or quotient ring defined by the ideal S in the ring A.
1. S is an ideal of A, and
Definition 3.30. A principal ideal ring (or more specifically, a principal ideal
domain (P.I.D.)) is a commutative ring A without zero divisors and with unit
element 1 (that is, an integral domain) and in which every ideal is principal.
Example 3.10
The ring Z of integers is a P.I.D. This is because, first, Z is an integral
domain. Suppose S is an ideal of Z. If S = (0), the zero ideal, S is
principal. So assume that S ≠ (0). If a ∈ S and a ≠ 0, then (−1)a = −a ∈ S.
Of the two numbers a and −a, one is a positive integer. Let s be the
least positive integer belonging to S.
Claim: S = ⟨s⟩, the ideal generated by s. Let b be any element of S.
We have to show that b ∈ ⟨s⟩, that is, b is an integral multiple of s. By
the division algorithm in Z,

b = qs + r, 0 ≤ r < s.

Then r = b − qs ∈ S and, by the minimality of s, r = 0. Hence b = qs ∈ ⟨s⟩,
and S = ⟨s⟩.
Example 3.11
Let F be a field. Then, the ring F [x] of polynomials in x with coef-
ficients in F is a P.I.D.
We now imitate the proof of Example 3.10. Let S be any non-zero ideal
of F[x], and let s(x) be a non-zero polynomial of least degree in S. Let
a(x) be any polynomial of F[x] in S. Divide a(x) by s(x) by the Euclidean
algorithm. This gives

a(x) = q(x)s(x) + r(x),

where either r(x) = 0 or deg r(x) < deg s(x). As a(x) and s(x) are in S,
q(x)s(x) ∈ S, and a(x) − q(x)s(x) = r(x) ∈ S. By the choice of s(x),
we must have r(x) = 0 ⇒ a(x) = q(x)s(x) ⇒ every polynomial in S is
a multiple of s(x) in F[x] ⇒ S is a principal ideal in F[x] ⇒ F[x] is a
principal ideal domain.
Example 3.12
The ring Z[x], though an integral domain, is not a P.I.D. The ideal
S = ⟨2, x⟩ in Z[x], which consists of all polynomials in x with even
constant terms, is not a principal ideal. For, suppose S is a principal
ideal in Z[x], say S = ⟨a(x)⟩. As 2 ∈ S, a(x) must divide 2, and hence S
must be equal to ⟨2⟩, since 2 cannot be a multiple of a non-constant
polynomial (and a(x) = ±1 would give S = Z[x], which is false, as 1 ∉ S).
But then every polynomial in S must have even integers as coefficients,
a contradiction (for instance, 2 + 3x ∈ S). Hence Z[x] is not a P.I.D.
Proof.
Corollary 3.3. Let F be a field and let φ : F [x] → F [x]/(xn − 1) be the ring
homomorphism defined by:
3.22 Fields
We now discuss the fundamental properties of fields and then go on to
develop in the next chapter the properties of finite fields that are basic to
coding theory and cryptography. If rings are algebraic abstractions of the
set of integers, fields are algebraic abstractions of the sets Q, R and C (as
mentioned already).
Definition 3.32. A field is a commutative ring with identity element in which
every non-zero element is a unit.
Every field is an integral domain. To see this, all we have to verify is that
a field F has no zero divisors. Indeed, if ab = 0 and a ≠ 0, then, as a⁻¹ exists in F, we
have 0 = a⁻¹(ab) = (a⁻¹a)b = b in F. However, not every integral domain is a
field. For instance, the ring Z of integers is an integral domain but not a field.
(Recall that the only non-zero integers which are units in Z are 1 and −1.)
Let F be a field whose zero and identity elements are denoted by 0F and
1F, respectively. A subfield of F is a subset F′ of F such that F′ is also a
field with the same addition and multiplication operations of F. This of course
means that the zero and unity elements of F′ are the same as those of F. It is
clear that the intersection of any family of subfields of F is again a subfield of
F. Let P denote the intersection of the family of all subfields of F. Naturally,
this subfield P is the smallest subfield of F; for if P′ is a subfield of
F that is properly contained in P, then P ⊆ P′ ⊊ P, a contradiction. This
smallest subfield P of F is called the prime field of F. Necessarily, 0F ∈ P
and 1F ∈ P.
As 1F ∈ P , the elements 1F , 1F + 1F = 2 · 1F , 1F + 1F + 1F = 3 · 1F and, in
general, n · 1F , n ∈ N, all belong to P . There are then two cases to consider:
Case 1: The elements n · 1F , n ∈ N, are all distinct. In this case, the subfield
P itself is an infinite field and therefore F is an infinite field.
Case 2: The elements n·1F , n ∈ N, are not all distinct. In this case, there exist
r, s ∈ N with r > s such that r·1F = s·1F , and therefore, (r−s)·1F = 0, where
r−s is a positive integer. Hence, there exists a least positive integer p such that
p·1F = 0. We claim that p is a prime number. If not, p = p1 p2 , where p1 and p2
are positive integers less than p. Then, 0 = p·1F = (p1 p2 )·1F = (p1 ·1F )(p2 ·1F )
gives, as F is a field, either p1 · 1F = 0 or p2 · 1F = 0. But this contradicts the
choice of p. Thus, p is prime.
A field of characteristic zero is necessarily infinite (as its prime field already
is). A finite field is necessarily of prime characteristic. However, there are
infinite fields with prime characteristic. Note that if a field F has characteristic
p, then px = 0 for each x ∈ F .
iii. For a field F , denote by F [X] the set of all polynomials in X over F , that
is, polynomials whose coefficients are in F . F [X] is an integral domain
and the group of units of F [X] = F ∗ , the set of all non-zero elements
of F .
iv. The field Zp(X) of rational functions of the form a(X)/b(X), where
a(X) and b(X) are polynomials in X over Zp, p being a prime, and
b(X) ≠ 0, is an infinite field of (finite) characteristic p.
So assume that

(x + y)^{p^n} = x^{p^n} + y^{p^n}, n ∈ N.

Then, (x + y)^{p^{n+1}} = ((x + y)^{p^n})^p
= (x^{p^n} + y^{p^n})^p (by the induction assumption)
= (x^{p^n})^p + (y^{p^n})^p (by Equation 3.3)
= x^{p^{n+1}} + y^{p^{n+1}}. (3.4)

Next we consider (x − y)^{p^n}. If p = 2, then −y = y and so the result is valid.
If p is an odd prime, change y to −y in Equation 3.4. This gives

(x − y)^{p^n} = x^{p^n} + (−y)^{p^n}
= x^{p^n} + (−1)^{p^n} y^{p^n}
= x^{p^n} − y^{p^n},

since (−1)^{p^n} = −1 when p is odd.
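The "freshman's dream" identity above is easy to test numerically in Zp. The following Python sketch exhausts all pairs x, y in Zp for a sample prime p and exponent p^n (any small prime works equally well).

    # Check (x ± y)^(p^n) ≡ x^(p^n) ± y^(p^n) (mod p) for all x, y in Z_p.
    p, n = 7, 2
    e = p ** n
    for x in range(p):
        for y in range(p):
            assert pow(x + y, e, p) == (pow(x, e, p) + pow(y, e, p)) % p
            assert pow((x - y) % p, e, p) == (pow(x, e, p) - pow(y, e, p)) % p
    print("(x ± y)^(p^n) ≡ x^(p^n) ± y^(p^n) (mod", p, ") verified")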
Chapter 4
Algebraic Structures II
(Vector Spaces and Finite Fields)
Definition 4.1. A vector space (or linear space) V over a field F is a non-
void set V whose elements satisfy the following axioms:
dⁿy/dxⁿ + C₁ d^{n−1}y/dx^{n−1} + · · · + C_{n−1} y = 0. (4.1)
Clearly, if y1 (x) and y2 (x) are two solutions of the differential Equa-
tion 4.1, then so is y(x) = α1 y1 (x) + α2 y2 (x), α1 , α2 ∈ R. It is now easy
to verify that the axioms of a vector space are all satisfied.
4.2 Subspaces
The notion of a subspace of a vector space is something very similar to the
notions of a subgroup, subring, and subfield.
Example 4.1
We shall determine the smallest subspace W of R3 containing the
vectors (1, 2, 1) and (2, 3, 4).
Clearly, W must contain the subspace spanned by (1, 2, 1), that is,
the line joining the origin (0, 0, 0) and (1, 2, 1). Similarly, W must also
contain the line joining (0, 0, 0) and (2, 3, 4). These two distinct lines
meet at the origin and hence define a unique plane through the origin,
and this is the subspace spanned by the two vectors (1, 2, 1) and (2, 3, 4).
(See Proposition 4.3 below.)
u = α1 s1 + · · · + αr sr , and
v = β1 s1 + · · · + βt st
⟨u₁, . . . , uₙ⟩ ⊆ ⟨u₁, . . . , uₙ; v⟩
Conversely, if

w = α₁u₁ + · · · + αₙuₙ + βv ∈ ⟨u₁, . . . , uₙ; v⟩,

then, since v = γ₁u₁ + · · · + γₙuₙ for some γᵢ ∈ F,

w = (α₁u₁ + · · · + αₙuₙ) + β(γ₁u₁ + · · · + γₙuₙ) = Σ_{i=1}^{n} (αᵢ + βγᵢ)uᵢ ∈ ⟨u₁, . . . , uₙ⟩.
α1 v1 + α2 v2 + · · · + αn vn = 0, αi ∈ F
0 · v1 + 0 · v2 + · · · + 0 · vn = 0.
In this case we also say that the vectors v1 , . . . , vn are linearly indepen-
dent over F . In the above equation, the zero on the right refers to the
zero vector of V , while the zeros on the left refer to the scalar zero, that
is, the zero element of F .
i. The zero vector of V forms a linearly dependent set since it satisfies the
non-trivial equation 1 · 0 = 0, where 1 ∈ F and 0 ∈ V .
ii. Two vectors of V are linearly dependent over F iff one of them is a
scalar multiple of the other.
α·1+β·i=0
Example 4.2
The vectors e1 = (1, 0, 0), e2 = (0, 1, 0) and e3 = (0, 0, 1) form a
basis for R3 . This follows from the following two facts.
1. {e₁, e₂, e₃} is linearly independent in R³. In fact, α₁e₁ + α₂e₂ +
α₃e₃ = 0, αᵢ ∈ R, implies that (α₁, α₂, α₃) = (0, 0, 0), and hence αᵢ = 0, 1 ≤ i ≤ 3.
2. ⟨e₁, e₂, e₃⟩ = R³. To see this, any vector of ⟨e₁, e₂, e₃⟩ is of the
form α₁e₁ + α₂e₂ + α₃e₃ = (α₁, α₂, α₃) ∈ R³ and, conversely, any
(α₁, α₂, α₃) in R³ is α₁e₁ + α₂e₂ + α₃e₃ and hence belongs to
⟨e₁, e₂, e₃⟩.
α1 v1 + · · · + αk vk = 0.
Lemma 4.2 implies, by Proposition 4.4, that under the stated conditions
on vk ,
⟨v₁, . . . , vₖ, . . . , vₙ⟩ = ⟨v₁, . . . , vₖ₋₁, vₖ₊₁, . . . , vₙ⟩ = ⟨v₁, . . . , v̂ₖ, . . . , vₙ⟩,

where the symbol ∧ upon vₖ indicates that the vector vₖ should be deleted.
We next prove a very important property of finite-dimensional vector
spaces.
S₃ = {v₁, v₂; u₁, . . . , û_{i₁}, . . . , û_{i₂}, . . . , uₘ}
   = {v₁, v₂; u₁, . . . , u_{i₁}, . . . , u_{i₂}, . . . , uₘ} \ {u_{i₁}, u_{i₂}},

and ⟨S₃⟩ = V.
Note that we have actually shown that any finite spanning subset of a
finite-dimensional vector space V does indeed contain a finite basis of V .
Theorem 4.1 makes the following definition unambiguous.
S = {e1 , . . . , en ; f1 , . . . , fn }
over Q. Hence
iii. Ri +cRj : addition to the i-th row of A, c times the j-th row of A, c being
a scalar.
Note that, from our earlier discussions, rank of A = row rank of A = column
rank of A. Consequently, a set of n vectors of Rⁿ is linearly independent
iff the determinant formed by them is non-zero; equivalently, the n × n matrix
formed by them is invertible.
Example 4.3
Find the row-reduced echelon form of
A = [ 1 2 3 −1 ]
    [ 2 1 −1 4 ]
    [ 3 3 2 3 ]
    [ 6 6 4 6 ].
Remark 4.2. Since the last three rows of A₁ are proportional (that is, each of
these rows is a multiple of the others), any 3 × 3 submatrix of A₁ will be singular.
Since A₁ has a non-singular submatrix of order 2 (for example, ( 1 2 ; 0 −3 )), A₁
is of rank 2, and we can conclude that A is also of rank 2.
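The row reduction of Example 4.3 can be reproduced with a computer algebra system. The following Python sketch assumes the sympy library is available.

    # Row-reduced echelon form and rank of the matrix A of Example 4.3.
    from sympy import Matrix

    A = Matrix([[1, 2, 3, -1],
                [2, 1, -1, 4],
                [3, 3, 2, 3],
                [6, 6, 4, 6]])
    R, pivots = A.rref()           # echelon form and pivot columns
    print(R)                       # two non-zero rows
    print("rank =", len(pivots))   # 2, agreeing with Remark 4.2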
Remark 4.3. The word “echelon” refers to the formation of army troops in
parallel divisions each with its front clear of that in advance.
4.8 Exercises
1. Show that Z is not a vector space over Q.
[Hint: The six matrices E_{ij} (1 ≤ i ≤ 2, 1 ≤ j ≤ 3), where E_{ij} has 1 in
position (i, j) and 0 elsewhere, for example

( 0 0 0 ; 0 1 0 ) and ( 0 0 0 ; 0 0 1 ),

form a basis for the space of all 2 by 3 real matrices. Verify this first.]
X1 + 2X2 + 3X3 − X4 = 0
2X1 + X2 − X3 + 4X4 = 0
3X1 + 3X2 + 2X3 + 3X4 = 0 (4.2)
6X1 + 6X2 + 4X3 + 6X4 = 0.
we state it as a theorem.
AX = B,
X 1 − X2 + X3 = 2
X1 + X 2 − X 3 = 0
3X1 = 6.
From the last equation, we get X1 = 2. This, when substituted in the first
two equations, yields −X2 + X3 = 0, X2 − X3 = −2 which are mutually
contradictory. Such equations are called inconsistent equations.
When are the equations represented by AX = B consistent?
where

v = (a₂₁, a₃₁, . . . , aₙ₁)ᵗ and wᵗ = (a₁₂ a₁₃ . . . a₁ₙ),
where

L = ( 1 0 ; v/a₁₁ L′ ), and U = ( a₁₁ wᵗ ; 0 U′ ).
The validity of the two middle equations on the right of Equation 4.6 can be
verified by routine block multiplication of matrices (cf: Section 3.3.1). This
method is based on the supposition that a11 and all the leading entries of
the successive Schur complements are all non-zero. If a11 is zero, we inter-
change the first row of A with a subsequent row having a non-zero first entry.
This amounts to premultiplying both sides by the corresponding permutation
matrix P yielding the matrix P A on the left. We now proceed as with the case
when a11 = 0. If a leading entry of a subsequent Schur complement is zero,
once again we make interchanges of rows—not just the rows of the relevant
Schur complement, but the full rows obtained from A. This again amounts to
premultiplication by a permutation matrix. Since any product of permutation
matrices is a permutation matrix, this process finally ends up with a matrix
P A, where P is a permutation matrix of order n.
We now present two examples, one to obtain the LU decomposition when
it is possible and another to determine the LUP decomposition.
Example 4.4
Find the LU decomposition of
A = [ 2 3 1 2 ]
    [ 4 7 4 7 ]
    [ 2 7 13 16 ]
    [ 6 10 13 15 ].

Here,

a₁₁ = 2, v = (4, 2, 6)ᵗ, wᵗ = [3, 1, 2].

Therefore

v/a₁₁ = (2, 1, 3)ᵗ, and so vwᵗ/a₁₁ = [ 6 2 4 ]
                                     [ 3 1 2 ]
                                     [ 9 3 6 ],

so that the Schur complement of A is

A₁ = [ 1 2 3 ]
     [ 4 12 14 ]
     [ 1 10 9 ].

This gives

A₁ = [ 1 0 0 ] [ 1 2 3 ]
     [ 4 1 0 ] [ 0 4 2 ]
     [ 1 2 1 ] [ 0 0 2 ].

Consequently,

A = [ 1 0 0 0 ] [ 2 3 1 2 ]
    [ 2 1 0 0 ] [ 0 1 2 3 ]
    [ 1 4 1 0 ] [ 0 0 4 2 ]
    [ 3 1 2 1 ] [ 0 0 0 2 ] = LU,

where

L = [ 1 0 0 0 ]
    [ 2 1 0 0 ]
    [ 1 4 1 0 ]
    [ 3 1 2 1 ] is unit lower-triangular, and

U = [ 2 3 1 2 ]
    [ 0 1 2 3 ]
    [ 0 0 4 2 ]
    [ 0 0 0 2 ] is upper-triangular.
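The computation can be automated. The following Python sketch is a small Doolittle-style LU routine (no pivoting; it assumes, as in the text, that all leading Schur-complement entries are non-zero) checked against Example 4.4.

    def lu(A):
        n = len(A)
        L = [[float(i == j) for j in range(n)] for i in range(n)]
        U = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i, n):         # row i of U
                U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
            for j in range(i + 1, n):     # column i of L
                L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i]
                                         for k in range(i))) / U[i][i]
        return L, U

    A = [[2, 3, 1, 2], [4, 7, 4, 7], [2, 7, 13, 16], [6, 10, 13, 15]]
    L, U = lu(A)
    print(L)   # [[1,0,0,0],[2,1,0,0],[1,4,1,0],[3,1,2,1]] (as floats)
    print(U)   # [[2,3,1,2],[0,1,2,3],[0,0,4,2],[0,0,0,2]] (as floats)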
Example 4.5
Find the LUP decomposition of
A = [ 2 3 1 2 ]
    [ 4 6 4 7 ]
    [ 2 7 13 16 ]
    [ 6 10 13 15 ].

Suppose we proceed as before. The Schur complement of A is

A₁ = [ 6 4 7 ]    [ 6 2 4 ]   [ 0 2 3 ]
     [ 7 13 16 ] − [ 3 1 2 ] = [ 4 12 14 ]
     [ 10 13 15 ]  [ 9 3 6 ]   [ 1 10 9 ].

Since the leading entry is zero, we interchange the first row of A₁ with
some other row. Suppose we interchange the first and third rows of A₁.
This amounts to considering the matrix PA instead of A, where

P = [ 1 0 0 0 ]
    [ 0 0 0 1 ]
    [ 0 0 1 0 ]
    [ 0 1 0 0 ].

Note that the first row of A₁ corresponds to the second row of A and
the last row of A₁ to the fourth row of A. This means that the Schur
complement of PA (instead of A) is

A₁ = [ 1 10 9 ]
     [ 4 12 14 ]
     [ 0 2 3 ].

This gives

A₁ = [ 1 0 0 ]      [ 1 10 9 ]
     [ 4 1 0 ]      [ 0 −28 −22 ]
     [ 0 −1/14 1 ]  [ 0 0 10/7 ].

Thus,

PA = [ 1 0 0 0 ]      [ 2 3 1 2 ]
     [ 3 1 0 0 ]      [ 0 1 10 9 ]
     [ 1 4 1 0 ]      [ 0 0 −28 −22 ]
     [ 2 0 −1/14 1 ]  [ 0 0 0 10/7 ] = L′U,

where L′ and U are the first and second matrices in the product. Notice
that we have interchanged the second and fourth rows of A while com-
puting L′. Interchanging the second and fourth rows of L′, we get

A = [ 1 0 0 0 ]      [ 2 3 1 2 ]
    [ 2 0 −1/14 1 ]  [ 0 1 10 9 ]
    [ 1 4 1 0 ]      [ 0 0 −28 −22 ]
    [ 3 1 0 0 ]      [ 0 0 0 10/7 ] = LU.
Example 4.6
Solve a system of linear equations AX = B using the LUP decomposition
method, taking A to be the matrix of Example 4.5 and B = (10, 50, 40, 25)ᵗ.
Writing A = LU as above and setting UX = Y, we first solve LY = B. This gives

[ 1 0 0 0 ]      [ Y₁ ]   [ 10 ]
[ 2 0 −1/14 1 ]  [ Y₂ ] = [ 50 ]
[ 1 4 1 0 ]      [ Y₃ ]   [ 40 ]
[ 3 1 0 0 ]      [ Y₄ ]   [ 25 ],

that is,

Y₁ = 10
2Y₁ − (1/14)Y₃ + Y₄ = 50
Y₁ + 4Y₂ + Y₃ = 40
3Y₁ + Y₂ = 25.

These can be solved in the order Y₁ (first equation), Y₂ (fourth), Y₃ (third)
and Y₄ (second); X is then obtained from UX = Y by back substitution.
4.12 Exercises
1. Examine if the following equations are consistent.
X1 + X2 + X3 + X4 = 0
2X1 − X2 + 3X3 + 4X4 = 1
3X1 + 4X3 + 5X4 = 2.
3. Solve:
X1 + X2 + X3 + X4 = 0
X1 + 3X2 + 2X3 + 4X4 = 0
2X1 + X3 − X4 = 0.
ii.
3X1 − 2X2 + X3 = 7
X1 + X2 + X3 = 12
−X1 + 4X2 − X3 = 3.
iii.
Finite fields are known as Galois fields after the French mathematician
Évariste Galois (1811–1832) who first studied them. A finite field of order q
is denoted by GF (q).
We now look at the converse of Theorem 4.5. Given a prime power pn
(where p is a prime), does there exist a field of order pn ? The answer to this
question is in the affirmative. We give below two different constructions that
yield a field of order pn .
Theorem 4.6. Given pn (where p is a prime), there exists a field of pn
elements.
Construction 1: Consider the polynomial X^{p^n} − X ∈ Zp[X] of degree p^n.
(Recall that Zp[X] stands for the ring of polynomials in X with coefficients
from the field Zp of p elements.) The derivative of this polynomial is

p^n X^{p^n −1} − 1 = −1 ∈ Zp[X],

and is therefore relatively prime to X^{p^n} − X. Hence the p^n roots of X^{p^n} − X
are all distinct. (Here, though no concept of the limit is involved, the notion
of the derivative has been employed as though it is a real polynomial.) It is
known [12] that the roots of this polynomial lie in an extension field K ⊃ Zp.
K is also of characteristic p. If a and b are any two roots of X^{p^n} − X, then

a^{p^n} = a, and b^{p^n} = b.

Now by Theorem 3.14,

(a ± b)^{p^n} = a^{p^n} ± b^{p^n},

and, by the commutativity of multiplication in K,

a^{p^n} b^{p^n} = (ab)^{p^n},

and so a ± b and ab are also roots of X^{p^n} − X. Moreover, if a is a non-zero
root of X^{p^n} − X, then so is a⁻¹, since (a⁻¹)^{p^n} = (a^{p^n})⁻¹ = a⁻¹. Also the
associative and distributive laws are valid for the set of roots since they are
all elements of the field K. Finally, 0 and 1 are also roots of X^{p^n} − X. In
other words, the p^n roots of the polynomial X^{p^n} − X ∈ Zp[X] form a field of
order p^n.
is p^n.
We now show that F is a field. Clearly, F is a commutative ring with unit
element 1 (= 0 · X^{n−1} + · · · + 0 · X + 1). Hence we need only verify that if
a(X) ∈ F is not zero, then there exists b(X) ∈ F with a(X)b(X) = 1. As
a(X) ≠ 0 and f(X) is irreducible over Zp, gcd(a(X), f(X)) = 1. So by the
Euclidean algorithm, there exist polynomials C(X) and g(X) in Zp[X] such
that

a(X)C(X) + g(X)f(X) = 1 (4.8)

in Zp[X]. Now there exists C₁(X) ∈ F with C₁(X) ≡ C(X) (mod f(X)). This
means that there exists a polynomial h(X) in Zp [X] with C(X) − C1 (X) =
h(X)f (X), and hence C(X) = C1 (X)+h(X)f (X). Substituting this in Equa-
tion 4.8 and taking modulo f (X), we get, a(X)C1 (X) = 1 in F . Hence a(X)
has C1 (X) as inverse in F . Thus, every non-zero element of F has a multi-
plicative inverse in F, and so F is a field of p^n elements.
Theorem 4.7. Any two finite fields of the same order are isomorphic under
a field isomorphism.
Example 4.7
Take p = 2 and n = 3. The polynomial X³ + X + 1 of degree 3
is irreducible over Z₂. (If it were reducible, one of the factors would be of
degree 1, and it must be either X or X + 1 = X − 1 ∈ Z₂[X]. But 0 and
1 are not roots of X³ + X + 1 ∈ Z₂[X].) The 2³ = 8 polynomials over
Z₂ reduced modulo X³ + X + 1 are

0, 1, X, X + 1, X², X² + 1, X² + X, X² + X + 1.
a₀ + a₁α + · · · + a_{n−1}α^{n−1},
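Construction 2 is easily carried out mechanically. The following Python sketch builds GF(8) = Z₂[X]/(X³ + X + 1), storing each polynomial as a bit mask (bit i = coefficient of Xⁱ), and checks that every non-zero element is invertible.

    MOD, DEG = 0b1011, 3        # bit mask of X^3 + X + 1, and its degree

    def gf_mult(a, b):          # carry-less multiply, reduced mod X^3+X+1
        r = 0
        while b:
            if b & 1:
                r ^= a          # add (XOR) a copy of a
            b >>= 1
            a <<= 1
            if a >> DEG:        # degree overflow: reduce
                a ^= MOD
        return r

    elems = list(range(8))      # the 8 polynomials of degree < 3
    for a in elems[1:]:
        assert any(gf_mult(a, b) == 1 for b in elems[1:])
    print("all 7 non-zero elements of GF(8) are invertible")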
Example 4.8
Consider the polynomial X⁴ + X + 1 ∈ Z₂[X]. This is irreducible
over Z₂ (check that it can have no linear or quadratic factor in Z₂[X]).
Let α be a root (in an extension field of Z₂) of this polynomial, so that
α⁴ + α + 1 = 0. This means that α⁴ = α + 1.
We now prove that α is a primitive element of a field of 16 elements
over Z₂ by checking that the 15 powers α, α², . . . , α¹⁵ are all distinct
and that α¹⁵ = 1. Indeed, we have
α¹ = α
α² = α²
α³ = α³
α⁴ = α + 1 (as α⁴ + α + 1 = 0 ⇒ α⁴ = α + 1, since 1 = −1 in Z₂)
α⁵ = αα⁴ = α(α + 1) = α² + α
α⁶ = αα⁵ = α³ + α²
α⁷ = αα⁶ = α⁴ + α³ = α³ + α + 1
α⁸ = αα⁷ = α⁴ + (α² + α) = (α + 1) + (α² + α) = α² + 1
α⁹ = αα⁸ = α³ + α
α¹⁰ = αα⁹ = α⁴ + α² = α² + α + 1
α¹¹ = αα¹⁰ = α³ + α² + α
α¹² = αα¹¹ = α⁴ + (α³ + α²) = (α + 1) + (α³ + α²) = α³ + α² + α + 1
α¹³ = αα¹² = α⁴ + α³ + α² + α = (α + 1) + α³ + α² + α = α³ + α² + 1
α¹⁴ = αα¹³ = α⁴ + α³ + α = (α + 1) + α³ + α = α³ + 1
α¹⁵ = αα¹⁴ = α⁴ + α = (α + 1) + α = 1.
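The table of powers is mechanical to generate. Here is a minimal Python sketch using the same bit-mask representation as before (bit i = coefficient of αⁱ), reducing via α⁴ = α + 1.

    MOD = 0b10011                    # X^4 + X + 1

    def times_alpha(a):              # multiply by alpha and reduce
        a <<= 1
        return a ^ MOD if a >> 4 else a

    a, powers = 1, []
    for k in range(1, 16):
        a = times_alpha(a)
        powers.append(a)
    assert len(set(powers)) == 15 and powers[-1] == 1
    print("alpha is primitive: 15 distinct powers, alpha^15 = 1")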
The sets Cᵢ are called the cyclotomic cosets modulo p defined with respect to
F and α. Now, corresponding to the coset Cᵢ, 0 ≤ i ≤ p^n − 1, consider the
polynomial
fᵢ(X) = (X − αⁱ)(X − α^{i·p})(X − α^{i·p²}) · · · (X − α^{i·pᵗ}).

The coefficients of fᵢ are elementary symmetric functions of αⁱ, α^{ip}, . . . , α^{ipᵗ},
and if β denotes any of these coefficients, then β satisfies the relation βᵖ = β.
Hence β ∈ Zp and fᵢ(X) ∈ Zp[X] for each i, 0 ≤ i ≤ p^n − 1. Each element of Cᵢ
determines the same cyclotomic coset, that is, Cᵢ = C_{ip} = C_{ip²} = · · · = C_{ipᵗ}.
Moreover, if j ∉ Cᵢ, then Cᵢ ∩ Cⱼ = ∅. This gives a factorization of X^{p^n} − X into
irreducible factors over Zp. In fact, X^{p^n} − X = X(X^{p^n −1} − 1), and

X^{p^n −1} − 1 = (X − α)(X − α²) · · · (X − α^{p^n −1})
             = ∏ᵢ ( ∏_{j ∈ Cᵢ} (X − αʲ) ),
where the first product is taken over all the distinct cyclotomic cosets. Fur-
thermore, each polynomial fᵢ(X) is irreducible over Zp, as shown below. To
see this, let

g(X) = a₀ + a₁X + · · · + aₖXᵏ ∈ Zp[X].

Then, g(X)ᵖ = a₀ᵖ + a₁ᵖXᵖ + · · · + aₖᵖ(Xᵏ)ᵖ (refer to Section 3.23)
            = a₀ + a₁Xᵖ + · · · + aₖX^{kp}
            = g(Xᵖ).

Consequently, if β is a root of g, then g(β) = 0, and therefore 0 = (g(β))ᵖ = g(βᵖ);
that is, βᵖ is also a root of g(X). Now let g(X) be a non-constant irreducible
factor of fᵢ(X) over Zp. If αʲ, j ∈ Cᵢ, is a root of g(X), then all the powers
αᵏ, k ∈ Cᵢ, are roots of g(X). Hence g(X) must contain all the terms (X − αʲ),
j ∈ Cᵢ, as factors, so that g(X) = fᵢ(X). In other words, fᵢ(X) is irreducible
over Zp.
Thus, the determination of the cyclotomic cosets yields a simple device to
factorize X^{p^n} − X into irreducible factors over Zp. We illustrate this fact by
an example.
Example 4.9
Factorize X^{2⁴} − X into irreducible factors over Z₂.
Let α be a primitive element of the field GF(2⁴). As a primitive
polynomial of degree 4 over Z₂ having α as a root, we can take (see
Example 4.8) X⁴ + X + 1.
The cyclotomic cosets modulo 2 w.r.t. GF(2⁴) and α are

C₀ = {0}
C₁ = {1, 2, 2² = 4, 2³ = 8} (Note: 2⁴ = 16 ≡ 1 (mod 15))
C₃ = {3, 6, 12, 9}
C₅ = {5, 10}
C₇ = {7, 14, 13, 11}.

The resulting factorization is

X^{2⁴} − X = X(X + 1)(X⁴ + X + 1)(X⁴ + X³ + X² + X + 1)(X² + X + 1)(X⁴ + X³ + 1). (4.9)
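Both the cosets and the factorization can be computed directly. The following Python sketch computes the cyclotomic cosets mod 15 and, assuming the sympy library is available, verifies Equation 4.9.

    from sympy import symbols, factor_list

    def cyclotomic_cosets(q, m):          # cosets {i, iq, iq^2, ...} mod m
        cosets, seen = [], set()
        for i in range(m):
            if i not in seen:
                c, j = [], i
                while j not in c:
                    c.append(j)
                    j = (j * q) % m
                cosets.append(c)
                seen.update(c)
        return cosets

    print(cyclotomic_cosets(2, 15))
    # [[0], [1, 2, 4, 8], [3, 6, 12, 9], [5, 10], [7, 14, 13, 11]]

    X = symbols('X')
    print(factor_list(X**16 - X, modulus=2))   # the factors of (4.9)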
The six factors on the right of Equation 4.9 are all irreducible over
Z₂. The minimal polynomials of α, α³ and α⁷ are all of degree 4 over
Z₂. However, while α and α⁷ are primitive elements of GF(2⁴) (so that
the polynomials X⁴ + X + 1 and X⁴ + X³ + 1 are primitive), α³ is not
(even though its minimal polynomial is also of degree 4).
4.15 Exercises
1. Construct the following fields: GF(2⁴), GF(2⁵) and GF(3²).
2. Show that GF(2⁵) has no subfield GF(2³).
3. Factorize X^{2³} + X and X^{2⁵} + X over Z₂.
4. Factorize X^{3²} − X over Z₃.
5. Using Theorem 4.8, prove Fermat's Little Theorem: for any prime p,
a^{p−1} ≡ 1 (mod p) if a ≢ 0 (mod p).
if n ≡ 2(mod 4) and n > 6, then there exists a pair of orthogonal Latin squares
of order n (see also [17,35]).
A set {L1 , . . . , Lt } of t Latin squares of order n on S is called a set of MOLS
(Mutually Orthogonal Latin Squares) if Lᵢ and Lⱼ are orthogonal whenever
i ≠ j. It is easy to see [17] that the number t of MOLS of order n is bounded
by n − 1. Further, any set of n − 1 MOLS of order n is known to be equivalent
to the existence of a finite projective plane of order n [17]. A long standing
conjecture is that if n is not a prime power, then there exists no complete set
of MOLS of order n.
We now show that if n is a prime power, there exists a set of n − 1 MOLS
of order n. (Equivalently, this implies that there exists a projective plane of
any prime power order, though we do not prove this here) (see [17] for more
details on finite projective planes).
Let a₀, a₁, . . . , a_{n−1} be the elements of the field F = GF(n). For
1 ≤ t ≤ n − 1, define the array Aₜ = (a^t_{ij}), 0 ≤ i, j ≤ n − 1, by

a^t_{ij} = a_t a_i + a_j

(here a^t_{ij} stands for the (i, j)-th entry of the matrix Aₜ). The
entries a^t_{ij} are all elements of the field F. We claim that each Aₜ is a Latin
square. Suppose, for instance, that two entries of some i-th row of Aₜ, say a^t_{ij} and
a^t_{il}, are equal. This implies that

a_t a_i + a_j = a_t a_i + a_l,

and hence a_j = a_l. Consequently, j = l. Thus, all the entries of the i-th row
of Aₜ are distinct. For a similar reason, no two entries of the same column of
Aₜ are equal. Hence Aₜ is a Latin square for each t.
We next claim that {A₁, . . . , A_{n−1}} is a set of MOLS. Suppose 1 ≤ r <
u ≤ n − 1. Then, Aᵣ and Aᵤ are orthogonal. For suppose that

(a^r_{ij}, a^u_{ij}) = (a^r_{i′j′}, a^u_{i′j′}). (4.10)

Then

a_r a_i + a_j = a_r a_{i′} + a_{j′}, and
a_u a_i + a_j = a_u a_{i′} + a_{j′}.

Subtraction gives

(a_r − a_u)a_i = (a_r − a_u)a_{i′},

and hence, as a_r ≠ a_u, a_i = a_{i′}. Consequently, i = i′ and j = j′. Thus, Aᵣ
and Aᵤ are orthogonal.
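For a prime n, the field F is just Zₙ, and the construction is a one-liner. The following Python sketch (with n = 5 as a sample; prime powers would require genuine GF(n) arithmetic) builds the n − 1 squares and checks pairwise orthogonality.

    n = 5                                      # any prime
    squares = [[[(t * i + j) % n for j in range(n)] for i in range(n)]
               for t in range(1, n)]

    def orthogonal(L1, L2):
        pairs = {(L1[i][j], L2[i][j]) for i in range(n) for j in range(n)}
        return len(pairs) == n * n             # all n^2 ordered pairs occur

    assert all(orthogonal(squares[r], squares[u])
               for r in range(n - 1) for u in range(r + 1, n - 1))
    print(len(squares), "mutually orthogonal Latin squares of order", n)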
Chapter 5
Introduction to Coding Theory
ASA subcommittee
5.1 Introduction
Coding theory has its origin in communication engineering. With Shan-
non’s seminal paper of 1948 [22], it has been greatly influenced by mathematics
with a variety of mathematical techniques to tackle its problems. Algebraic
coding theory uses a great deal of matrices, groups, rings, fields, vector spaces,
algebraic number theory and, not to speak of, algebraic geometry. In algebraic
coding, each message is regarded as a block of symbols taken from a finite
alphabet. On most occasions, this alphabet is Z2 = {0, 1}. Each message is
then a finite string of 0s and 1s. For example, 00110111 is a message. Usu-
ally, the messages get transmitted through a communication channel. It is
quite possible that such channels are subjected to noises, and consequently,
the messages get changed. The purpose of an error correcting code is to add
redundancy symbols to the message, based of course on some rule so that the
original message could be retrieved even though it is garbled. Each message
is also called a codeword and the set of codewords is a code.
Any communication channel looks as in Figure 5.1. The first box of the
channel indicates the message. It is then transmitted to the encoder, which
adds a certain number of redundancy symbols. In Figure 5.1, these redun-
dancy symbols are 001 which when added to the message 1101 give the coded
message 1101001. Because of channel noise, the coded message gets distorted
and the received message is 0101001. This message then enters the decoder.
The decoder applies the decoding algorithm and retrieves the coded message
using the added redundancy symbols. From this, the original message is read
off in the last box (see Figure 5.1). The decoder has thus corrected a single
error, that is, error in one place.
[Figure: the binary symmetric channel; each transmitted bit arrives intact
(0 → 0, 1 → 1) with probability q and is flipped (0 → 1, 1 → 0) with
probability p.]
If F has q elements, that is, F = GF(q), the [n, k]-code will have qᵏ
codewords. The codewords of C are all of length n, as they are n-tuples over F.
k is called the dimension of C. C is a binary code if F = Z₂.
A linear code C is best represented by any one of its generator matrices.
Clearly, all the three row vectors of G1 are linearly independent over Z2 . Hence
C1 has 23 = 8 codewords. The first three columns of G1 are linearly indepen-
dent over Z2 . Therefore, the first three positions of any codeword of C1 may
be taken as information positions, and the remaining two as redundancy posi-
tions. In fact, the positions corresponding to any three linearly independent
columns of G1 may be taken as information positions and the rest redundan-
cies. Now any word X of C1 is given by
X = x1 R1 + x2 R2 + x3 R3 , (5.1)
where x1 , x2 , x3 are all in Z2 and R1 , R2 , R3 are the three row vectors of G1
in order. Hence by Equation 5.1, X = (x1 , x2 , x3 , x1 + x2 , x1 + x3 ). If we take
X = (x1 , x2 , x3 , x4 , x5 ), we have the relations
x₄ = x₁ + x₂, and x₅ = x₁ + x₃. (5.2)
In other words, the first redundancy coordinate of any codeword is the sum
of the first two information coordinates of that word, while the next redun-
dancy coordinate is the sum of the first and third information coordinates.
Equations 5.2 are the parity-check equations of the code C1 . They can be
rewritten as
x₁ + x₂ − x₄ = 0, and x₁ + x₃ − x₅ = 0. (5.3)
In the binary case, Equations 5.3 become
x₁ + x₂ + x₄ = 0, and x₁ + x₃ + x₅ = 0. (5.4)
In other words, the vector X = (x1 , x2 , x3 , x4 , x5 ) ∈ C1 iff its coordinates
satisfy Equations 5.4. Equivalently, X ∈ C1 iff it is orthogonal to the two
vectors 11010 and 10101. If we take these two vectors as the row vectors of a
matrix H1 , then H1 is the 2 by 5 matrix:
H₁ = [ 1 1 0 1 0 ]
     [ 1 0 1 0 1 ].
H1 is called a parity-check matrix of the code C1 . The row vectors of H1 are
orthogonal to the row vectors of G1 . (Recall that two vectors X = (x1 , . . . , xn )
and Y = (y1 , . . . , yn ) of the same length n are orthogonal if their inner product
(= scalar product ⟨X, Y⟩ = x₁y₁ + · · · + xₙyₙ) is zero). Now if a vector v is
orthogonal to u1 , . . . , uk , then it is orthogonal to any linear combination of
u1 , . . . , uk . Hence the row vectors of H1 , which are orthogonal to the row
vectors of G1 , are orthogonal to all the vectors of the row space of G1 , that
is, to all the vectors of C1 . Thus,
C₁ = {X ∈ Z₂⁵ : H₁Xᵗ = 0} = null space of the matrix H₁,
where X t is the transpose of X. The orthogonality relations H1 X t = 0 give the
parity-check conditions for the code C1 . These conditions fix the redundancy
positions, given the message positions of any codeword. A similar result holds
good for any linear code. Thus, any linear code over a field F is either the row
space of one of its generator matrices or the null space of its corresponding
parity-check matrix.
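This dual description is easy to verify computationally for C₁. The following Python sketch takes G₁'s rows from Equation 5.1, that is, R₁ = 10011, R₂ = 01010, R₃ = 00101, and checks that the row space of G₁ equals the null space of H₁ over Z₂.

    from itertools import product

    G1 = [(1, 0, 0, 1, 1), (0, 1, 0, 1, 0), (0, 0, 1, 0, 1)]
    H1 = [(1, 1, 0, 1, 0), (1, 0, 1, 0, 1)]

    # all 2^3 = 8 codewords: linear combinations of the rows of G1
    C1 = {tuple(sum(x * g for x, g in zip(xs, col)) % 2
                for col in zip(*G1))
          for xs in product([0, 1], repeat=3)}

    # the null space of H1 inside Z2^5
    null = {v for v in product([0, 1], repeat=5)
            if all(sum(h * x for h, x in zip(row, v)) % 2 == 0
                   for row in H1)}

    assert C1 == null and len(C1) == 8
    print("C1 = row space of G1 = null space of H1")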
Note that if we form any k linear combinations of the generator matrix
of a linear code which are also linearly independent over the base field, the
resulting k words also form a basis for C. For instance, if
G = [ 1 0 1 1 ]
    [ 0 1 1 1 ]

is a generator matrix of a binary linear code C of length 4, the matrix

G′ = [ 1 0 1 1 ]
     [ 1 1 0 0 ]

is also a generator matrix of C. The reason is that every row vector of G′ is in C
and rank(G) = rank(G′) = 2.
So far, we have been considering binary linear codes. We now consider
linear codes over an arbitrary finite field F . As mentioned in Definition 5.2,
an [n, k] linear code C over F is a k-dimensional subspace of F n , the space of
all ordered n-tuples over F . If {u1 , . . . , uk } is a basis of C over F , every word
of C is a unique linear combination
α₁u₁ + · · · + αₖuₖ, αᵢ ∈ F for each i.

Since αᵢ can take q values for each i, 1 ≤ i ≤ k, C has q · q · · · q (k times) = qᵏ
codewords.
Let G be the k by n matrix over F having u1 , . . . , uk of F n as its row
vectors. Then, as G has k (= dimension of C) rows and all the k rows form a
linearly independent set over F , G is a generator matrix of C. Consequently,
C is the row space of G over F . The null space of C is the space of vectors
X ∈ F n which are orthogonal to all the words of C. In other words, it is the
dual space C ⊥ of C. As C is of dimension k over F , C ⊥ is of dimension n − k
over F . Let {X1 , . . . , Xn−k } be a basis of C ⊥ over F . If H is the matrix whose
row vectors are X1 , . . . , Xn−k , then H is a parity-check matrix of C. It is an
(n − k) by n matrix. Thus,
C = row space of G
  = null space of H
  = {X ∈ Fⁿ : HXᵗ = 0}.
Theorem 5.1. Let G = (Ik |A) be a generator matrix of a linear code C over
F , where Ik is the identity matrix of order k over F , and A is a k by (n − k)
matrix over F. Then, a generator matrix of C^⊥ is given by

H = [ −Aᵗ | I_{n−k} ]

over F.
Proof. Each row of H is orthogonal to all the rows of G, since (by block
multiplication, see Chapter 3, Section 3.3.1)

GHᵗ = [Iₖ | A] [ −A ]
               [ I_{n−k} ] = −A + A = 0.
Example 5.1
As an example, consider the binary code C2 with generator matrix
G₂ = [ 1 0 1 1 0 ]
     [ 0 1 1 0 1 ] = [I₂ | A].
Definition 5.5. Let X, Y ∈ F n . The distance d(X, Y ), also called the Ham-
ming distance between X and Y , is defined to be the number of places in which
X and Y differ. Accordingly, d(X, Y ) = wt(X − Y ).
d(X, Y ) = wt (X − Y ). (5.5)
Thus, for the linear code C2 of Example 5.1, the minimum distance is 3.
The function d(X, Y ) defined in Definition 5.5 does indeed define a distance
function (that is, a metric) on F n . That is to say, it has the following three
properties:
For all X, Y, Z in F n ,
We now give another interpretation for the minimum weight of a linear code
C over F_q.

Let us start by defining the [7, 4]-Hamming code H₃. The seven column vectors of
its parity-check matrix H of H3 are the binary representations of the numbers
1 to 7 written in such a way that the last three of its column vectors form I3 ,
the identity matrix of order 3. Thus,
H = [ 1 1 1 0 1 0 0 ]
    [ 1 1 0 1 0 1 0 ]
    [ 1 0 1 1 0 0 1 ].
The columns of H are the binary representations of the numbers 7, 6, 5, 3; 4,
2, 1, respectively (7 = 2² + 2¹ + 2⁰, etc.). As H is of the form [−Aᵗ | I₃], the
generator matrix of H₃ is given by
G = [I_{7−3} | A] = [I₄ | A] = [ 1 0 0 0 1 1 1 ]
                              [ 0 1 0 0 1 1 0 ]
                              [ 0 0 1 0 1 0 1 ]
                              [ 0 0 0 1 0 1 1 ].
H₃ is of length 2³ − 1 = 7, and dimension 4 = 7 − 3 = 2³ − 1 − 3. What is
the minimum distance of H₃? One way of finding it is to list all the 2⁴ − 1
non-zero codewords (see Theorem 5.2). However, a better way of determining
it is the following. The first row of G is of weight 4, while the remaining rows
are of weight 3. The sum of any two or three of these row vectors, as well as
the sum of all the four row vectors of G, is of weight at least 3. Hence the
minimum distance of H₃ is 3.
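For a code this small, exhaustive enumeration is also painless. The following Python sketch lists all 2⁴ − 1 non-zero codewords of H₃ from G and reports the minimum weight (= minimum distance, by Theorem 5.2).

    from itertools import product

    G = [(1, 0, 0, 0, 1, 1, 1),
         (0, 1, 0, 0, 1, 1, 0),
         (0, 0, 1, 0, 1, 0, 1),
         (0, 0, 0, 1, 0, 1, 1)]

    weights = []
    for m in product([0, 1], repeat=4):
        word = tuple(sum(x * g for x, g in zip(m, col)) % 2
                     for col in zip(*G))
        if any(word):
            weights.append(sum(word))
    print("minimum distance =", min(weights))   # 3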
Figure 5.3: A standard array for H₃; the first row is the code itself, and the first entry of each row is the coset leader.

0000000 1000111 0100110 0010101 0001011 1100001 1010010 1001100 0110011 0101101 0011010 0111000 1011001 1101010 1110100 1111111
1000000 0000111 1100110 1010101 1001011 0100001 0010010 0001100 1110011 1101101 1011010 1111000 0011001 0101010 0110100 0111111
0100000 1100111 0000110 0110101 0101011 1000001 1110010 1101100 0010011 0001101 0111010 0011000 1111001 1001010 1010100 1011111
0010000 1010111 0110110 0000101 0011011 1110001 1000010 1011100 0100011 0111101 0001010 0101000 1001001 1111010 1100100 1101111
0001000 1001111 0101110 0011101 0000011 1101001 1011010 1000100 0111011 0100101 0010010 0110000 1010001 1100010 1111100 1110111
0000100 1000011 0100010 0010001 0001111 1100101 1010110 1001000 0110111 0101001 0011110 0111100 1011101 1101110 1110000 1111011
0000010 1000101 0100100 0010111 0001001 1100011 1010000 1001110 0110001 0101111 0011000 0111010 1011011 1101000 1110110 1111101
0000001 1000110 0100111 0010100 0001010 1100000 1010011 1001101 0110010 0101100 0011011 0111001 1011000 1101011 1110101 1111110
the error vector. If we assume that v has one or no error, then e is of weight
1 or 0. Accordingly e is a coset leader of the standard array. Hence to get
u from v, we subtract e from v. In the binary case (as −e = e), u = v + e.
For instance, if in Figure 5.3, v = 1100110, then v is present in the second
coset for which the leader is e = 1000000. Hence the message is u = v + e =
0100110. This incidentally shows that H3 can correct single errors. However,
if for instance, u = 0100110 and v = 1000110, then e = 1100000 is of weight 2
and is not a coset leader of the standard array of Figure 5.3. In this case, the
standard array decoding of H3 will not work as it would wrongly decode v as
1000110 − 0000001 = 1000111 ∈ H3 . (Notice that v is present in the last row
of Figure 5.3). The error is due to the fact that v has two errors and not just
one. Standard array decoding is therefore maximum likelihood decoding.
The general Hamming code Hₘ is defined analogously to H₃. Its parity-check
matrix H has the binary representations of the numbers 1, 2, . . . , 2ᵐ − 1 as
its column vectors. Each such vector is a vector of length m. Hence H is an
m by 2ᵐ − 1 binary matrix and the dimension of Hₘ is (2ᵐ − 1) − m =
(number of columns in H) − (number of rows in H). In other words, Hₘ is a
[2ᵐ − 1, 2ᵐ − 1 − m] linear code over Z₂. Notice that H has rank m since H
contains Iₘ as a submatrix.
The minimum distance of Hₘ is 3, m ≥ 2. This can be seen as follows.
Recall that Hₘ = {X ∈ Z₂^{2ᵐ−1} : HXᵗ = 0}. Let i, j, k denote respectively
the numbers of the columns of H in which the m-tuples (that is, vectors of
length m) 0 · · · 011, 0 · · · 0101 and 0 · · · 0110 are present (Figure 5.4).
Let v be the binary vector of length 2m −1 which has 1 in the i-th, j-th and
k-th positions and zero at other positions. Clearly, v is orthogonal to all the
row vectors of H and hence belongs to Hm . Hence Hm has a word of weight
3. Further, Hm has no word of weight 2 or 1. Suppose Hm has a word u of
weight 2. Let i, j be the two positions where u has 1. As HuT = 0, by the
rule of matrix multiplication, Ci + Cj = 0, where Ci , Cj are the i-th and j-th
columns of H, respectively. This means that Ci = −Cj = Cj (the code being
binary). But this contradicts the fact that the columns of H are all distinct.
For a similar reason, Hₘ has no codeword of weight 1. Indeed, assume that
Hₘ has a codeword v of weight 1. Let v have 1 in its, say, k-th position and
0 elsewhere. As v ∈ Hₘ, Hvᵗ = 0, and this gives that the k-th column of H is
the zero vector, a contradiction.
[Figure 5.4: the parity-check matrix H, showing the m-tuples 0 · · · 011,
0 · · · 0101 and 0 · · · 0110 in its i-th, j-th and k-th columns, respectively.]
S(X, r) = {Y ∈ F n : d(X, Y ) ≤ r} ⊆ F n .
Proof. As d = 3 for Hₘ, ⌊(d − 1)/2⌋ = 1. Now apply Theorem 5.4.
Lemma 5.1. Let w denote the weight function of a binary code C. Then,
Theorem 5.6. Suppose d is odd. Then, a binary [n, k]-linear code with dis-
tance d exists iff a binary [n + 1, k]-linear code with distance d + 1 exists.
Theorem 5.7 shows that the syndromes of all the vectors of Fⁿ are deter-
mined by the syndromes of the coset leaders of the standard array of C. In
case C is an [n, k]-binary linear code, there are 2^{n−k} cosets and therefore the
number of distinct syndromes is 2^{n−k}. Hence, in contrast to standard-array
decoding, it is enough to store 2^{n−k} vectors (instead of 2ⁿ vectors) in
syndrome decoding. For instance, if C is a [100, 30]-binary linear code, it is
enough to store the 2⁷⁰ syndromes instead of the 2¹⁰⁰ vectors in Z₂¹⁰⁰, a huge
saving indeed.
Definition 5.9. Let C be any code, not necessarily linear. Then, C is t-error
detecting if, whenever any word of C incurs k errors, 1 ≤ k ≤ t, the resulting
word does not belong to C.
Theorem 5.9. The number of points that a (closed) sphere of radius t in the
space F_q^n of n-tuples over the Galois field F_q of q elements contains is

\binom{n}{0} + \binom{n}{1}(q − 1) + \binom{n}{2}(q − 1)² + · · · + \binom{n}{t}(q − 1)ᵗ.
Corollary 5.3. (Sphere-packing bound or Hamming bound.) An (n, M,
2t + 1)-linear code C over GF(q) satisfies the condition

M [ \binom{n}{0} + \binom{n}{1}(q − 1) + \binom{n}{2}(q − 1)² + · · · + \binom{n}{t}(q − 1)ᵗ ] ≤ qⁿ. (*)
Proof. As the minimum distance of C is d = 2t + 1, the closed spheres of radius
t = ⌊(d − 1)/2⌋ about the codewords are pairwise disjoint. The total number of
points of F_q^n that belong to these spheres is, by virtue of Theorem 5.9, the
expression on the LHS of (*). Certainly, this number cannot exceed the total
number of points of F_q^n, which is qⁿ. This proves the result.
Corollary 5.4. If the (n, M, 2t + 1)-code is binary, then the sphere-packing
bound is given by

M [ \binom{n}{0} + \binom{n}{1} + · · · + \binom{n}{t} ] ≤ 2ⁿ. (**)

Proof. Take q = 2 in Corollary 5.3.
Consider again the binary case (that is, when q = 2), and let n = 6. Then,
|Z₂⁶| = 2⁶ = 64. We ask: Can an (n, M, d) = (6, 9, 3) code exist in Z₂⁶ so that
the closed unit spheres with centers at the 9 codewords contain all the 64
vectors of Z₂⁶? The answer is "no", since the nine spheres with centers at each
of the nine codewords can contain at most M[\binom{6}{0} + \binom{6}{1}] = 9[1 + 6] = 63
vectors, and hence not all of the 64 vectors of Z₂⁶. This leads us to the concept
of a perfect code.
Definition 5.10. An (n, M, d)-code over the finite field F_q is called perfect if
the spheres of radius ⌊(d − 1)/2⌋ with centers at the M codewords cover
all the qⁿ vectors of length n over F_q.
In other words, an (n, M, d)-code is perfect iff equality is attained in (*)
(of Corollary 5.3).
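The two computations above are one-liners. The following Python sketch evaluates the sphere-packing bound (using math.comb for the binomial coefficients) for the (6, 9, 3) case and for the [7, 4] Hamming code H₃.

    from math import comb

    def sphere_points(n, t, q=2):
        return sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))

    print(9 * sphere_points(6, 1))    # 63 < 2^6 = 64: no perfect (6,9,3) code
    print(16 * sphere_points(7, 1))   # 128 = 2^7: H3 attains equality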
Our earlier results show that no binary (6, 9, 3) perfect code exists. (See
also Exercise 9 at the end of this section.) We now look at the general Hamming
code C over the field GF(q), whose parity-check matrix has m rows. The
column vectors of any parity-check matrix of C consist of non-zero vectors of
length m over GF(q). Now the subspace spanned by any such vector v over
GF(q) is the same as the subspace spanned by αv for any non-zero element α
of GF(q), that is, ⟨v⟩ = ⟨αv⟩ for 0 ≠ α ∈ GF(q). Hence we choose only one of
these q − 1 vectors as a column vector of H. As there are qᵐ − 1 non-zero
vectors of length m over GF(q), we have (qᵐ − 1)/(q − 1) distinct vectors of
length m, no two of which span the same subspace over GF(q). Hence the
number of columns of the parity-check matrix of this generalized Hamming
code is (qᵐ − 1)/(q − 1).
Note: When q = 2, the number of columns of the Hamming code is
(2ᵐ − 1)/(2 − 1) = 2ᵐ − 1, as seen earlier.
Another example: We now construct the Hamming code H₃ with m = 3
over the field GF(3). The elements of GF(3) are 0, 1, 2, and addition and
multiplication are taken modulo 3. The number of non-zero vectors of length
3 over GF(3) is 3³ − 1 = 26. But then, for any vector x of length 3 over
GF(3), 2x is also such a vector. Hence the number of distinct column vectors
of H = H₃ in which no two column vectors span the same space is 26/2 = 13.
Hence H is a 3 × 13 matrix of rank 3, and the code is therefore of dimension
13 − 3 = 10. This shows that H₃ is a (13, 3¹⁰, 3)-ternary perfect code. In fact,
the condition (*) reads

3¹⁰ [ \binom{13}{0} + \binom{13}{1}(3 − 1) ] = 3¹⁰(1 + 13 × 2) = 3¹⁰ · 3³ = 3¹³,

so equality is attained. We observe that the first non-zero entry in each column
is 1. This makes H unique except for the order of the columns. We may as well
replace any column vector v by 2v (where 0 ≠ 2 ∈ GF(3)).
5.12 Exercises
1. Show by means of an example that the syndrome of a vector depends
on the choice of the parity-check matrix.
2. (a) Find all the codewords of the binary code with generator matrix

[ 1 0 1 1 1 ]
[ 1 0 0 1 1 ].
(b) Find a parity-check matrix of the code.
(c) Write down the parity-check equations.
(d) Determine the minimum weight of the code.
3. Decode the received vector 1100011 in H₃ using (i) the standard array
decoding, and (ii) syndrome decoding.
4. How many vectors of Z₂⁷ are there in S(u, 3), where u ∈ Z₂⁷?
5. How many vectors of Fⁿ are there in S(u, 3), where u ∈ Fⁿ and |F| = q?
7. Show that the function d(X, Y ) defined in Section 5.4 is indeed a metric.
11. Show by means of an example that the generator matrix of a linear code
in Z₃⁷ (that is, a code of length 7 over the field Z₃) need not be unique.
12. Let C ⊆ Z₃³ be given by

C = {(x₁, x₂, x₃) ∈ Z₃³ : x₁ + x₂ + 2x₃ = 0}.

(a) List all the codewords of C.
(b) Give a generator matrix of C.
(c) Give another generator matrix of C (different from the one in (b)).
(d) Calculate the minimum distance d(C) of C.
(e) How many errors can C correct?
(f) How many errors can C detect?
(g) Give a parity-check matrix for C, using (c) above.
13. Show by means of an example that the coset leader of a coset of a linear
code need not be unique.
We close this chapter with a brief discussion on a special class of linear codes,
namely, cyclic codes.
It is more precise to say that we identify the codeword (a₀, a₁, . . . , a_{n−1}) with
the residue class of the polynomial

a₀ + a₁x + · · · + a_{n−1}x^{n−1} (mod xⁿ − 1). (5.7)

But then, as mentioned in Remark 3.1, we can identify the residue class with
the corresponding polynomial and do arithmetic modulo (xⁿ − 1). The identi-
fication given by Equation 5.7 makes xg(x) = x(a₀ + a₁x + · · · + a_{n−1}x^{n−1}) =
a₀x + a₁x² + a₂x³ + · · · + a_{n−1}xⁿ = a_{n−1} + a₀x + a₁x² + a₂x³ + · · · + a_{n−2}x^{n−1},
which corresponds to the cyclic shift (a_{n−1}, a₀, a₁, . . . , a_{n−2}) of (a₀, a₁, . . . , a_{n−1}).
Example 5.2
Let V be the space of all binary 3-tuples, identified with R₃ = Z₂[x]/(x³ − 1).
Then, the cyclic codes in V are the following:

Code  Codewords in C                      Corresponding polynomials in R₃
C₁    (0,0,0)                             0
C₂    (0,0,0), (1,1,1)                    0, 1 + x + x²
C₃    (0,0,0), (1,1,0), (0,1,1), (1,0,1)  0, 1 + x, x + x², 1 + x²
C₄    All of V                            All of R₃
Note: Not every linear code is cyclic. For instance, the binary code C =
{(0, 0, 0), (1, 0, 1)} is the code generated by (1, 0, 1), but it is not cyclic (as
(1, 1, 0) ∉ C).
To summarize, we have the following result.
To summarize, we have the following result.
Theorem 5.11. Let C be a cyclic code of length n over a field F , and let
Rn = F [x]/(xn − 1). Then, the following are true:
i. There exists a unique monic polynomial g(x) of least degree, say n − k
(≤ n − 1), in C.
G : [ a₀ a₁ a₂ . a_{n−k} 0 0 0 . 0 ]
    [ 0 a₀ a₁ a₂ . a_{n−k} 0 0 . 0 ]
    [ . . . . . . . . . . ]
    [ 0 0 0 0 a₀ a₁ . . a_{n−k} 0 ]
    [ 0 0 0 0 0 a₀ a₁ . . a_{n−k} ],

a k × n matrix whose rows are the successive cyclic shifts of the coefficient
vector of g(x) = a₀ + a₁x + · · · + a_{n−k}x^{n−k}. Consequently, dim C = k.
G = [ 1 0 1 1 0 0 0 ]
    [ 0 1 0 1 1 0 0 ]
    [ 0 0 1 0 1 1 0 ]
    [ 0 0 0 1 0 1 1 ].
Example 5.3
Binary cyclic codes of length 7.
If C = ⟨g(x)⟩ is a binary cyclic code of length 7, then g(x) ∈ Z₂[x] and
g(x) | (x⁷ − 1). Hence, to determine all binary cyclic codes of length
7, we first factorize x⁷ − 1 into irreducible factors over Z₂. In fact, we
have

x⁷ − 1 = (1 + x)(1 + x + x³)(1 + x² + x³).

As there are three irreducible factors on the right, there exist 2³ = 8
binary cyclic codes of length 7.
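The factorization is easy to reproduce with a computer algebra system. The following Python sketch assumes the sympy library is available.

    # Over Z2, x^7 - 1 = (1 + x)(1 + x + x^3)(1 + x^2 + x^3).
    from sympy import symbols, factor

    x = symbols('x')
    print(factor(x**7 - 1, modulus=2))
    # (x + 1)*(x**3 + x + 1)*(x**3 + x**2 + 1)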
Lemma 5.2. Let u(x) = a₀ + a₁x + · · · + a_{n−1}x^{n−1} and v(x) = b₀ + b₁x + · · · +
b_{n−1}x^{n−1} be two polynomials in Rₙ = F[x]/(xⁿ − 1). Then, u(x)v(x) = 0 in Rₙ
iff the vector u₁ = (a₀, a₁, . . . , a_{n−1}) is orthogonal to v₁ = (b_{n−1}, b_{n−2}, . . . , b₀)
and to all the cyclic shifts of v₁.
Proof.

u(x)v(x) = 0 in Rₙ
⇔ for each k, 0 ≤ k ≤ n − 1, the coefficient of xᵏ in u(x)v(x) (computed with
xⁿ = 1) is zero, that is,

a₀bₖ + a₁b_{k−1} + · · · + aₖb₀ + a_{k+1}b_{n−1} + a_{k+2}b_{n−2} + · · · + a_{n−1}b_{k+1} = 0

⇔ u₁ = (a₀, a₁, . . . , a_{n−1}) is orthogonal to (bₖ, b_{k−1}, . . . , b₀, b_{n−1}, . . . , b_{k+1})
for each k, 0 ≤ k ≤ n − 1
⇔ u₁ is orthogonal to v₁ = (b_{n−1}, b_{n−2}, . . . , b₀)
and to all the cyclic shifts of v₁.
G = [ a₀ a₁ a₂ . a_{n−k} 0 0 0 . 0 ]
    [ 0 a₀ a₁ a₂ . a_{n−k} 0 0 . 0 ]
    [ . . . . . . . . . . ]
    [ 0 0 0 0 a₀ a₁ . . a_{n−k} 0 ]
    [ 0 0 0 0 0 a₀ a₁ . . a_{n−k} ], and

H = [ hₖ h_{k−1} h_{k−2} . h₀ 0 0 0 . 0 ]
    [ 0 hₖ h_{k−1} . h₁ h₀ 0 0 . 0 ]
    [ . . . . . . . . . . ]
    [ . . . . . . . . . . ]
    [ 0 0 0 0 0 hₖ . . h₁ h₀ ].
5.15 Exercises
1. Let G be the generator matrix of a binary linear code C of length n and
dimension k. Let G′ be the matrix obtained by adding one more parity
check at the end of each row vector of G (that is, add 0 or 1 according
to whether the row vector is of even or odd weight). Let C′ be the code
generated by G′. Show that C′ is an even-weight code (that is, every
word of C′ is of even weight). Determine H′, a parity-check matrix of C′.
2. Let C₃ be the code of Example 5.2. Show that C₃ is cyclic. Determine
dim(C₃) and C₃^⊥.
3. Determine all the binary cyclic codes of length 8.
4. Determine the binary cyclic code of length n with generator polynomial
1 + x.
Benjamin Franklin
Statesman and Scientist
6.1 Introduction
To make a message secure, the sender usually sends the message in a
disguised form. The intended receiver removes the disguise and then reads
off the original message. The original message of the sender is the plaintext,
and the disguised message is the ciphertext. The plaintext and the ciphertext
are usually written in the same alphabet. The plaintext and the ciphertext
are divided, for the sake of computational convenience, into units of a fixed
length. The process of converting a plaintext to a ciphertext is known as
enciphering or encryption, and the reverse process is known as deciphering
or decryption. A message unit may consist of a single letter or any ordered
k-tuple, k ≥ 2. Each such unit is converted into a number in a suitable arith-
metic and the transformations are then carried out on this set of numbers. An
enciphering transformation f converts a plaintext message unit P (given by
its corresponding number) into a number that represents the corresponding
ciphertext message unit C while its inverse transformation, namely, the deci-
phering transformation just does the opposite by taking C to P . We assume
that there is a 1–1 correspondence between the set of all plaintext units P
and the set of all ciphertext units C. Hence each plaintext unit gives rise to a
unique ciphertext unit and vice versa. This can be represented symbolically by
P --f--> C --f⁻¹--> P.
Such a setup is known as a cryptosystem.
C ≡ 4P + 2 (mod 27²).

Further, as (4, 27) = 1, 4 has a unique inverse (mod 27²); in fact, 4⁻¹ = 547
(mod 729), as 4 · 547 ≡ 1 (mod 27²). (Indeed, if 4x ≡ 1 (mod 729), then
4(−x) ≡ −1 ≡ 728 (mod 729) ⇒ −x ≡ 182 (mod 729) ⇒ x ≡ −182 ≡ 547
(mod 729).) This, when substituted in the congruence (6.4), gives
C ≡ aP + b (mod 27),

we have

20 ≡ a · 26 + b (mod 27), and
21 ≡ a · 4 + b (mod 27).

Subtraction yields

22a ≡ −1 (mod 27). (6.5)

As (22, 27) = 1, (6.5) has a unique solution, namely, a = 11. This gives
b ≡ 21 − 4a = 21 − 44 = −23 ≡ 4 (mod 27). The cipher has thus been hacked.
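The modulus is small enough to solve by exhaustion. The following Python sketch recovers a and b from the two known plaintext-ciphertext correspondences above.

    # Solve 22a ≡ -1 (mod 27) by brute force, then b from 21 ≡ 4a + b.
    a = next(a for a in range(27) if (22 * a) % 27 == 26)  # 26 ≡ -1 (mod 27)
    b = (21 - 4 * a) % 27
    print(a, b)                                            # 11 4
    assert (a * 26 + b) % 27 == 20 and (a * 4 + b) % 27 == 21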
Here, a and b are the enciphering keys, and a′ and b′ are the deciphering keys.
Now it is known that in the English language, the most frequently occurring
ordered pairs, in decreasing order of their frequencies, are "E(space)" and
"S(space)". Symbolically,
The first equation of (6.9) gives the encryption, while the second gives the
decryption. Notice that A−1 must be taken in Z27 . For A−1 to exist, we must
have gcd(det A, 27) = 1. If this were not the case, we may have to try once
again ad hoc methods.
As an example, take A = ( 2 1 ; 4 3 ). Then, det A = 2, and gcd(det A, 27) =
gcd(2, 27) = 1. Hence 2⁻¹ exists; in fact, 2⁻¹ = 14 ∈ Z₂₇. This gives (recall
A⁻¹ = (1/det A)(adj A), where adj A = the adjugate of the matrix A = (B_{ij});
here B_{ij} = (−1)^{i+j}A_{ji}, where A_{ji} is the cofactor of a_{ji} in A = (a_{ij}); see
Chapter 3). Hence

A⁻¹ = 14 ( 3 −1 ; −4 2 ) = ( 42 −14 ; −56 28 ) = ( 15 13 ; 25 1 ) over Z₂₇. (6.10)
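The adjugate computation is routine to automate. The following Python sketch (the helper name is illustrative; pow(det, -1, m) requires Python 3.8 or later) reproduces Equation 6.10.

    def inv_mod_2x2(A, m):
        a, b, c, d = A[0][0], A[0][1], A[1][0], A[1][1]
        det = (a * d - b * c) % m
        det_inv = pow(det, -1, m)      # fails if gcd(det, m) != 1
        adj = [[d, -b], [-c, a]]
        return [[(det_inv * e) % m for e in row] for row in adj]

    print(inv_mod_2x2([[2, 1], [4, 3]], 27))   # [[15, 13], [25, 1]]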
6.4 Exercises
1. Find the inverse of A = ( 17 5 ; 8 7 ) in Z₂₇.

2. Find the inverse of A = ( 12 3 ; 5 17 ) in Z₂₉.
3. Encipher the word “MATH” using the matrix A of Exercise 1 above as
the enciphering matrix in the alphabet A to Z of size 26. Check your
result by deciphering your ciphertext.
x − y = 4 (mod 26)
7x − 4y = 10 (mod 26).
AMGQTZAFJVMHQV
Suppose we know by some means that the last four letters of the plain-
text are our adversary’s signature “MIKE.” Determine the full plain-
text.
plaintext, namely,
[0][1][0] [13] [3] [14] [13],
and the addition modulo 26 of the numerical equivalence of “XYZ,” namely,
[23] [24] [25] of the key. This yields
as the ciphertext.
C ≡M +K (mod 26)
Notwithstanding the fact that the key K is as long as the message M , the
system has its own drawbacks.
Despite these drawbacks, this cryptosystem was said to be used in some high-
est levels of communication such as the Washington-Moscow hotline.
There are several other private key cryptosystems. The interested reader
can have a look at these in public domains.
In 1976, the face of cryptography got altered radically with the invention
of public key cryptography by Diffie and Hellman [36]. In this cryptosystem,
the encryption can be done by anyone. But the decryption can be done only
by the intended recipient who alone is in possession of the secret key.
At the heart of this cryptography is the concept of a “one-way func-
tion.” Roughly speaking, a one-way function is a 1–1 function f which is such
that whenever k is given, it is possible to compute f (k) “rapidly” while it is
“extremely difficult” to compute the inverse of f in a “reasonable” amount of
time. There is no way of asserting that such and such a function is a one-way
function since the computations depend on the technology of the day—the
hardware and the software. So what passes for a one-way function today may
fail to be a one-way function a few years later.
As an example of a one-way function, consider two large primes p and q
each having at least 500 digits. Then, it is “easy” to compute their product
n = pq. However, given n, there is no efficient factoring algorithm known to date
that would give p and q in a reasonable amount of time. The same problem
of forming the product pq with p and q having 100 digits had passed for a
one-way function in the 1980s but is no longer so today.
(PA ◦ SA ) M = M = (SA ◦ PA ) M.
also authenticates A’s digital signature. This is in fact the method adopted in
credit cards.
We now describe two public key cryptosystems. The first is RSA, after their
inventors, Rivest, Shamir and Adleman. In fact, Diffie and Hellman, though
they invented public key cryptography in 1976, did not give the procedure to
implement it. Only Rivest, Shamir and Adleman did it in 1978, two years later.
ii. Each A chooses a small positive integer e, 1 < e < φ(n), such that
(e, φ(n)) = 1, where the Euler function φ(n) = φ(pq) = φ(p)φ(q) =
(p − 1)(q − 1). (e is odd as φ(n) is even).
Theorem 6.1 (Correctness of RSA). Equations 6.11 and 6.12 are indeed
inverse transformations.
Proof. We have to show that M^{ed} ≡ M (mod n). Now

ed ≡ 1 (mod φ(n)),

so ed = 1 + k(p − 1)(q − 1) for some non-negative integer k, and hence

M^{ed} = M^{1+k(p−1)(q−1)} = M · M^{k(p−1)(q−1)}.

If (M, p) = 1, then by Fermat's Little Theorem,

M^{p−1} ≡ 1 (mod p),

and therefore M^{k(p−1)(q−1)} ≡ 1 (mod p), so that M^{ed} ≡ M (mod p).
If, however, (M, p) ≠ 1, then (as p is a prime) (M, p) = p, and trivially (as p
is a divisor of M)

M^{ed} ≡ M ≡ 0 (mod p).

Hence, in both the cases, M^{ed} ≡ M (mod p). Similarly, M^{ed} ≡ M (mod q).
As p and q are distinct primes,

M^{ed} ≡ M (mod pq),

so that M^{ed} ≡ M (mod n).
The above description shows that if Bob wants to send the message M to Alice, he will send it as M^e (mod n) using the public key of Alice. To decipher the message, Alice will raise this number to the power d and get M^{ed} ≡ M (mod n), the original message of Bob.
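As an illustration only, the following C sketch carries out both transformations with toy parameters p = 61, q = 53, n = 3233, e = 17, d = 2753 (our own small choices, not values from the text; actual RSA moduli run to hundreds of digits and require multiprecision arithmetic):

#include <stdio.h>

/* Repeated squaring: computes b^e mod n in O(log e) multiplications. */
long long modpow(long long b, long long e, long long n)
{
    long long r = 1;
    b %= n;
    while (e > 0) {
        if (e & 1)
            r = (r * b) % n;
        b = (b * b) % n;
        e >>= 1;
    }
    return r;
}

int main(void)
{
    /* n = 61*53 = 3233, phi(n) = 3120, e = 17, d = 2753,
       since ed = 46801 = 15*3120 + 1. */
    long long n = 3233, e = 17, d = 2753, M = 65;
    long long C = modpow(M, e, n);   /* Bob: C = M^e mod n       */
    long long P = modpow(C, d, n);   /* Alice: P = C^d mod n = M */
    printf("M = %lld, C = %lld, recovered = %lld\n", M, C, P);
    return 0;
}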
The security of RSA rests on the supposition that none other than Alice
can determine the private key d of Alice. A person can compute d if he/she
knows φ(n) = (p − 1)(q − 1) = n − (p + q) + 1, that is to say, if he/she knows
the sum p + q. For this, he should know the factors p and q of n. Thus, in
essence, the security of RSA is based on the assumption that factoring a large
number n that is a product of two distinct primes is “difficult.” However, to
quote Koblitz [30], “no one can say with certainty that breaking RSA requires
factoring n. In fact, there is even some indirect evidence that breaking RSA
cryptosystem might not be quite as hard as factoring n. RSA is the public
key cryptosystem that has had by far the most commercial success. But,
increasingly, it is being challenged by elliptic curve cryptography.”
As per the definition, log_b y may or may not exist. However, if we take G = F_q^*, the group of non-zero elements of a finite field F_q of q elements, and g, a generator of the cyclic group F_q^* (see [25]), then for any y ∈ F_q^*, the discrete logarithm log_g y exists.
Example 6.1
5 is a generator of F_17^*. In F_17^*, the discrete logarithm of 12 with respect to base 5 is 9. In symbols: log_5 12 = 9. In fact, in F_17^*,
5^1 = 5, 5^2 = 8, 5^3 = 6, 5^4 = 13, 5^5 = 14, 5^6 = 2, 5^7 = 10, 5^8 = 16 = −1,
5^9 = 12, 5^10 = 9, 5^11 = 11, 5^12 = 4, 5^13 = 3, 5^14 = 15, 5^15 = 7, 5^16 = 1.
This logarithm is called “discrete logarithm,” as it is taken in a finite
group.
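The table of powers is easy to reproduce mechanically; the following small C loop (our own illustration) lists 5^k mod 17 for k = 1, . . . , 16 and thereby exhibits every discrete logarithm to base 5 in F_17^*:

#include <stdio.h>

int main(void)
{
    int p = 17, g = 5, x = 1;
    /* x runs through all of F_17^*, confirming that 5 is a generator;
       for each printed pair, k = log_5 x. */
    for (int k = 1; k <= p - 1; k++) {
        x = (x * g) % p;
        printf("5^%d = %d\n", k, x);
    }
    return 0;
}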
Note that for any given a, a^{n−1} (mod n) can be computed in polynomial time using the repeated squaring method [25]. However, the converse of FLT is not true. This is because of the presence of Carmichael numbers. A Carmichael number is a composite number n satisfying (6.16) for each a prime to n. They are sparse but are infinitely many. The first few Carmichael numbers are 561, 1105, 1729 (the famous Ramanujan taxicab number).
Since we are interested in checking if a given large number n is prime or not, n is certainly odd and hence (2, n) = 1. Consequently, if 2^{n−1} ≢ 1 (mod n), we can conclude with certainty, in view of FLT, that n is composite. However, if 2^{n−1} ≡ 1 (mod n), n may be a prime or not. If n is not a prime but 2^{n−1} ≡ 1 (mod n), then n is called a pseudo-prime with respect to base 2.
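A minimal C sketch of this base-2 Fermat test follows (assuming n is small enough for 64-bit arithmetic); note that it is fooled by the Carmichael number 561 = 3 · 11 · 17:

#include <stdio.h>

/* Repeated squaring: b^e mod n. */
long long modpow(long long b, long long e, long long n)
{
    long long r = 1;
    b %= n;
    while (e > 0) {
        if (e & 1)
            r = (r * b) % n;
        b = (b * b) % n;
        e >>= 1;
    }
    return r;
}

int main(void)
{
    long long n = 561;   /* composite, yet 2^560 = 1 (mod 561) */
    if (modpow(2, n - 1, n) != 1)
        printf("%lld is certainly composite\n", n);
    else
        printf("%lld is prime or a base-2 pseudo-prime\n", n);
    return 0;
}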
i. n is composite, and
More generally, we write a polynomial p(x) = Σ_{i=0}^{n} p_i x^i of degree n as
p(x) = (· · · ((p_n x + p_{n−1})x + p_{n−2})x + · · · + p_1)x + p_0.
Note that there are (n − 1) opening parentheses “(” and (n − 1) closing paren-
theses “)” . Let us write a pseudo-code for Horner’s method:
Input: A polynomial p(x) = Σ_{i=0}^{n} p_i x^i of degree n and a constant c.
Output: The value of p(c).
The algorithm is given in Table 6.2.
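A C rendering of this loop (a sketch only; the pseudo-code of Table 6.2 may differ in presentation) is:

#include <stdio.h>

/* Horner's method: evaluates p(c) for p(x) = p[0] + p[1]x + ... + p[n]x^n
   using exactly n multiplications and n additions. */
double horner(const double p[], int n, double c)
{
    double v = p[n];
    for (int i = n - 1; i >= 0; i--)
        v = v * c + p[i];
    return v;
}

int main(void)
{
    double p[] = { 1.0, -3.0, 0.0, 2.0 };      /* 2x^3 - 3x + 1 */
    printf("p(2) = %g\n", horner(p, 3, 2.0));  /* 16 - 6 + 1 = 11 */
    return 0;
}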
Clearly, in the loop of the algorithm of Table 6.2, there are n additions and n multiplications.
The Miller–Rabin algorithm is based on the following result in number theory.
x^2 ≡ 1 (mod n)
Theorem 6.3. For an odd composite number n, the number of certificates for the proof of compositeness of n is ≥ (n − 1)/2.
Proof. We have
(X + a)^n = X^n + Σ_{i=1}^{n−1} C(n, i) X^{n−i} a^i + a^n.
If n is prime, each binomial coefficient C(n, i), 1 ≤ i ≤ n − 1, is divisible by n. Further, as (a, n) = 1, by Fermat's Little Theorem, a^n ≡ a (mod n). This establishes (6.17), as we have (X + a)^n ≡ X^n + a^n (mod n) ≡ X^n + a (mod n).
We have
C(n, q) = n(n − 1) · · · (n − q + 1) / (1 · 2 · · · q).
Then, q^k ∤ C(n, q). For if q^k | C(n, q), then, as q^k || n, (n − 1) · · · (n − q + 1) must be divisible by q, a contradiction. (As q is a prime and q | (n − 1)(n − 2) . . . (n − q + 1), q must divide at least one of the factors, which is impossible, since each term of (n − 1)(n − 2) . . . (n − q + 1) is of the form n − (q − k), 1 ≤ k ≤ q − 1, and, as q | n, n − (q − k) ≡ k (mod q).) Hence q^k ∤ C(n, q), and therefore n does not divide the coefficient of the term C(n, q) X^{n−q} a^q. This shows that (X + a)^n − (X^n + a) is not identically zero over Z_n (note that if an integer divides a polynomial with integer coefficients, it must divide each coefficient of the polynomial).
The above identity suggests a simple test for primality: given input n, choose an a and test whether the congruence (6.17) is satisfied. However, this takes time Ω(n), because we need to evaluate n coefficients on the LHS of congruence (6.17) in the worst case. A simple way to reduce the number of coefficients is to evaluate both sides of Equation 6.17 modulo a polynomial of the form X^r − 1 for an appropriately chosen small r. In other words, test if the following equation is satisfied:
(X + a)^n ≡ X^n + a (mod X^r − 1, n).    (6.18)
From Lemma 6.1, it is immediate that all primes n satisfy Equation 6.18 for all values of a and r. The problem now is that some composites n may also satisfy Equation 6.18 for a few values of a and r (and indeed they do). However, we can almost restore the characterization: we show that, for an appropriately chosen r, if Equation 6.18 is satisfied for several a's, then n must be a prime power. It turns out that the number of such a's and the appropriate r are both bounded by a polynomial in log n, and this yields a deterministic polynomial time algorithm for testing primality.
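To make the test in (6.18) concrete, here is a naive C sketch (our own illustration with a fixed small r; a real implementation would choose r and the number of a's as in the algorithm, and the quadratic-time polynomial product below would be the main cost):

#include <stdio.h>
#include <string.h>

#define R 64                       /* an assumed small value of r */
typedef unsigned long long u64;

/* c(X) = a(X)b(X) mod (X^R - 1, n): exponents wrap around modulo R.
   Coefficients stay below n; requires n < 2^32 to avoid overflow. */
void polymul(const u64 a[], const u64 b[], u64 c[], u64 n)
{
    u64 t[R] = {0};
    for (int i = 0; i < R; i++)
        for (int j = 0; j < R; j++)
            t[(i + j) % R] = (t[(i + j) % R] + a[i] * b[j]) % n;
    memcpy(c, t, sizeof t);
}

/* Tests whether (X + a)^n = X^n + a (mod X^R - 1, n). */
int congruence_holds(u64 n, u64 a)
{
    u64 base[R] = {0}, acc[R] = {0};
    base[0] = a % n; base[1] = 1;           /* the polynomial X + a */
    acc[0] = 1;                             /* the constant 1       */
    for (u64 e = n; e > 0; e >>= 1) {       /* repeated squaring    */
        if (e & 1) polymul(acc, base, acc, n);
        polymul(base, base, base, n);
    }
    acc[n % R] = (acc[n % R] + n - 1) % n;  /* subtract X^(n mod R) */
    acc[0] = (acc[0] + n - a % n) % n;      /* subtract a           */
    for (int i = 0; i < R; i++)
        if (acc[i] != 0) return 0;
    return 1;
}

int main(void)
{
    printf("%d\n", congruence_holds(97, 2));   /* 1, since 97 is prime */
    return 0;
}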
Lemma 6.2. Let LCM(m) denote the lcm of the first m natural numbers. Then, for m ≥ 9, LCM(m) ≥ 2^m.
Proof. If n is prime, we have to show that AKS will not return COMPOSITE in steps 1, 3 and 5. Certainly, the algorithm will not return COMPOSITE in step 1 (as no prime n is expressible as a^b, b > 1). Also, if n is prime, there exists no a such that 1 < gcd(a, n) < n, so that the algorithm will not return COMPOSITE in step 3. By Lemma 6.1, the 'For loop' in step 5 cannot return COMPOSITE either.
We now consider the steps when the algorithm returns PRIME, namely,
steps 4 and 6. Suppose the algorithm returns PRIME in step 4. Then, n
must be prime. If n were composite, n = n1 n2 , where 1 < n1 , n2 < n.
Then, as n ≤ r, if we take a = n1 , we have a ≤ r. So in step 3, we would
have had 1 < (a, n) = a < n, a ≤ r. Hence the algorithm would have
output COMPOSITE in step 3 itself. Thus, we are left with only one case, namely, the case in which the algorithm returns PRIME in step 6. For the
purpose of subsequent analysis, we assume this to be the case.
The algorithm has two main steps (namely, 2 and 5). Step 2 finds an
appropriate r and step 5 verifies Equation 6.18 for a number of a’s. We first
bound the magnitude of r.
Lemma 6.4. There exists an r ≤ 16 log^5 n + 1 such that O_r(n) > 4 log^2 n.
Proof. Let r_1, . . . , r_t be all the numbers such that O_{r_i}(n) ≤ 4 log^2 n for each i, and therefore r_i divides α_i = (n^{O_{r_i}(n)} − 1) for each i (recall that if O_{r_i}(n) = k_i, then n^{k_i} − 1 is divisible by r_i). Now for each i, α_i divides the product
P = Π_{i=1}^{4 log^2 n} (n^i − 1).
We now use the fact that Π_{i=1}^{t} (n^i − 1) < n^{t^2}, the proof of which follows readily by induction on t. Hence,
P < n^{16 log^4 n} = (2^{log n})^{16 log^4 n} = 2^{16 log^5 n}.
On the other hand, by Lemma 6.2,
lcm(1, 2, . . . , ⌈16 log^5 n⌉) ≥ 2^{⌈16 log^5 n⌉} ≥ 2^{16 log^5 n} > P.
Hence there must exist a number r in {1, 2, . . . , ⌈16 log^5 n⌉}, that is, r ≤ 16 log^5 n + 1, that does not divide P; for such an r, O_r(n) > 4 log^2 n.
Step 5 of the algorithm verifies l equations. Since the algorithm does not
output COMPOSITE in this step (recall that we are now examining step 6),
we have
(X + a)^n ≡ X^n + a (mod X^r − 1, n)
for every a, 1 ≤ a ≤ l. This implies that
(X + a)^n ≡ X^n + a (mod X^r − 1, p)    (6.19)
and, since p is a prime, Lemma 6.1 gives
(X + a)^p ≡ X^p + a (mod X^r − 1, p).    (6.20)
We say that a number m is introspective for a polynomial f(X) if
[f(X)]^m ≡ f(X^m) (mod X^r − 1, p).
It is clear from Equations 6.19 and 6.20 that both n and p are introspective
for X + a, 1 ≤ a ≤ l. Our next lemma shows that introspective numbers are closed under multiplication.
Lemma 6.5. If m and m′ are introspective for f(X), then so is the product mm′.
Next we show that for a given number m, the set of polynomials for which
m is introspective is closed under multiplication.
Lemma 6.6. If m is introspective for both f (X) and g(X), then it is also
introspective for the product f (X)g(X).
Equations 6.19 and 6.20 together imply that both n and p are intro-
spective for (X + a). Hence by Lemmas 6.5 and 6.6, every number in the
set I = {n^i p^j : i, j ≥ 0} is introspective for every polynomial in the set P = {Π_{a=1}^{l} (X + a)^{e_a} : e_a ≥ 0}. We now define two groups based on the
sets I and P that will play a crucial role in the proof.
The first group consists of the set G of all residues of numbers in I modulo r. Since both n and p are prime to r, so is any number in I. Hence G ⊂ Z_r^*, the multiplicative group of residues modulo r that are relatively prime to r. It is easy to check that G is a group. The only thing that requires verification is that n^i p^j has a multiplicative inverse in G. Since n^{O_r(n)} ≡ 1 (mod r), there exists i′, 0 ≤ i′ < O_r(n), such that n^i ≡ n^{i′} (mod r). Hence the inverse of n^i (= n^{i′}) is n^{O_r(n)−i′}. A similar argument applies for p, as p^{O_r(p)} ≡ 1 (mod r). Let |G| = the order of the group G = t (say). As G is generated by n and p modulo r and since O_r(n) > 4 log^2 n, t > 4 log^2 n. (Recall that all our logarithms are w.r.t. base 2.)
To define the second group, we need some basic facts about cyclotomic polynomials over finite fields. Let Q_r(X) be the r-th cyclotomic polynomial over the field F_p [28]. Then, Q_r(X) divides X^r − 1 and factors into irreducible factors of the same degree d = O_r(p). Let h(X) be one such irreducible factor of degree d. Then, F = F_p[X]/(h(X)) is a field. The second group that we want to consider is the group generated by X + 1, X + 2, . . . , X + l in the multiplicative group F^* of non-zero elements of the field F. Hence it consists of simply the residues of polynomials in P modulo h(X) and p. Denote this group by G′.
We claim that the order of G′ is exponential in either t = |G| or l.
Lemma 6.7. |G′| ≥ min{2^t − 1, 2^l}.
Proof. First note that h(X) | Q_r(X) and Q_r(X) | (X^r − 1). Hence X may be taken as a primitive r-th root of unity in F = F_p[X]/(h(X)).
We claim:
(*) if f(X) and g(X) are polynomials of degree less than t and f(X) ≠ g(X) in P, then their images in F (got by reducing the coefficients modulo p and then taking modulo h(X)) are distinct.
To see this, assume that f (X) = g(X) in the field F (that is, the images of
f (X) and g(X) in the field F are the same). Let m ∈ I. Recall that every
number of I is introspective with respect to every polynomial in P . Hence m
is introspective with respect to both f (X) and g(X). This means that
Finally, we show that if n is not a prime power, then |G′| is bounded above by a function of t.
Lemma 6.8. If n is not a prime power, |G′| ≤ (1/2) n^{2√t}.
Proof. Set Î = {n^i · p^j : 0 ≤ i, j ≤ ⌊√t⌋}. If n is not a prime power (recall that p | n), the number of terms in Î is (⌊√t⌋ + 1)^2 > t. When reduced mod r, the elements of Î give elements of G. But |G| = t. Hence there exist at least two distinct numbers in Î which become equal when reduced modulo r. Let them be m_1, m_2 with m_1 > m_2. So we have (since r divides (m_1 − m_2)),
X^{m_1} ≡ X^{m_2} (mod X^r − 1).    (6.23)
Let f(X) ∈ P. Then, since m_1 and m_2 are introspective for f(X),
[f(X)]^{m_1} ≡ f(X^{m_1}) (mod X^r − 1, p)
≡ f(X^{m_2}) (mod X^r − 1, p)    (by (6.23))
≡ [f(X)]^{m_2} (mod X^r − 1, p)
≡ [f(X)]^{m_2} (mod h(X), p)    (since h(X) | (X^r − 1)).
This implies that [f(X)]^{m_1} = [f(X)]^{m_2} in the field F.    (6.24)
Now f(X), when reduced modulo (h(X), p), yields an element of G′. Thus, every polynomial of G′ is a root of the polynomial
Q_1(Y) = Y^{m_1} − Y^{m_2} over F.
Thus, there are at least |G′| distinct roots in F. Naturally, |G′| ≤ degree of Q_1(Y). Now the degree of Q_1(Y) is m_1 ≤ (np)^{⌊√t⌋} ≤ (n · n/2)^{⌊√t⌋} ≤ (1/2) n^{2√t}, as p ≤ n/2. This proves the lemma.
Lemma 6.7 gives a lower bound for |G′|, while Lemma 6.8 gives an upper bound. Armed with these estimates on the size of G′, we are now ready to prove the correctness of the algorithm.
Lemma 6.9. If the algorithm returns PRIME in step 6, then n is a prime.
Proof. We have
|G′| ≥ min{2^t − 1, 2^l}
≥ min{2^t − 1, 2^{2√φ(r)·log n − 1}}
= min{2^t − 1, (1/2) n^{2√φ(r)}}
≥ min{2^t − 1, (1/2) n^{2√t}}    (since t divides φ(r))
≥ min{2^{2√t·log n}, (1/2) n^{2√t}}    (since t > (2 log n)^2)
≥ (1/2) n^{2√t}.
By Lemma 6.8, |G′| < (1/2) n^{2√t} if n is not a power of p. Therefore, n = p^k for some k > 0. If k > 1, then the algorithm would have returned COMPOSITE in step 1. Therefore, n = p, a prime. This completes the proof of Lemma 6.9 and hence of Theorem 6.4.
It is straightforward to calculate the time complexity of the algorithm. In these calculations, we use the fact that addition, multiplication, and division operations between two m-bit numbers can be performed in time Õ(m). Similarly, these operations on two degree d polynomials with coefficients at most m bits in size can be done in Õ(d · m) steps.
Theorem 6.5. The asymptotic time complexity of the algorithm is Õ(log^{10.5} n).
Proof. The first step of the algorithm takes asymptotic time Õ(log^3 n).
In step 2, we find an r with O_r(n) > 4 log^2 n. This can be done by trying out successive values of r and testing if n^k ≢ 1 (mod r) for every k ≤ 4 log^2 n. For a particular r, this will involve at most O(log^2 n) multiplications modulo r and so will take time Õ(log^2 n log r). By Lemma 6.4, we know that only O(log^5 n) different r's need to be tried. Thus, the total time complexity of step 2 is Õ(log^7 n).
The third step involves computing the gcd of r numbers. Each gcd computation takes time O(log n), and therefore, the time complexity of this step is O(r log n) = O(log^6 n). The time complexity of step 4 is just O(log n).
In step 5, we need to verify ⌊2√φ(r)·log n⌋ equations. Each equation requires O(log n) multiplications of degree r polynomials with coefficients of size O(log n). So each equation can be verified in time Õ(r log^2 n). Thus, the time complexity of step 5 is Õ(r√φ(r) log^3 n) = Õ(r^{3/2} log^3 n) = Õ(log^{10.5} n). This time complexity dominates all the rest and is therefore the time complexity of the algorithm.
Step 3: gcd(a, 271) = 1 for all a ≤ 270, and gcd(a, 271) is either 1 or 271 for all a ≥ 271. Hence, step 3 does not give anything decisive.
Step 4: n = 271 and r = 269, so n > r. Hence, step 4 does not give
anything decisive.
Step 5: φ(r) = 268, and ⌊2√φ(r) · log n⌋ = 264. For all a, where 1 ≤ a ≤ 264,
(X + a)^271 ≡ X^271 + a (mod X^269 − 1, 271).
Step 6: The algorithm outputs that 271 is prime.
Appendix A: Answers to
Chapter 1—Graph Algorithms I
Exercises 1.13
Exercise 2: We suppose the graph has already been represented by n linked lists L[i] of nodes, for i = 1, 2, . . . , n, where we assume the declaration struct node { int v; struct node *succ; };. L[i] is the pointer to the list of all successors of the vertex i in the graph. Note that a node consists of two fields: v, of type integer, representing a vertex, and succ, a pointer to the next node in the linked list. Here, n is the number of vertices. When we delete the vertex k, we delete not only the vertex k but also the arcs entering and leaving k. The removed vertex k will point to an artificial node whose "v" field is −1, which cannot be a vertex. We now write a fragment of a program in C; the reader is asked to write a complete program based on this fragment and execute it on some examples.
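One possible shape for such a fragment (a sketch assuming the lists L[1..n] and the node declaration above) is:

#include <stdlib.h>

struct node { int v; struct node *succ; };

/* Delete vertex k: free its successor list, replace it by an artificial
   node with v = -1, and unlink every arc entering k. */
void delete_vertex(struct node *L[], int n, int k)
{
    struct node *p, *q;
    for (p = L[k]; p != NULL; p = q) {      /* arcs leaving k */
        q = p->succ;
        free(p);
    }
    L[k] = malloc(sizeof(struct node));     /* artificial node */
    L[k]->v = -1;
    L[k]->succ = NULL;
    for (int i = 1; i <= n; i++) {          /* arcs entering k */
        if (i == k) continue;
        struct node **pp = &L[i];
        while (*pp != NULL) {
            if ((*pp)->v == k) {
                q = *pp; *pp = q->succ; free(q);
            } else
                pp = &(*pp)->succ;
        }
    }
}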
Complexity: O(n + m)
Exercise 4: Note that when we remove an arc (directed edge) (i, j), the vertices i and j remain in the graph. Removing the arc (i, j) means that we remove the node containing the field j from the linked list L[i]. We now give a fragment of a program in C; the reader is asked to write a complete program and execute it on some examples.
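A sketch of this fragment (same declarations as above) is:

#include <stdlib.h>

struct node { int v; struct node *succ; };

/* Remove the arc (i, j): unlink the node carrying j from the list L[i]. */
void delete_arc(struct node *L[], int i, int j)
{
    struct node **pp = &L[i];
    while (*pp != NULL) {
        if ((*pp)->v == j) {
            struct node *q = *pp;
            *pp = q->succ;      /* bypass the node and free it */
            free(q);
            return;             /* at most one such node in a 1-graph */
        }
        pp = &(*pp)->succ;
    }
}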
Complexity: O(n)
Exercise 6: True. Suppose a spanning tree T of G does not contain a bridge e = xy of G. Since T is a spanning connected subgraph, it contains the vertices x and y. As e is a bridge, every path between x and y in G uses the edge e; since this edge is not in T, the vertices x and y are not connected by any path in T, which is impossible.
Exercise 8: The reader is asked to draw the tree obtained by Table A.1.
The distance matrices (rows and columns indexed by the vertices 1, . . . , 5) are:
M0 = [ 0 5 ∞ ∞ 5 ; ∞ 0 7 ∞ ∞ ; ∞ ∞ 0 3 ∞ ; ∞ ∞ ∞ 0 ∞ ; 6 ∞ 6 10 0 ],
M1 = [ 0 5 ∞ ∞ 5 ; ∞ 0 7 ∞ ∞ ; ∞ ∞ 0 3 ∞ ; ∞ ∞ ∞ 0 ∞ ; 6 11_1 6 10 0 ],
M2 = [ 0 5 12_2 ∞ 5 ; ∞ 0 7 ∞ ∞ ; ∞ ∞ 0 3 ∞ ; ∞ ∞ ∞ 0 ∞ ; 6 11_1 6 10 0 ],
M3 = [ 0 5 12_2 15_3 5 ; ∞ 0 7 10_3 ∞ ; ∞ ∞ 0 3 ∞ ; ∞ ∞ ∞ 0 ∞ ; 6 11_1 6 9_3 0 ].
(A subscript on an entry records the intermediate vertex through which the improved distance was found.)
M4 = M3 because d^+(4) = 0.
M5 = [ 0 5 11_5 14_5 5 ; ∞ 0 7 10_3 ∞ ; ∞ ∞ 0 3 ∞ ; ∞ ∞ ∞ 0 ∞ ; 6 11_1 6 9_3 0 ].
INTER = [ 0 0 5 5 0 ; 0 0 0 3 0 ; 0 0 0 0 0 ; 0 0 0 0 0 ; 0 1 0 3 0 ].
Exercise 14: The following Table A.4 gives us the dfsn of different vertices
of the graph of Figure A.4.
The following Table A.5 gives the LOW function of different vertices.
Exercise 16: The reader is first asked to draw the graph with the directions of the arcs as given in the exercise. As usual, we process the vertices in increasing order, and we suppose the vertices are listed in increasing order in each list L[i] for i = 1, 2, . . . , 10. The following Table A.6 gives the dfsn of the different vertices.
Exercise 18: After assigning the given orientations, we observe that the resulting directed graph is without circuits. Hence a topological sort is possible. As usual, we process the vertices in increasing order, and in each linked list L[i], with i = 1, 2, . . . , 10, the vertices are listed in increasing order. We perform a dfs of the graph, drawing only the tree arcs. We obtain the two arborescences in Figure A.6 (the reader is asked to perform the dfs). Note that the second arborescence consists of only the vertex 4. We now write the vertices of the forest of arborescences in postfix order: 9,10,6,5,8,7,3,2,1,4. We now take the mirror image, that is, write the postfix order obtained from right to left: 4,1,2,3,7,8,5,6,10,9. This is the topological order required. The reader is asked to draw the entire graph by aligning the vertices horizontally and verify that there are no arcs going from right to left.
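The postfix-then-mirror procedure can be sketched in C as follows (our own illustration; adjacency lists as above, vertices 1, . . . , n, graph assumed circuit-free):

#include <stdio.h>

#define MAXN 100

struct node { int v; struct node *succ; };

struct node *L[MAXN + 1];
int visited[MAXN + 1], order[MAXN + 1], top;

/* Postfix dfs: a vertex is recorded only after all its successors. */
void dfs(int u)
{
    visited[u] = 1;
    for (struct node *p = L[u]; p != NULL; p = p->succ)
        if (!visited[p->v])
            dfs(p->v);
    order[top++] = u;
}

/* Printing the postfix order from right to left (its mirror image)
   gives a topological order of the circuit-free directed graph. */
void topological_sort(int n)
{
    top = 0;
    for (int u = 1; u <= n; u++)
        if (!visited[u])
            dfs(u);
    for (int i = top - 1; i >= 0; i--)
        printf("%d ", order[i]);
    printf("\n");
}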
Exercises 2.5
Exercise 2: "Yes." Let the source vertex be 1. Perform a breadth-first search (bfs) from 1. This search partitions the vertex set X = {1, 2, . . . , n} into levels L_0(1) = {1}, L_1(1), L_2(1), . . . , where L_k(1) is the set of vertices at distance k from the source 1.
Exercise 4: Perform a bfs from each vertex i, with 1 ≤ i ≤ n. The largest integer k with L_k(i) ≠ ∅ is the eccentricity of the vertex i. (See the solution of Exercise 2.) The minimum of the eccentricities is the radius of the graph, and the maximum of the eccentricities is the diameter of the graph. Since we perform n bfs, and the cost of a single bfs is O(max(m, n)), the complexity of the algorithm is O(n max(m, n)), where n and m are the numbers of vertices and edges, respectively, of the graph.
Exercise 8: Let us first draw the Petersen graph (see Figure B.1). The
execution of the algorithm is illustrated in Table B.1. We find an augmenting
path visually. The reader is asked to draw the edges of the matching during
the execution by a color, say, red, the other edges black. Hence, an augmenting
path is an elementary path with alternating colors black and red, beginning
with a black edge and ending with another black edge. Note that the origin
and the terminus of such a path are both unsaturated by red edges. Note that
a black edge alone with its end vertices unsaturated is an augmenting path.
The matching obtained at the 4th iteration is a perfect matching.
Exercises 3.3.7
Exercise 2: The proof is by induction on n. The result is true for n = 1. Assume the result for n, so that
M^n = [ cos(nα) sin(nα) ; −sin(nα) cos(nα) ].
Then,
M^{n+1} = M^n · M = [ cos(nα) sin(nα) ; −sin(nα) cos(nα) ] [ cos α sin α ; −sin α cos α ]
= [ cos(nα)cos α − sin(nα)sin α   cos(nα)sin α + sin(nα)cos α ; −sin(nα)cos α − cos(nα)sin α   −sin(nα)sin α + cos(nα)cos α ]
= [ cos(n + 1)α  sin(n + 1)α ; −sin(n + 1)α  cos(n + 1)α ].
Exercise 4: If
A = [ 1 3 ; −2 2 ],  then  A^2 = [ 1 3 ; −2 2 ][ 1 3 ; −2 2 ] = [ −5 9 ; −6 −2 ],
and hence
A^2 − 3A + 8I = [ −5 9 ; −6 −2 ] − 3 [ 1 3 ; −2 2 ] + 8 [ 1 0 ; 0 1 ] = [ 0 0 ; 0 0 ].
Now det(A) = 8 ≠ 0, and hence A^{-1} exists. Multiplying by A^{-1}, we get
A − 3I + 8A^{-1} = 0 ⇒ A^{-1} = −(1/8)(A − 3I) = −(1/8) [ −2 3 ; −2 −1 ] = [ 1/4 −3/8 ; 1/4 1/8 ].
Exercise 6:
(i) Take A = (a_ij), an m × n matrix with rows (a_i1 a_i2 · · · a_in), and B = (b_jk), an n × p matrix with rows (b_j1 b_j2 · · · b_jp), so that the product AB is defined. Now check that (AB)^t = B^t A^t.
Exercise 8:
(i) Recall that if A = (a_rs), then A* = (b_rs), where b_rs = ā_sr. Here, iA = (i a_rs) (recall i = √−1), and therefore (iA)* = (i a_sr)* = (−i ā_sr) = −i(ā_sr) = −iA*.
(ii) H is Hermitian ⇔ H* = H. iH is skew-Hermitian ⇔ (iH)* = −iH ⇔ −iH* = −iH ⇔ H* = H.
Exercise 10: Let A be a complex square matrix. Then, A = ((A + A*)/2) + ((A − A*)/2). Here, (A + A*)/2 is Hermitian since
((A + A*)/2)* = (A* + (A*)*)/2 = (A* + A)/2 = (A + A*)/2.
Further,
((A − A*)/2)* = (A* − (A*)*)/2 = (A* − A)/2 = −((A − A*)/2),
so (A − A*)/2 is skew-Hermitian.
Exercises 3.16
Exercise 2: Closure:
[ a 0 ; b 1 ][ a′ 0 ; b′ 1 ] = [ aa′ 0 ; ba′ + b′ 1 ] ∈ G.
(i) Order is 4.
(ii) If
A = [ 1 1 ; 0 1 ],  then  A^2 = [ 1 2 ; 0 1 ],  A^3 = [ 1 3 ; 0 1 ],
and so on; in general, A^k = [ 1 k ; 0 1 ], and hence for no finite k is A^k = I. Thus, ◦(A) = 0.
(iii) Order = 4
(iv) If A^k = I, then (det A)^k = 1. But this is not the case. Hence ◦(A) = 0.
Exercise 10: Suppose to the contrary that a group G is the union of two
of its proper subgroups, say, G1 and G2 . Then, there exist x ∈ G \G2 (and
hence x ∈ G1 ) and y ∈ G \ G1 (and hence y ∈ G2 ). Then, look at xy. As
G = G1 ∪ G2 , xy ∈ G1 or G2 . If xy = x1 ∈ G1 , then y = x−1 x1 ∈ G1 , a
contradiction. A similar argument applies if xy ∈ G2 .
Exercise 12: The set of all invertible 2 × 2 real matrices or the set of all
diagonal 2 × 2 matrices with real entries and determinant not equal to zero.
Exercise 14:
Exercise 20:
(i)
α^{-1} = [ 1 4 3 2 ; 1 2 3 4 ] = [ 1 2 3 4 ; 1 4 3 2 ].
(ii)
α^{-1}βγ = [ 1 2 3 4 ; 1 4 3 2 ][ 1 2 3 4 ; 2 1 4 3 ][ 1 2 3 4 ; 3 1 2 4 ] = δ,
say; then
δ(1) = α^{-1}βγ(1) = α^{-1}β(3) = α^{-1}(4) = 2, and so on.
(iii)
βγ^{-1} = [ 1 2 3 4 ; 2 1 4 3 ][ 3 1 2 4 ; 1 2 3 4 ] = [ 1 2 3 4 ; 1 4 2 3 ].
Exercise 24: The map φ : (Z, +) → (2Z, +) from the additive group of inte-
gers to the additive group of even integers is a group isomorphism. (Check).
⇒ ab = ba for all a, b ∈ G
⇒ G is an Abelian group.
Exercises 3.19
Exercise 2: Routine verification.
Exercises 4.8
Exercise 2: If f(x) = a_0 + a_1 x + · · · + a_n x^n ∈ R[x], a_n ≠ 0, and g(x) = b_0 + b_1 x + · · · + b_n x^n ∈ R[x], b_n ≠ 0, are polynomials of degree n, then b_n f(x) − a_n g(x) is a polynomial in R[x] of degree < n. Hence the result.
If
e_11 = [ 1 0 0 ; 0 0 0 ; 0 0 0 ], . . . , e_33 = [ 0 0 0 ; 0 0 0 ; 0 0 1 ],
then, if
A = [ a_11 a_12 a_13 ; a_21 a_22 a_23 ; a_31 a_32 a_33 ]
is any real matrix, A = Σ_{i,j=1}^{3} a_ij e_ij. Further, if A = 0, all the 3^2 coefficients a_ij are zero. Hence the 3^2 matrices e_ij are linearly independent and span the space of all 3 × 3 real matrices over R. Hence the dimension of the vector space of all real 3 × 3 matrices over R is 3^2. Now generalize to m × n matrices.
Exercises 4.12
Exercise 2: The given system of equations is equivalent to
[ 4 4 3 −5 ; 1 1 2 −3 ; 2 2 −1 0 ; 1 1 2 −2 ] [ X_1 ; X_2 ; X_3 ; X_4 ] = [ 0 ; 0 ; 0 ; 0 ].
Exercise 4(a): Here
A = [ 2 3 −5 4 ; 3 1 −4 5 ; 7 3 −2 1 ; 4 1 −1 3 ]  and  B = ( −8, −8, 56, 20 )^t.
Eliminating the first column with the multipliers 3/2, 7/2 and 2, we get
A_1 = [ 1 −4 5 ; 3 −2 1 ; 1 −1 3 ] − [ 3/2 ; 7/2 ; 2 ] ( 3 −5 4 )
    = [ 1 −4 5 ; 3 −2 1 ; 1 −1 3 ] − [ 9/2 −15/2 6 ; 21/2 −35/2 14 ; 6 −10 8 ]
    = [ −7/2 7/2 −1 ; −15/2 31/2 −13 ; −5 9 −5 ].
A_2 = [ 31/2 −13 ; 9 −5 ] − [ 15/7 ; 10/7 ] ( 7/2 −1 ) = [ 8 −76/7 ; 4 −25/7 ].
A_3 = −25/7 − (1/2)(−76/7) = 13/7 = (1)(13/7) = L_3 U_3,
so that
A_2 = [ 1 0 ; 1/2 1 ] [ 8 −76/7 ; 0 13/7 ] = L_2 U_2.
Therefore,
A_1 = [ 1 0 0 ; 15/7 1 0 ; 10/7 1/2 1 ] [ −7/2 7/2 −1 ; 0 8 −76/7 ; 0 0 13/7 ] = L_1 U_1.
Consequently,
A = [ 1 0 0 0 ; 3/2 1 0 0 ; 7/2 15/7 1 0 ; 2 10/7 1/2 1 ] [ 2 3 −5 4 ; 0 −7/2 7/2 −1 ; 0 0 8 −76/7 ; 0 0 0 13/7 ] = LU.
We first solve LY = B:
Y_1 = −8,  (3/2)Y_1 + Y_2 = −8,  (7/2)Y_1 + (15/7)Y_2 + Y_3 = 56  and
2Y_1 + (10/7)Y_2 + (1/2)Y_3 + Y_4 = 20
⇒ Y_1 = −8, Y_2 = 4, Y_3 = 528/7, and Y_4 = −52/7.
We now determine X from the equation UX = Y:
UX = Y ⇒ 2X_1 + 3X_2 − 5X_3 + 4X_4 = −8
(−7/2)X_2 + (7/2)X_3 − X_4 = 4
8X_3 − (76/7)X_4 = 528/7
(13/7)X_4 = −52/7.
Solving backward, we get X4 = −4, 8X3 − (76/7)(−4) = (528/7) ⇒ 8X3 =
(528 − 304)/7 = 32 ⇒ X3 = 4. Hence ((−7)/2)X2 + (7/2)X3 − X4 = 4 gives
((−7)/2)X2 + 14 + 4 = 4 ⇒ X2 = 4.
Finally, 2X_1 + 3X_2 − 5X_3 + 4X_4 = −8 gives X_1 = 8. Thus,
X = ( X_1, X_2, X_3, X_4 )^t = ( 8, 4, 4, −4 )^t.
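The two triangular solves generalize directly; a C sketch for this particular 4 × 4 instance (with the L, U and B computed above) is:

#include <stdio.h>

int main(void)
{
    /* The factors of Exercise 4(a). */
    double Lm[4][4] = {{1, 0, 0, 0}, {1.5, 1, 0, 0},
                       {3.5, 15.0/7, 1, 0}, {2, 10.0/7, 0.5, 1}};
    double U[4][4] = {{2, 3, -5, 4}, {0, -3.5, 3.5, -1},
                      {0, 0, 8, -76.0/7}, {0, 0, 0, 13.0/7}};
    double B[4] = {-8, -8, 56, 20}, Y[4], X[4];

    for (int i = 0; i < 4; i++) {        /* forward: solve LY = B  */
        Y[i] = B[i];
        for (int j = 0; j < i; j++)
            Y[i] -= Lm[i][j] * Y[j];
    }
    for (int i = 3; i >= 0; i--) {       /* backward: solve UX = Y */
        X[i] = Y[i];
        for (int j = i + 1; j < 4; j++)
            X[i] -= U[i][j] * X[j];
        X[i] /= U[i][i];
    }
    printf("X = (%g, %g, %g, %g)\n", X[0], X[1], X[2], X[3]); /* 8 4 4 -4 */
    return 0;
}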
4(b): Similar to 4(a). Answer:
( X_1, X_2, X_3 )^t = ( 2, 3, 7 )^t.
4(c): Since there are zeros in the principal diagonal of the matrix A, inter-
change the first two and the last two equations to avoid zeros in the main
diagonal.
Exercises 4.15
Exercise 2: Let F_1 = GF(2^5) and F_2 = GF(2^3). If F_2 were a subfield of F_1, then F_2^*, the multiplicative group of the non-zero elements of F_2, would be a subgroup of F_1^*. But |F_1^*| = 2^5 − 1 = 31 and |F_2^*| = 2^3 − 1 = 7, and 7 ∤ 31.
Exercise 4: X^{3^2} − X = X(X^8 − 1) = X(X^4 − 1)(X^4 + 1) = X(X − 1)(X + 1)(X^2 + 1)(X^4 + 1) = X(X + 2)(X + 1)(X^2 + 1)(X^4 + 1). Note that X^2 + 1 is irreducible over Z_3, since neither 1 nor 2 is a root of X^2 + 1 in Z_3. The factor X^4 + 1, though it too has no roots in Z_3, factors further as (X^2 + X + 2)(X^2 + 2X + 2), and each of these quadratics is irreducible over Z_3.
Appendix E: Answers to
Chapter 5—Introduction to Coding
Theory
Exercises 5.12
Exercise 2(a): Take
G = [ 1 0 1 1 1 ; 1 0 0 1 1 ]  over Z_2.
G is of rank 2, and so H is of rank 5 − 2 = 3, and each row of H is orthogonal to every row of G. So H can be taken either as
H_1 = [ 1 1 0 1 0 ; 1 0 0 0 1 ; 0 1 0 0 0 ]  or  H_2 = [ 1 1 0 0 1 ; 1 0 0 0 1 ; 0 1 0 1 1 ].
Let X = ( 1, 0, 1, 1, 1 )^t. Then, S(x) with respect to H_1 is
H_1 X^T = ( 0, 0, 0 )^t,
while S(x) with respect to H_2 is
H_2 X^T = ( 0, 0, 1 )^t.
Solutions to 2(b), 2(c) and 2(d) are similar to the steps in Example 5.1.
Exercise 4: u has 7 coordinates. s(u, 3) contains all binary vectors of length 7 which are at distance 0, 1, 2 or 3 from u, and hence their number is
1 + C(7, 1) + C(7, 2) + C(7, 3) = 1 + 7 + 21 + 35 = 64.
[Here, 1 arises from the vector at distance 0 from u, C(7, 1) = 7 from the vectors at distance 1 from u, etc.]
Exercise 8: Assume the contrary, that is, that there exists a set of 9 vectors in Z_2^6 (the space of binary vectors of length 6) which are pairwise at distance at least 3. Look at the last two coordinates of these 9 vectors. They must be from {00, 01, 10, 11}. As there are 9 vectors, at least one of these pairs must occur at least 3 times. Consider such a set of 3 vectors which have the same last two coordinates. These coordinates do not contribute to the distance between the corresponding three vectors. Hence, if we drop the last two coordinates from these 3 vectors of Z_2^6, we get 3 binary vectors of length 4 such that the distance between any two of them is at least 3. This is clearly impossible. This contradiction proves the result.
Exercise 12:
C.
G_2 = [ 1 0 1 ; 2 2 1 ].
Exercises 5.15
Exercise 2: C3 is cyclic since the three non-zero codewords are the cyclic
shifts of any one of them. For example, (1, 1, 0) −→ (0, 1, 1) −→ (1, 0, 1).
Since one of them is a linear combination (over Z2 ) of the other 2, as a gen-
erator matrix for C3 , we can take any two of these codewords. Hence
G = [ 1 1 0 ; 0 1 1 ].
Exercises 6.4
Exercise 2: det A = 12 · 17 − 3 · 5 = 204 − 15 = 189 ≡ 15 in Z_29, and hence it is prime to 29. Now 15 × 2 ≡ 30 ≡ 1 (mod 29). Hence (det A)^{-1} = 15^{-1} = 2 in Z_29. Therefore,
A^{-1} = (det A)^{-1}(adj A) = 2 [ 17 −3 ; −5 12 ] = [ 5 23 ; 19 24 ]
in Z_29.
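The computation A^{-1} = (det A)^{-1}(adj A) for a 2 × 2 matrix over Z_m (det A prime to m) is easily mechanized; a brief C sketch (our own illustration, using brute-force inversion of the determinant, which is adequate for small m) is:

#include <stdio.h>

/* Least non-negative residue of x modulo m. */
int mod(int x, int m) { return ((x % m) + m) % m; }

/* Inverse of d modulo m by exhaustive search (fine for small m). */
int inv(int d, int m)
{
    for (int x = 1; x < m; x++)
        if (mod(d * x, m) == 1) return x;
    return -1;                  /* gcd(d, m) > 1: no inverse */
}

int main(void)
{
    int m = 29, a = 12, b = 3, c = 5, d = 17;   /* A = [12 3; 5 17] */
    int di = inv(mod(a * d - b * c, m), m);     /* (det A)^{-1}     */
    printf("A^{-1} = [%d %d; %d %d] (mod %d)\n",
           mod(di * d, m), mod(-di * b, m),
           mod(-di * c, m), mod(di * a, m), m); /* [5 23; 19 24]    */
    return 0;
}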
Now det A = 3, which is prime to 26. Hence 3^{-1} exists in Z_26. In fact, 3^{-1} = 9, as 3 · 9 ≡ 1 (mod 26). Therefore,
A^{-1} = 9 [ −4 1 ; −7 1 ] = [ −36 9 ; −63 9 ] ≡ [ 16 9 ; 15 9 ] (mod 26).
Note that we have to solve these equations in Z29 , and since 29 is a prime,
Z29 is actually a field. We can add and subtract multiples of 29 to make our
calculations become easier but must avoid multiplication by 29.
Now the first set of equations is equivalent to
12a + 7b = 12,
and 16a − 8b = 10,
⇒ 48a + 28b = 48,
48a − 24b = 30.
We now start deciphering pairs of letters in succession from left to right in the ciphertext. The first pair in the ciphertext is "AM," and its corresponding column vector is ( 0 ; 19 ). Hence it represents the plaintext
[ 10 −3 ; 1 16 ] ( 0 ; 19 ) ≡ ( −57 ; −160 )    (as 19 ≡ −10 (mod 29))
≡ ( 1 ; 14 )    (as −57 ≡ 1 and −160 ≡ 14 (mod 29)),
that is, "BO."
3. Aho A. V., Hopcroft J. E., Ullman J. D., The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, MA, 1974; Data Structures and Algorithms, Addison-Wesley, Reading, MA, 1983.
9. Berge C., The Theory of Graphs and its Applications, Wiley, New York, 1958;
Graphs and Hypergraphs, North Holland, 1973.
11. Parthasarathy K. R., Basic Graph Theory, Tata-McGraw Hill Publishing Com-
pany, New Delhi, 1994.
13. Clark A., Elements of Abstract Algebra, Dover Publications Inc., New York,
1984.
14. Fraleigh J. B., A First Course in Abstract Algebra, 7th edition, Addison-Wesley, 2003.
15. Bhattacharyya P. B., Jain S. K., Nagpaul S. R., Basic Abstract Algebra, 2nd
edition, Cambridge University Press, 1997.
16. Lewis D. W., Matrix Theory, World Scientific Publishing Co. and Allied Publishers Ltd. (India), 1991.
18. Singh S., Linear Algebra, Vikas Publishing House, New Delhi, 2000.
19. www.ams.org/mcom/1962-16-079/S0025...1/S0025/S0025-5718-1962-0148256-
1.pdf
20. Hill R., A First Course in Coding Theory, Clarendon Press, Oxford, 2004.
21. Ling S., Coding Theory: A First Course, Cambridge University Press, 2004.
23. van Lint J. H., Introduction to Coding Theory, Springer GTM, 1999.
24. Pless V., Introduction to the Theory of Error Correcting Codes, 3rd edition,
Cambridge University Press, 1998.
26. Agrawal M., Lecture delivered on February 4, 2003 at the Institute of Mathe-
matical Sciences, Chennai on: “PRIMES is in P”. Reproduced in: Mathematics
Newsletter (Published by Ramanujan Mathematical Society), 13, 11–19 (2003).
28. Lidl R., Niederreiter H., Introduction to Finite Fields and Their Applications,
Cambridge University Press, 1986.
30. Koblitz N., A Course in Number Theory and Cryptography, 2nd Edition, GTM,
Springer, 1994.
31. Stinson D. R., Cryptography: Theory and Practice, Chapman & Hall, CRC, 2003.
33. Nair M., On Chebyshev-type inequalities for primes, Amer. Math. Monthly, 89,
120–129 (1982).
34. Rotman J. J., An Introduction to the Theory of Groups, GTM, Springer, 1999.
35. Bose R. C., Parker E. T., Shrikhande S. S., Further results on the construction of
mutually orthogonal latin squares and the falsity of Euler’s conjecture, Canadian
Journal of Mathematics, 12, 189–203 (1960).
36. Diffie W., Hellman M. E., New directions in cryptography, IEEE Transactions on Information Theory, IT-22(6), 644–654 (1976).
37. Cormen T. H., Leiserson C. E., Rivest R. L., Stein C., Introduction to
Algorithms, 3rd edition, MIT Press, 2009.