Finite Fields and Coding Theory
Postgraduate Course, 2016-2017
Mustansiriyah University / College of Science / Dept. of Math.
Dr. Emad Bakr Al-Zagana
Chapter 1
Introduction to Error-Correcting Codes
Motivation This theory shows how to solve a practical problem using the well-established
mathematical tools of Linear Algebra and Finite Fields.
Difference from Cryptography Coding Theory and Cryptography are two important
parts of the modern theory of Information Science.
Cryptography, which is some 2000 years old, is the mathematical theory of sending
secret messages.
Coding Theory, which only dates from 1948, is the mathematical theory of sending
messages that arrive with the same content as when they were sent.
Example 1.1. To send just the two messages YES and NO, the following encoding suffices:
YES = 1, NO = 0.
If there is an error, say 1 is sent and 0 arrives, this will go undetected. So, add some
redundancy:
YES = 11, NO = 00.
Now, if 11 is sent and 01 arrives, then an error has been detected, but not corrected, since
the original messages 11 and 00 are equally plausible.
So, add further redundancy:
YES = 111, NO = 000.
Now, if 010 arrives, and it is supposed that there was at most one error, we know that 000
was sent: the original message was NO.
Note that the information is still in the first symbol; the other two are purely for
error-correction!
The philosophy Error-correcting codes are used to correct errors when messages are
transmitted through a noisy communication channel.
The channel may be a telephone line, a high frequency radio link or a satellite com-
munication link.
The noise may be human error, lightning, equipment faults, etc.
The object of a code is to encode the data by adding a certain amount of redundancy so
that the original message can be recovered if not too many errors occur in the transmission.
message source → encoder → channel → decoder → user,

where the encoder turns the message into a codeword, noise in the channel turns the codeword into the received vector, and the decoder turns the received vector into the decoded message.
In Example 1.1, with message NO: the encoder produces the codeword 000; noise in the channel changes it to the received vector 010; the decoder corrects 010 to 000 and delivers NO to the user.
Definition 1.2. A binary code is a set of sequences of 0’s and 1’s; each sequence is a
codeword.
In Example 1.1, the code is {000, 111}. This is a binary repetition code of length 3.
Definition 1.3. (1) A q-ary code C is a set of sequences where each symbol is from a set
Fq = {λ1 , . . . , λq }. Usually, Fq is a finite field.
Example 1.4. The set of all words in the English language is a code over the 26-letter
alphabet {A, B, . . . , Z}. The codewords are not all the same length.
Example 1.6. The set of all 11-digit telephone numbers in the UK is a 10-ary code of
length 11. It is not designed for error correction, although the area codes are significant. However,
it would be possible to allow a single misdial to be corrected.
Example 1.7. On a map laid out as a grid, HQ and JB have identical maps. For JB to return
to HQ, a message is transmitted in terms of the instructions N, E, W, S. The message is
E N N N N W W W W W N N N N E.
[Figure: a grid map with HQ at the top right and JB at the bottom left, showing the route between them.]
Recall that the Hamming distance d(x, y) between two words x, y of the same length is the number of places in which they differ.
Theorem 1.10. The Hamming distance is a metric.
Assume a symmetric channel: (1) each transmitted symbol has the same probability t < 1/2 of being received in error; (2) if a symbol is wrongly received, then each of the q − 1 possible wrong symbols is equally likely.
Example 1.13. In a binary code of length n, suppose the codeword x is transmitted. The probability that a given word y with d(x, y) = i is received is t^i (1 − t)^{n−i}.
Since this is greatest for i = 0 (as t < 1/2), nearest neighbour decoding is also maximum likelihood
decoding.
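For concreteness, nearest neighbour decoding can be sketched in a few lines of Python (a minimal illustration; the function names are ours, not from the notes):

```python
# A minimal sketch of nearest neighbour decoding.
def hamming_distance(x, y):
    """Number of positions in which the words x and y differ."""
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y))

def nearest_neighbour_decode(received, code):
    """Return a codeword at minimum Hamming distance from the received word."""
    return min(code, key=lambda c: hamming_distance(received, c))

# Example 1.1: the binary repetition code of length 3.
print(nearest_neighbour_decode("010", ["000", "111"]))   # "000", i.e. NO
```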
Example 1.14. C = {000, 111}, the binary repetition code of length 3.
Suppose 111 is transmitted. Then the received words decoded as 111 are those at distance at most 1 from it, namely 111, 011, 101, 110. So
P(decoding as 111) = (1 − t)^3 + 3t(1 − t)^2.
Suppose that t = 0.1, that is, one symbol in 10 is wrong. Then
P(decoding correctly) = (0.9)^3 + 3(0.1)(0.9)^2 = 0.729 + 0.243 = 0.972,
so P(incorrect decoding) = 0.028.
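A quick numerical check of this, assuming the symmetric channel with t = 0.1:

```python
# Check of Example 1.14: probability of decoding correctly for the length-3
# repetition code, with each symbol flipped independently with probability t.
t = 0.1
p_correct = (1 - t)**3 + 3 * t * (1 - t)**2   # 0 or 1 errors are corrected
print(p_correct, 1 - p_correct)               # approximately 0.972 and 0.028
```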
It will be shown, for linear codes, that P(incorrect decoding), that is, the word error
probability, is independent of the codeword sent.
Definition 1.15. A code is e-error-correcting if it can correct up to e errors.
Definition 1.16. A q-ary (n, M, d) code or (n, M, d)q code is a code C of length n, cardinality
M = |C| and minimum distance d over the alphabet Fq .
In Example 1.7, the message is a word of length 15 over the 4-letter alphabet {N, E, W, S}.
Definition 1.17. For x0 ∈ (Fq)^n and r ∈ Z, r ≥ 0, the ball of centre x0 and radius r is
B(x0, r) = {x ∈ (Fq)^n : d(x, x0) ≤ r}.
Theorem. Let C be a code with minimum distance d(C).
(i) If d(C) ≥ s + 1, then C can detect s errors.
(ii) If d(C) ≥ 2e + 1, then C can correct e errors.
Proof (i) Let d(C) = s + 1. If x ∈ C is sent and at least one but at most s mistakes occur in transmission, then the
received vector lies at distance between 1 and s from x, so it cannot be a codeword. So the mistakes are detected.
(ii) [Figure: two disjoint balls of radius e about codewords x and x′; the received vector y lies in the ball about x.]
Let d(C) = 2e + 1.
If x ∈ C is sent and y received with at most e errors, then d(x, y) ≤ e. If x′ ∈ C with
x′ ≠ x, then d(x, x′) ≥ 2e + 1.
Suppose that d(x′, y) ≤ e; then d(x, x′) ≤ d(x, y) + d(x′, y) ≤ 2e, a contradiction. Hence d(x′, y) ≥ e + 1.
So y is decoded as x. Hence C can correct e errors.
Corollary 1.20. If C has minimum distance d, then it can detect d − 1 errors and correct
e = ⌊(d − 1)/2⌋ errors, where ⌊m⌋ denotes the integer part of m:
d 1 2 3 4 5 6 7 8
e 0 0 1 1 2 2 3 3
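The table is reproduced by the formula e = ⌊(d − 1)/2⌋:

```python
# Reproduce the d -> e table of Corollary 1.20.
for d in range(1, 9):
    print(d, (d - 1) // 2)
```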
Definition 1.21. The q-ary repetition code of length n on Fq = {λ1 , ..., λq } is
{λ1 . . . λ1 , λ2 . . . λ2 , . . . , λq . . . λq }.
Example 1.22. (1) To send back photographs from the 1972 Mariner to Mars, a binary
(32, 64, 16) code was used. Here, 32 = 6 + 26, with 6 information symbols and 26
redundancy symbols. So each part of each photograph was coded in one of 2^6 = 64
shades of grey; 7 errors in each part could be corrected.
(2) For the 1979 Voyager spaceship to Jupiter, a binary (24, 4096, 8) code was used. This
time, 24 = 12 + 12, with 12 information symbols and 12 redundancy symbols. So each
part of each photograph was coded in one of 2^12 = 4096 shades to send back
colour photographs, with 3 errors able to be corrected.
The Morse code, with · = 0 and − = 1, is a binary code with codewords of varying length:

A 01     J 0111   S 000    1 01111
B 1000   K 101    T 1      2 00111
C 1010   L 0100   U 001    3 00011
D 100    M 11     V 0001   4 00001
E 0      N 10     W 011    5 00000
F 0010   O 111    X 1001   6 10000
G 110    P 0110   Y 1011   7 11000
H 0000   Q 1101   Z 1100   8 11100
I 00     R 010             9 11110
                           0 11111
Chapter 5
Linear Codes
A word x ∈ V(n, q) is written
x = (x1, x2, . . . , xn) = x1 x2 · · · xn.
A linear code C of length n and dimension k over Fq, that is, a k-dimensional subspace of V(n, q), is an [n, k]-code or [n, k]q-code; or, if d(C) = d, it is an
[n, k, d]-code or [n, k, d]q-code.
Note 5.2. A q-ary [n, k, d]-code is a q-ary (n, q^k, d)-code.
Definition 5.3. The weight w(x) of x in V(n, q) is the number of non-zero coordinates of x; that is, w(x) = d(x, 0).
For a linear code C, the minimum distance d(C) is the minimum weight of a non-zero codeword: for y, z ∈ C, d(y, z) = w(y − z),
since y − z ∈ C.
Example 5.6. The perfect (7, 16, 3)-code.
This is a binary [7, 4, 3]-code
C = {u, z, l1 , . . . , l7 , m1 , . . . , m7 }
Definition 5.7. A generator matrix G of an [n, k]-code C is a k × n matrix whose rows form
a basis for C.
Similarly,
C2 = {000, 011, 101, 110}
is a binary [3, 2]-code with generator matrix
G = [ 0 1 1 ]
    [ 1 0 1 ],
and
C3 = {00000, 01101, 10110, 11011}
is a binary [5, 2]-code with generator matrix
G = [ 0 1 1 0 1 ]
    [ 1 1 0 1 1 ].
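Since a linear code is the row space of its generator matrix, its codewords can be enumerated as all linear combinations of the rows. A minimal Python sketch over F2 (helper name ours), recovering C3 from G:

```python
from itertools import product

# List the codewords of a binary linear code as all F2-linear
# combinations of the rows of a generator matrix G.
def codewords(G):
    k, n = len(G), len(G[0])
    return sorted({tuple(sum(c * row[j] for c, row in zip(coeffs, G)) % 2
                         for j in range(n))
                   for coeffs in product([0, 1], repeat=k)})

G3 = [(0, 1, 1, 0, 1), (1, 1, 0, 1, 1)]
for w in codewords(G3):
    print(w)   # the four codewords of C3: 00000, 01101, 10110, 11011
```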
Definition 5.10. Two linear codes C and C′ in V(n, q) are equivalent if C′ can be obtained
from C by a sequence of the following operations:
(A) a permutation of the n positions, applied to every codeword;
(B) multiplication of the symbols in fixed positions by non-zero scalars:
x1 x2 x3 · · · xn−1 xn −→ λ1 x1 λ2 x2 · · · λn xn, with each λi ≠ 0.
The point about (A) and (B) is that they preserve the distance between any two codewords, hence
the minimum distance of the code, as well as the dimension.
Theorem 5.11. If f : C → C′ is a transformation obtained by using (A) and (B), with
f(C) = C′, then d(f(x), f(y)) = d(x, y) for all x, y ∈ C; in particular, d(C′) = d(C).
Recall the row operations (R1), (R2), (R3). Now, what column operations do (A) and
(B) give? Let (C1), (C2), (C3) be the corresponding column operations. Then
(A) → (C2): ci ↔ cj;
(B) → (C1): ci → λci, λ ≠ 0.
Theorem 5.12. Two k × n matrices G, G′ generate equivalent linear [n, k]-codes over Fq if
G′ can be obtained from G by a sequence of operations (R1), (R2), (R3), (C1), (C2).
Proof The (Ri) change the basis of a code; the (Cj) change G to G′ for an equivalent code.
Theorem 5.14. Let G be a generator matrix of an [n, k]-code. Then, by the elementary
operations, G can be transformed to standard form,
[Ik A],

where A is a k × (n − k) matrix.
Proof By row or column operations obtain a non-zero pivot g11 . Then use row operations
to obtain gi1 = 0, i > 1.
     [ 1 ∗ · · · ∗ ]
     [ 0           ]
G′ = [ .     H     ]
     [ .           ]
     [ 0           ]
Use row or column operations on G′ to obtain h11 ≠ 0. Continue. Then use row operations
to get Ik, unless column operations are required.
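The proof of Theorem 5.14 is effectively an algorithm. Here is a Python sketch of it over F2, assuming rank G = k; the column permutation is returned as well, since column swaps replace the code by an equivalent one:

```python
def standard_form(G):
    """Reduce a binary generator matrix to [I_k | A] by row operations,
    swapping columns only when forced.  Assumes rank G = k."""
    G = [row[:] for row in G]
    k, n = len(G), len(G[0])
    cols = list(range(n))            # records the column permutation used
    for i in range(k):
        pivot = next((r for r in range(i, k) if G[r][i] == 1), None)
        if pivot is None:            # column i has no pivot: swap in a later column
            j = next(j for j in range(i + 1, n)
                     if any(G[r][j] == 1 for r in range(i, k)))
            for row in G:
                row[i], row[j] = row[j], row[i]
            cols[i], cols[j] = cols[j], cols[i]
            pivot = next(r for r in range(i, k) if G[r][i] == 1)
        G[i], G[pivot] = G[pivot], G[i]
        for r in range(k):
            if r != i and G[r][i] == 1:
                G[r] = [(a + b) % 2 for a, b in zip(G[r], G[i])]
    return G, cols

G = [[1, 1, 1, 0, 1, 1], [0, 1, 0, 0, 1, 1],
     [1, 0, 1, 1, 0, 1], [0, 1, 1, 1, 0, 1]]
for row in standard_form(G)[0]:
    print(row)   # the standard form [I_4 | A] computed in Example (ii) below
```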
(ii) C is a binary [6, 4]-code with generator matrix

    [ 1 1 1 0 1 1 ]      [ 1 1 1 0 1 1 ]      [ 1 1 1 0 1 1 ]
G = [ 0 1 0 0 1 1 ]  →   [ 0 1 0 0 1 1 ]  →   [ 0 1 0 0 1 1 ]
    [ 1 0 1 1 0 1 ]      [ 0 1 0 1 1 0 ]      [ 0 0 0 1 0 1 ]
    [ 0 1 1 1 0 1 ]      [ 0 1 1 1 0 1 ]      [ 0 0 1 1 1 0 ]

      [ 1 1 1 0 1 1 ]      [ 1 0 0 0 1 1 ]
  →   [ 0 1 0 0 1 1 ]  →   [ 0 1 0 0 1 1 ]
      [ 0 0 1 1 1 0 ]      [ 0 0 1 0 1 1 ]
      [ 0 0 0 1 0 1 ]      [ 0 0 0 1 0 1 ]
Corollary 5.16. If G1 = [Ik A1 ] and G2 = [Ik A2 ] are generator matrices of the same code
C, then A1 = A2 .
Proof The first row of G2 is a codeword of C and hence a linear combination of the rows of G1; since its first k entries are 1, 0, . . . , 0, it must be
the first row of G1. Similarly for the other rows of G2.
Chapter 7
The Dual Code and Syndrome Decoding
For x = x1 · · · xn and y = y1 · · · yn in V(n, q), the inner product is
x · y = x1 y1 + x2 y2 + · · · + xn yn.
It satisfies
(i) (x + y) · z = x · z + y · z;
(ii) (λx) · y = λ(x · y);
(iii) x · y = y · x.
The dual code of a linear code C in V(n, q) is
C⊥ = {y ∈ V(n, q) : x · y = 0 for all x ∈ C}.
(ii)
C = {0000, 1000, 0100, 1100},
C ⊥ = {0000, 0010, 0001, 0011}.
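A brute-force check of (ii) in Python (variable names ours):

```python
from itertools import product

# Dual of a binary code of length 4: all words orthogonal to every codeword.
C = {(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (1, 1, 0, 0)}
dual = [y for y in product([0, 1], repeat=4)
        if all(sum(a * b for a, b in zip(x, y)) % 2 == 0 for x in C)]
print(dual)   # the four words 0000, 0001, 0010, 0011, i.e. C-perp
```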
Lemma 7.5. Let C be a linear [n, k]-code over Fq with generator matrix G, whose rows are r1, . . . , rk. Then (i) C⊥ is a linear code of length n over Fq; (ii) x ∈ C⊥ ⇐⇒ xG^T = 0.
Proof (i) If y, y′ ∈ C⊥, then
x · y = x · y ′ = 0 for all x ∈ C
⇒ x · (y + y ′) = 0 for all x ∈ C,
and x · (λy) = 0 for all x ∈ C; so y + y′ ∈ C⊥ and λy ∈ C⊥, whence C⊥ is a subspace.
(ii)
xG^T = 0 ⇐⇒ x[r1^T, . . . , rk^T] = 0 ⇐⇒ x ri^T = 0 for all i ⇐⇒ x · ri = 0 for all i ⇐⇒ x ∈ C⊥,
since the rows r1, . . . , rk span C.
Theorem. Let C be an [n, k]-code over Fq. Then:
(i) C⊥ is an [n, n − k]-code;
(ii) if G = [Ik A] is a generator matrix for C, then H = [−A^T In−k] is a generator matrix for C⊥;
(iii) (C⊥)⊥ = C.
Proof (i) By Lemma 7.5, C⊥ is a linear code of length n over Fq. If G is a generator matrix
for C, with rows r1, . . . , rk and columns c1, . . . , cn, then

                       [ r1 ]
G = [c1, . . . , cn] = [ .. ]
                       [ rk ]
Define the linear map ϕ : V(n, q) → V(k, q) by ϕ(x) = xG^T; by Lemma 7.5(ii), ker ϕ = C⊥. Then, by the rank-nullity theorem,
n = dim(ker ϕ) + dim(im ϕ). (7.1)
As rank G = k and im ϕ is spanned by the columns of G, dim(im ϕ) = k. Hence,
from (7.1), dim(ker ϕ) = n − k; that is, dim C⊥ = n − k.
Alternatively, let G = [Ik A] be a generator matrix for C; then x ∈ C⊥ ⇔ Gx^T = 0, that is,

[ 1 0 · · · 0 a11 · · · a1,n−k ] [ x1 ]
[ 0 1 · · · 0 a21 · · · a2,n−k ] [ x2 ]
[ . .       .  .          .    ] [ .. ] = 0.
[ 0 0 · · · 1 ak1 · · · ak,n−k ] [ xn ]
So any choice can be made for xk+1, . . . , xn; then x1, . . . , xk are determined. Hence |C⊥| = q^{n−k}.
Hence dim C ⊥ = n − k.
(ii) Let G = [Ik A] and H = [−A^T In−k]; then rank H = n − k. Also

GH^T = [Ik A] [ −A   ] = Ik(−A) + A In−k = −A + A = 0.
              [ In−k ]

Hence each row of H lies in C⊥; since rank H = n − k = dim C⊥, the rows of H form a basis of C⊥.
(iii) dim (C⊥)⊥ = n − (n − k) = k = dim C; and C ⊆ (C⊥)⊥, since every x ∈ C satisfies x · y = 0 for all y ∈ C⊥. Hence (C⊥)⊥ = C.
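Over F2 the sign disappears, so H = [A^T In−k]. A short Python check (helper name ours) that builds H from G = [Ik A] for the code C3 used below and verifies GH^T = 0:

```python
# Over F2 we have -A^T = A^T, so H = [A^T | I_{n-k}] for G = [I_k | A].
def parity_check(G):
    k, n = len(G), len(G[0])
    A = [row[k:] for row in G]                   # the k x (n-k) block A
    return [[A[r][i] for r in range(k)]          # row i of A^T ...
            + [1 if j == i else 0 for j in range(n - k)]   # ... then row i of I
            for i in range(n - k)]

G = [[1, 0, 1, 1, 0], [0, 1, 1, 0, 1]]           # generator matrix of C3
H = parity_check(G)
print(H)   # [[1, 1, 1, 0, 0], [1, 0, 0, 1, 0], [0, 1, 0, 0, 1]]
print(all(sum(g[j] * h[j] for j in range(len(g))) % 2 == 0
          for g in G for h in H))                # True: G H^T = 0
```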
If x = (x1, x2, x1 + x2, x1, x2) ∈ C3, the parity-check equations are
x1 + x2 + x3 = 0,
x1 + x4 = 0,
x2 + x5 = 0;
these determine x3, x4, x5 from x1, x2, giving x = (x1, x2, x1 + x2, x1, x2).
Explanation for the term parity-check matrix If u = u1 · · · uk v1 · · · vn−k is a codeword, where the
message symbols are u1 · · · uk and the check symbols are v1 · · · vn−k, then
Hu^T = 0.
Writing H in terms of its columns as H = [c1 c2 · · · ck e1 e2 · · · en−k] = [B | In−k], this becomes

[ b11     · · ·  b1k      1 0 · · · 0 ] [ u1   ]
[ b21     · · ·  b2k      0 1 · · · 0 ] [ ..   ]
[ ..              ..        ..        ] [ uk   ]  =  0,
[ bn−k,1  · · ·  bn−k,k   0 0 · · · 1 ] [ v1   ]
                                        [ ..   ]
                                        [ vn−k ]

that is,

bi1 u1 + · · · + bik uk + vi = 0, for i = 1, . . . , n − k.

So each check symbol vi is determined by the message symbols: every codeword must pass these n − k parity checks.
Syndrome Decoding
Definition 7.14. Let H be a parity-check matrix for the [n, k]-code C. Then for any y ∈
V (n, q),
sH(y) = yH^T = (Hy^T)^T
is the syndrome of y, a vector of length n − k.
Lemma 7.15. (i) yH^T = 0 ⇐⇒ y ∈ C;
(ii) x + C = y + C ⇐⇒ x and y have the same syndrome;
(iii) There exists a one-to-one correspondence between cosets and syndromes.
33
Algorithm 7.16. I. Set up the one-to-one correspondence between coset leaders and syndromes.
II. If y is a received vector, calculate the syndrome s = yH^T.
III. Find the coset leader e with syndrome s.
IV. Correct y to y − e.
Now much less needs to be stored, namely just coset leaders and syndromes.
Example 7.17. C3 = {00000, 10110, 01101, 11011}, a single-error-correcting [5, 2]-code.
G = [ 1 0 1 1 0 ]      H = [ 1 1 1 0 0 ]
    [ 0 1 1 0 1 ]          [ 1 0 0 1 0 ]
                           [ 0 1 0 0 1 ]
coset leader 00000 10000 01000 00100 00010 00001 11000 10001
syndrome 000 110 101 100 010 001 011 111
If the received message appears in the last two cosets we need to ask for retransmission,
since the weight of the coset leader is 2.
(i) y = 11110, yH^T = 101, e = 01000,
x = y − e = y + e = 10110.
x = y + e = 01101.
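Algorithm 7.16, run on Example 7.17, can be sketched in Python (tuples over F2; the dictionary is the table of step I):

```python
# Sketch of syndrome decoding (Algorithm 7.16) for C3 of Example 7.17, over F2.
H = [[1, 1, 1, 0, 0], [1, 0, 0, 1, 0], [0, 1, 0, 0, 1]]

def syndrome(y):
    """s(y) = y H^T, a word of length n - k = 3."""
    return tuple(sum(a * b for a, b in zip(y, row)) % 2 for row in H)

# Step I: coset leaders keyed by their syndromes (the table above).
leaders = {syndrome(e): e for e in
           [(0, 0, 0, 0, 0), (1, 0, 0, 0, 0), (0, 1, 0, 0, 0), (0, 0, 1, 0, 0),
            (0, 0, 0, 1, 0), (0, 0, 0, 0, 1), (1, 1, 0, 0, 0), (1, 0, 0, 0, 1)]}

def decode(y):
    e = leaders[syndrome(y)]                          # steps II and III
    return tuple((a + b) % 2 for a, b in zip(y, e))   # step IV: y - e = y + e over F2

print(decode((1, 1, 1, 1, 0)))   # (1, 0, 1, 1, 0), agreeing with (i) above
```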
Theorem 7.18. Let C be a linear code with parity-check matrix H. Then C has minimum distance d
if and only if some d columns of H are linearly dependent but every d − 1 columns are linearly
independent.
Proof Let H have columns c1, . . . , cn. A word x = x1 · · · xn lies in C if and only if Hx^T = 0, that is,
x1 c1 + · · · + xn cn = 0.
Now, x has weight d − 1 ⇐⇒ there exist j1, . . . , jd−1 such that xj1, . . . , xjd−1 ≠ 0 and all other
xj = 0 ⇐⇒ xj1 cj1 + · · · + xjd−1 cjd−1 = 0. Hence there exists no word of weight d − 1 if and
only if every d − 1 columns are linearly independent.
Similarly, x is a word of weight d if and only if there exist i1, . . . , id such that
xi1, . . . , xid ≠ 0 and all other xi = 0; this occurs if and only if xi1 ci1 + · · · + xid cid = 0. Hence
there exists a word of weight d if and only if some d columns are linearly dependent.
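Theorem 7.18 can be verified mechanically for small codes. A Python sketch for the parity-check matrix of C3, finding the smallest number of linearly dependent columns over F2:

```python
from itertools import combinations, product

H = [[1, 1, 1, 0, 0], [1, 0, 0, 1, 0], [0, 1, 0, 0, 1]]
cols = list(zip(*H))   # columns of H as tuples

def dependent(subset):
    """Over F2, columns are dependent iff some non-zero 0/1 combination sums to 0."""
    return any(all(sum(c * col[i] for c, col in zip(coeffs, subset)) % 2 == 0
                   for i in range(len(H)))
               for coeffs in product([0, 1], repeat=len(subset)) if any(coeffs))

d = next(r for r in range(1, len(cols) + 1)
         if any(dependent(s) for s in combinations(cols, r)))
print(d)   # 3: some 3 columns are dependent, every 2 are independent, so d(C3) = 3
```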
Corollary 7.19. (Singleton bound) For an [n, k, d]-code,
d ≤ n − k + 1.
Proof By Theorem 7.18, every d − 1 columns of a parity-check matrix H are linearly independent; since rank H = n − k, this forces d − 1 ≤ n − k.
Definition 7.21. An [n, k, d]-code over Fq with d = n−k +1 is maximum distance separable,
abbreviated MDS.