Cyclic Error-Correcting Codes: Daniel F. Russell December 14, 2001 Dr. Edward Green, Advisor
Information is passed every day in our society. It is essential that interference in communication hinder the receipt of this information as little as possible. Error-correcting codes provide us with this ability: they allow us to receive a piece of information, identify any errors, locate them, and correct them. Cyclic codes are an especially useful kind of error-correcting code, and BCH codes and QR codes are especially useful kinds of cyclic codes. Error-correcting code theory has also been used in areas outside of information communication, which makes it an important subject to study.

We will restrict our study to linear codes, that is, codes C in which any linear combination of codewords a and b in C is again a codeword in C. Consider the vector space V of all n-tuples of 0s and 1s, with addition component-wise mod 2. An [n,k] code is the set of all linear combinations of k linearly independent vectors in V, so an [n,k] binary code has $2^k$ codewords. This code has dimension k and can be given by a basis. The generator matrix has the basis vectors of the code as its rows. For example, consider the [7,4] binary code $C_1$ whose generator matrix is

\[ G_1 = \begin{pmatrix} 1&0&0&0&1&1&0 \\ 0&1&0&0&0&1&0 \\ 0&0&1&0&1&1&0 \\ 0&0&0&1&0&0&1 \end{pmatrix} \]
All codewords of $C_1$ are linear combinations of these 4 vectors, so $C_1$ has $2^4 = 16$ codewords. An information set of $C_1$ is any set of k = 4 linearly independent columns of $G_1$. Say that the information set is the first 4 columns (so the 5th, 6th and 7th columns are redundancy positions).
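To make the count concrete, the following Python sketch enumerates $C_1$ from a generator matrix. The matrix `G1` is an assumption: it is the systematic form $[I_4 \mid P]$ implied by the parity check equations and row weights quoted for $C_1$ in this paper, and the helper name `codewords` is hypothetical.

```python
from itertools import product

# Assumed generator matrix in systematic form [I4 | P], reconstructed from the
# parity check equations a5 = a1 + a3, a6 = a1 + a2 + a3, a7 = a4.
G1 = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 0, 1, 0),
    (0, 0, 1, 0, 1, 1, 0),
    (0, 0, 0, 1, 0, 0, 1),
]

def codewords(G):
    """Return the set of all GF(2) linear combinations of the rows of G."""
    n = len(G[0])
    words = set()
    for coeffs in product((0, 1), repeat=len(G)):
        word = tuple(sum(c * row[j] for c, row in zip(coeffs, G)) % 2
                     for j in range(n))
        words.add(word)
    return words

C1 = codewords(G1)
# Smallest number of nonzero digits among the nonzero codewords.
min_weight = min(sum(w) for w in C1 if any(w))
print(len(C1), min_weight)
```

Under this assumed matrix the script reports 16 codewords and a smallest nonzero weight of 2.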
Another way to represent a code is by its parity check matrix. The rows of a parity check matrix are made up of the equations for the redundancy positions (the parity check equations). In our example, we can compute the redundancy positions by the equations

\[ a_5 = a_1 + a_3, \qquad a_6 = a_1 + a_2 + a_3, \qquad a_7 = a_4. \]

These equations were found by examining the generator matrix, but they hold for every codeword in $C_1$. The parity check matrix for $C_1$ is then:

\[ H_1 = \begin{pmatrix} 1&0&1&0&1&0&0 \\ 1&1&1&0&0&1&0 \\ 0&0&0&1&0&0&1 \end{pmatrix} \]
This is because a vector $(a_1, a_2, a_3, a_4, a_5, a_6, a_7)$ is orthogonal to the first row of $H_1$ if $a_1 + a_3 + a_5 = 0$, which is the same as $a_5 = a_1 + a_3$ (in binary), our first parity check equation. The other two rows of $H_1$ correspond to the other two parity check equations in the same way. So $C_1$ is the set of all 7-tuples that are orthogonal to each row of $H_1$.

The weight of a codeword is the number of its nonzero digits. The weight of the first row of $G_1$ is 3. The minimum weight of a code, d, is the weight of the nonzero codeword of smallest weight. It can be found by examining the linear combinations of the rows of the generator matrix. The row of $G_1$ of smallest weight has weight 2. Any sum of two or more distinct rows has weight at least 2, since each row has exactly one 1 in the first 4 digits, all in different positions (a row added to itself gives the zero vector). So the minimum weight of $C_1$ is 2. A code is often expressed as [n,k,d], where d is the minimum weight, so $C_1$ is a [7,4,2] code.

The distance between two vectors u and v, denoted d(u,v), is the number of positions in which they differ. The distance between the first 2 rows of $G_1$ is 3. Distance is important in decoding. If a vector v is received, one good strategy for decoding it is to find the codeword u in the code such that d(u,v) is minimized. This is called maximum likelihood decoding. We call $u - v$ the error vector. If d is the minimum weight of a code C, then C can correct $t = \lfloor (d-1)/2 \rfloor$ or fewer errors. So $C_1$ can correct $\lfloor (2-1)/2 \rfloor = 0$ errors, which makes $C_1$ an ineffectual code. The Hamming [7,4] code, though, has a generator matrix
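One special kind of code is a cyclic code. Cyclic codes correspond to ideals in certain commutative rings. A commutative ring R with unit is much like a field, but a nonzero element need not have an inverse. An ideal I in a commutative ring R is a set of elements in R such that (1) if a is in I, then ab is in I for all b in R, and (2) if a and b are in I, then $a + b$ and $a - b$ are in I. Let $R = F[x]/(x^n - 1)$, where F is a field. A cyclic code C of length n corresponds to an ideal in R. An ideal I in R is called a principal ideal if every element of I is a multiple of a fixed polynomial g(x). If C corresponds to the ideal $\langle g(x) \rangle$ in R, where g(x) is the monic polynomial of least degree in the ideal (the generator polynomial), then g(x) divides $x^n - 1$. If $g(x) = g_0 + g_1 x + g_2 x^2 + \dots + g_{n-k} x^{n-k}$, then a generator matrix of C is:

\[ G = \begin{pmatrix} g_0 & g_1 & \cdots & g_{n-k} & 0 & 0 & \cdots & 0 \\ 0 & g_0 & g_1 & \cdots & g_{n-k} & 0 & \cdots & 0 \\ \vdots & & \ddots & & & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & g_0 & g_1 & \cdots & g_{n-k} \end{pmatrix} \]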
Note that G has rank k. As an example, we construct the generator matrix of a binary cyclic code of length n = 7. First we factor $x^7 - 1 = (x + 1)(1 + x + x^3)(1 + x^2 + x^3)$ over GF(2). Choose the generator polynomial g(x) to be $1 + x^2 + x^3$. So $C_3 = \langle g(x) \rangle$ is 4-dimensional and has the generator matrix:

\[ G_3 = \begin{pmatrix} 1&0&1&1&0&0&0 \\ 0&1&0&1&1&0&0 \\ 0&0&1&0&1&1&0 \\ 0&0&0&1&0&1&1 \end{pmatrix} \]
To find the parity check matrix, we find h(x) such that $x^n - 1 = g(x)h(x)$; h(x) is called the check polynomial. If $h(x) = h_0 + h_1 x + h_2 x^2 + \dots + h_k x^k$, then the parity check matrix H of C is:

\[ H = \begin{pmatrix} h_k & h_{k-1} & \cdots & h_0 & 0 & 0 & \cdots & 0 \\ 0 & h_k & h_{k-1} & \cdots & h_0 & 0 & \cdots & 0 \\ \vdots & & \ddots & & & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & h_k & h_{k-1} & \cdots & h_0 \end{pmatrix} \]
The reciprocal polynomial of h(x) can be used to construct the parity check matrix in the same way g(x) was used for the generator matrix. For our n = 7 example, $h(x) = (1 + x)(1 + x + x^3) = 1 + x^2 + x^3 + x^4$. The reciprocal polynomial is then $x^4 h(x^{-1}) = x^4(1 + x^{-2} + x^{-3} + x^{-4}) = 1 + x + x^2 + x^4$. So the parity check matrix is

\[ H_3 = \begin{pmatrix} 1&1&1&0&1&0&0 \\ 0&1&1&1&0&1&0 \\ 0&0&1&1&1&0&1 \end{pmatrix} \]
The generator polynomial gives us much information about a code. Unfortunately, it is not always easy to find, as $x^n - 1$ is often quite difficult to factor.
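The construction of a generator matrix from g(x) and a parity check matrix from the reciprocal of h(x) can be sketched in Python for the n = 7 example. This is an illustrative sketch, not part of the original paper: polynomials over GF(2) are stored as coefficient lists with the lowest degree first, and the helper names `poly_divmod` and `shift_rows` are hypothetical.

```python
def poly_divmod(num, den):
    """Divide GF(2) polynomials (coefficient lists, lowest degree first).

    The last entry of `den` is assumed to be 1.
    """
    num = num[:]
    quot = [0] * max(len(num) - len(den) + 1, 1)
    for shift in range(len(num) - len(den), -1, -1):
        if num[shift + len(den) - 1]:
            quot[shift] = 1
            for i, c in enumerate(den):
                num[shift + i] ^= c
    return quot, num[:len(den) - 1]

def shift_rows(coeffs, rows, n):
    """Stack `rows` successive right shifts of `coeffs`, padded to length n."""
    return [[0] * r + coeffs + [0] * (n - len(coeffs) - r) for r in range(rows)]

n = 7
g = [1, 0, 1, 1]                          # g(x) = 1 + x^2 + x^3
x_n_minus_1 = [1] + [0] * (n - 1) + [1]   # x^7 - 1 = x^7 + 1 over GF(2)
h, remainder = poly_divmod(x_n_minus_1, g)

k = n - (len(g) - 1)                      # dimension of the code
G = shift_rows(g, k, n)                   # rows are shifts of g(x)
H = shift_rows(h[::-1], n - k, n)         # rows are shifts of the reciprocal of h(x)

# Every row of G should be orthogonal (mod 2) to every row of H.
orthogonal = all(sum(a * b for a, b in zip(gr, hr)) % 2 == 0
                 for gr in G for hr in H)
print(h, remainder, orthogonal)
```

The division leaves a zero remainder, which confirms that g(x) divides $x^7 - 1$, and the final check confirms that the two matrices describe the same code.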
Idempotent generators, though, can be found without factoring $x^n - 1$. An idempotent is a polynomial a(x) such that $a(x)^2 = a(x)$ in R. A generator e(x) of an ideal in R is called an idempotent generator if it is an idempotent. Every cyclic code has an idempotent generator. The matrix formed by e(x) and its next $k - 1$ cyclic shifts is a generator matrix for the code.

Idempotents are found from unions of cyclotomic cosets of n. The cyclotomic coset $C_i$ is $C_i = \{ i p^m \bmod n : m = 0, 1, 2, \dots \}$ (for binary codes, p is 2). Each idempotent f(x) is formed by choosing a union of cyclotomic cosets $C_i$, $0 \le i < n$, and letting its elements be the powers of x that occur in f(x) with nonzero coefficients. For example, we find the idempotents for n = 7. Letting p = 2, the cyclotomic cosets are $C_0 = \{0\}$, $C_1 = \{1, 2, 4\}$, and $C_3 = \{3, 6, 5\}$. So the idempotents for $x^7 - 1$ are $0$, $1$, $x + x^2 + x^4$, $x^3 + x^5 + x^6$, $1 + x + x^2 + x^4$, $1 + x^3 + x^5 + x^6$, $x + x^2 + x^3 + x^4 + x^5 + x^6$, and $1 + x + x^2 + x^3 + x^4 + x^5 + x^6$. All but 0 and 1 generate proper nonzero cyclic codes, since 1 generates the whole space and 0 generates only the zero vector.

Let k be the number of cyclotomic cosets for n, and let m be the smallest positive integer such that $2^m \equiv 1 \pmod n$. When n is prime, $k - 1$ of the cyclotomic cosets have m elements, and one, $C_0 = \{0\}$, has one element.

If e(x) is the idempotent generator of a code C, then the generator polynomial g(x) of C equals $\gcd(e(x), x^n - 1)$. In our example, $\gcd(x + x^2 + x^4, x^7 - 1) = 1 + x + x^3$, so the generator polynomial of the code with idempotent generator $x + x^2 + x^4$ is $1 + x + x^3$.
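The steps above (cyclotomic cosets, idempotents as unions of cosets, and the g.c.d. with $x^n - 1$) can be checked with a short Python sketch for n = 7. The representation is an assumption made for brevity: an idempotent is a set of exponents, and for the g.c.d. a binary polynomial is packed into an integer whose bit i is the coefficient of $x^i$. All function names are hypothetical.

```python
from itertools import combinations

def cyclotomic_cosets(n, p=2):
    """Partition {0, ..., n-1} into orbits of multiplication by p modulo n."""
    seen, cosets = set(), []
    for i in range(n):
        if i in seen:
            continue
        coset, j = [], i
        while j not in coset:
            coset.append(j)
            j = (j * p) % n
        cosets.append(coset)
        seen.update(coset)
    return cosets

def square_mod(exponents, n):
    """Square a binary polynomial modulo x^n - 1 by full multiplication."""
    result = set()
    for i in exponents:
        for j in exponents:
            result ^= {(i + j) % n}   # toggling a term is addition mod 2
    return result

def poly_mod(a, b):
    """Remainder of a divided by b, both GF(2) polynomials packed in ints."""
    while a and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def poly_gcd(a, b):
    """Euclidean algorithm for GF(2) polynomials packed in ints."""
    while b:
        a, b = b, poly_mod(a, b)
    return a

n = 7
cosets = cyclotomic_cosets(n)
idempotents = [set().union(*chosen)
               for r in range(len(cosets) + 1)
               for chosen in combinations(cosets, r)]
all_idempotent = all(square_mod(e, n) == e for e in idempotents)

e = {1, 2, 4}                                   # e(x) = x + x^2 + x^4
g = poly_gcd(sum(1 << i for i in e), (1 << n) | 1)
print(cosets, len(idempotents), all_idempotent, bin(g))
```

Here `bin(g)` is `0b1011`, that is, $1 + x + x^3$, in agreement with the g.c.d. computed in the text.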
Bose-Chaudhuri-Hocquenghem (BCH) codes are a special family of cyclic codes designed to correct a certain number of errors. They are defined in terms of the roots of their generator polynomials, in the following way. Let C be a cyclic code of length n over GF(q), where g.c.d.(n, q) = 1. Let a be a primitive nth root of unity in $GF(q^m)$, where m is the order of q mod n. C is a BCH code of designed distance d if the generator polynomial g(x) of C is the product of the distinct minimal polynomials of the $d - 1$ elements $a^b, a^{b+1}, \dots, a^{b+d-2}$. If b = 1, the $d - 1$ consecutive roots are $a, a^2, \dots, a^{d-1}$. C is called a primitive BCH code if $n = q^m - 1$. The minimum weight of a BCH code of designed distance d is at least d. Every vector in C is orthogonal to each row of the matrix H:

\[ H = \begin{pmatrix} 1 & a^{b} & a^{2b} & \cdots & a^{(n-1)b} \\ 1 & a^{b+1} & a^{2(b+1)} & \cdots & a^{(n-1)(b+1)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & a^{b+d-2} & a^{2(b+d-2)} & \cdots & a^{(n-1)(b+d-2)} \end{pmatrix} \]
We do not know the dimension k of our code C, though. The matrix H has $d - 1$ rows over $GF(q^m)$, and thus $(d - 1)m$ rows over GF(q). So $k \ge n - m(d - 1)$. This is not too helpful. It can be proved, though, that for any positive integers m and $t \le 2^{m-1} - 1$, there is a binary BCH code of length $n = 2^m - 1$ that is t-error correcting and has dimension $k \ge n - mt$. This is a much better bound on k, and is the reason why BCH codes are so practical.

Let us construct a binary double error-correcting BCH code of length 7, and call it C. So q = 2, n = 7, and t = 2, and m is the order of 2 mod 7, which is 3. Let a be a root of $x^3 + x + 1$, and thus a primitive element of GF(8). To find d, the designed distance, we solve $2 = \lfloor (d-1)/2 \rfloor$ and take d = 5. The generator polynomial g(x) of C must have roots $a$, $a^2$, $a^3$, and $a^4$. We know $x^3 + x + 1$ has root a, and thus also $a^2$ and $a^4$. The reciprocal polynomial of $x^3 + x + 1$, namely $x^3(x^{-3} + x^{-1} + 1) = x^3 + x^2 + 1$, then has roots $a^{-1} = a^6$, $a^{-2} = a^5$, and $a^{-4} = a^3$. So let $g(x) = (x^3 + x + 1)(x^3 + x^2 + 1)$, since this has roots $a$, $a^2$, $a^3$, and $a^4$ (among others). The parity check matrix of C, then, has rows $(1, a^i, a^{2i}, \dots, a^{6i})$ for $i = 1, 2, 3, 4$:

\[ H = \begin{pmatrix} 1 & a & a^2 & a^3 & a^4 & a^5 & a^6 \\ 1 & a^2 & a^4 & a^6 & a & a^3 & a^5 \\ 1 & a^3 & a^6 & a^2 & a^5 & a & a^4 \\ 1 & a^4 & a & a^5 & a^2 & a^6 & a^3 \end{pmatrix} \]
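Decoding BCH codes is an involved process. The following discussion is for binary BCH codes with roots $a, a^2, \dots, a^{2t}$. For a received vector v, compute the syndromes $S_i = v(a^i)$ for $i = 1, 2, \dots, 2t$. Next, determine the maximum number r such that the determinant of M is not 0, where

\[ M = \begin{pmatrix} S_1 & S_2 & \cdots & S_r \\ S_2 & S_3 & \cdots & S_{r+1} \\ \vdots & \vdots & & \vdots \\ S_r & S_{r+1} & \cdots & S_{2r-1} \end{pmatrix} \]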
Start with r = t. If $|M| = 0$, let $r = t - 1$, and so on until $|M| \ne 0$. Then r errors have occurred in v. Now find the coefficients of the error-locator polynomial,

\[ s(x) = 1 + s_1 x + s_2 x^2 + \dots + s_r x^r. \]
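This can be done by solving the equations $S_{j+r} + s_1 S_{j+r-1} + s_2 S_{j+r-2} + \dots + s_r S_j = 0$ for $j = 1, \dots, r$. This can also be done using matrices by solving

\[ \begin{pmatrix} s_r \\ s_{r-1} \\ \vdots \\ s_1 \end{pmatrix} = M^{-1} \begin{pmatrix} -S_{r+1} \\ -S_{r+2} \\ \vdots \\ -S_{2r} \end{pmatrix} \]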
where $M^{-1}$ is the inverse of the matrix M given above. We can solve these equations because we know $S_i$ for $i \le 2t$. Next, find the roots of s(x), perhaps by just searching through the field. Finally, find the error locations and correct the errors.

Using our previous double error-correcting binary code example, assume we receive the vector $v(x) = x^7 + x^6 + x^4 + x^2 + 1$. To find $S_1$, we divide v(x) by $x^3 + x + 1$ (since a is a root of $x^3 + x + 1$) to get a remainder of $x^2 + x + 1$, so $S_1 = v(a) = a^2 + a + 1$. Since $a^3 + a + 1 = 0$, we know $a + 1 = a^3$, and since $a^5 = a^2 + a^3$, we know $a^5 = a^2 + a + 1$. Thus the remainder evaluated at a equals $a^5$ (Table 1 lists every power of a reduced in this way when a is a root of $x^3 + x + 1$).

Table 1: GF(8), where a is a root of the irreducible polynomial $f(x) = x^3 + x + 1$

Binary   Power   Polynomial in a
000      0       0
001      1       1
010      a       a
100      a^2     a^2
011      a^3     a + 1
110      a^4     a^2 + a
111      a^5     a^2 + a + 1
101      a^6     a^2 + 1
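So $S_1 = a^5$. Then $S_2 = S_1^2 = a^3$ and $S_4 = S_2^2 = a^6$. To find $S_3$, divide v(x) by $x^3 + x^2 + 1$ (which has $a^3$ as a root) to get a remainder of $x^2 + 1$. Evaluating this remainder at $a^3$ and examining Table 1, we see that $S_3 = a^6 + 1 = a^2$. So, with r = 2,

\[ M = \begin{pmatrix} S_1 & S_2 \\ S_2 & S_3 \end{pmatrix} = \begin{pmatrix} a^5 & a^3 \\ a^3 & a^2 \end{pmatrix}, \qquad |M| = a^7 + a^6 = a^2 \ne 0, \]

and solving $S_3 + s_1 S_2 + s_2 S_1 = 0$ and $S_4 + s_1 S_3 + s_2 S_2 = 0$ gives $s_1 = a^5$ and $s_2 = a^6$.

So $s(x) = 1 + a^5 x + a^6 x^2$. To find roots of s(x) we search through GF(8). A root $a^i$ would place an error at position $7 - i$ (mod 7), numbering the positions starting at 0 from right to left instead of left to right as in the beginning. Here none of $1, a, \dots, a^6$ is a root, so s(x) does not have two roots in GF(8): more than two errors have occurred, and v cannot be corrected. This agrees with the code itself. Since $g(x) = 1 + x + x^2 + \dots + x^6$, the only codewords of C are the zero vector and the all-ones vector, and v(x), which reduces to $x^6 + x^4 + x^2$ because $x^7 = 1$ in R, differs from each of them in at least three positions.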
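The decoding steps described above (syndromes, the determinant test on M, solving for the error-locator coefficients, and a root search) can be sketched in Python for the length 7, t = 2 code. This is a minimal sketch under stated assumptions, not the author's implementation: GF(8) elements are 3-bit integers as in the binary column of Table 1, a received vector is given as the set of exponents of x with coefficient 1, the function names are hypothetical, and the received vector used here (the all-ones codeword with errors in positions 2 and 3) is a hypothetical illustration.

```python
# GF(8) generated by a root a of x^3 + x + 1; bit i is the coefficient of a^i.
EXP = [1]
for _ in range(6):
    nxt = EXP[-1] << 1
    if nxt & 0b1000:
        nxt ^= 0b1011                  # reduce using a^3 = a + 1
    EXP.append(nxt)
LOG = {value: power for power, value in enumerate(EXP)}

def mul(x, y):
    """Multiply two GF(8) elements."""
    if x == 0 or y == 0:
        return 0
    return EXP[(LOG[x] + LOG[y]) % 7]

def inv(x):
    """Multiplicative inverse of a nonzero GF(8) element."""
    return EXP[(-LOG[x]) % 7]

def syndromes(exponents, count=4):
    """S_i = v(a^i) for i = 1..count, where v(x) is the sum of x^e."""
    result = {}
    for i in range(1, count + 1):
        value = 0
        for e in exponents:
            value ^= EXP[(i * e) % 7]
        result[i] = value
    return result

def decode_two_errors(exponents):
    """Return sorted error positions, [] if none, or None on decoding failure."""
    S = syndromes(exponents)
    if not any(S.values()):
        return []
    det = mul(S[1], S[3]) ^ mul(S[2], S[2])          # |M| for r = 2
    if det:
        # Cramer's rule for S1*s2 + S2*s1 = S3 and S2*s2 + S3*s1 = S4.
        d_inv = inv(det)
        s2 = mul(mul(S[3], S[3]) ^ mul(S[2], S[4]), d_inv)
        s1 = mul(mul(S[1], S[4]) ^ mul(S[2], S[3]), d_inv)
        r = 2
    elif S[1]:
        s1, s2, r = S[1], 0, 1                       # r = 1: S2 + s1*S1 = 0
    else:
        return None
    # Search GF(8) for roots a^i of s(x) = 1 + s1*x + s2*x^2.
    roots = [i for i in range(7)
             if (1 ^ mul(s1, EXP[i]) ^ mul(s2, EXP[(2 * i) % 7])) == 0]
    if len(roots) != r:
        return None
    return sorted((7 - i) % 7 for i in roots)        # root a^i -> position 7 - i

received = {0, 1, 4, 5, 6}       # hypothetical: all-ones word, errors at 2 and 3
positions = decode_two_errors(received)
print(positions)
```

For this hypothetical vector the sketch prints `[2, 3]`. Returning `None` when s(x) has fewer than r roots is a design choice of the sketch that signals more than t errors.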
Another family of cyclic codes is the quadratic residue (QR) codes. QR codes are defined over a finite field and are easily constructed from their idempotent generators. For this discussion we will only consider binary QR codes. Let F be the field GF(p), where p is an odd prime, and let G be the multiplicative group of nonzero elements in F. Note that G has order $p - 1$. Let Q be the subgroup of G such that $q \in Q$ iff there exists a $g \in G$ such that $g^2 = q$. The elements of Q are called quadratic residues. Q has order $(p-1)/2$. It can be shown, through a difficult proof, that 2 is a quadratic residue iff $p \equiv 1 \pmod 8$ or $p \equiv -1 \pmod 8$. Also, $-1$ is a quadratic residue in GF(p) iff $p \equiv 1 \pmod 4$. For such a p, let H be the subgroup of Q such that $q \in H$ iff $q = 2^w$ for some w.

QR codes are sent onto themselves by the permutations $i \to ai \pmod p$, where $a \in Q$. For any a, if the permutation $m_a$ defined by $i \to ai \pmod n$ sends a cyclic code C with idempotent e(x) onto a code C', then C' is cyclic with idempotent $e(x)m_a$.

For the remainder of this discussion we will assume that $p \equiv \pm 1 \pmod 8$, so that 2 is a quadratic residue. The quadratic residues, then, are a union of cyclotomic cosets. Let m be the order of 2 mod p. Then each nonzero cyclotomic coset has m elements. There are k of these cyclotomic cosets, where $mk = p - 1$. So the quadratic residues are a union of k/2 cyclotomic cosets, and the nonresidues, N, make up the union of the other k/2 cyclotomic cosets. The only e(x) that can be fixed by the quadratic residue permutation are $1$, $x + x^2 + \dots + x^{p-1}$, and $h(x) = 1 + x + x^2 + \dots + x^{p-1}$,
because their corresponding codes are $R_p$, the code of all even weight vectors, and $\langle h \rangle$ (generated by the vector of all ones), respectively.

Let p be a prime such that $p \equiv -1 \pmod 8$. Let $e_1(x) = \sum_{i \in Q} x^i$ and $e_2(x) = \sum_{i \in N} x^i$. Let $Q_1$ be the QR code with idempotent $e_1(x)$ and $Q_2$ be the QR code with idempotent $e_2(x)$. Let $\bar{Q}_1$ be the QR code with idempotent $1 + e_2(x)$ and $\bar{Q}_2$ be the QR code with idempotent $1 + e_1(x)$. Then the four QR codes have the following properties:
(i) $Q_1$ and $Q_2$ are equivalent; $\bar{Q}_1$ and $\bar{Q}_2$ are equivalent.
(ii) $Q_1 \cap Q_2 = \langle h \rangle$, and $Q_1 + Q_2 = R_p$.
(iii) $\dim Q_1 = \dim Q_2 = (p + 1)/2$.
(iv) $Q_1 = \bar{Q}_1 + \langle h \rangle$; $Q_2 = \bar{Q}_2 + \langle h \rangle$.
(v) $\dim \bar{Q}_1 = \dim \bar{Q}_2 = (p - 1)/2$.
(vi) $\bar{Q}_1$ and $\bar{Q}_2$ are self-orthogonal and $Q_1^{\perp} = \bar{Q}_1$; $Q_2^{\perp} = \bar{Q}_2$.
The situation when $p \equiv 1 \pmod 8$ is similar. Rules (i)-(v) above apply if the definitions are changed so that $Q_1$ has the idempotent generator $1 + e_1(x)$, $Q_2$ has $1 + e_2(x)$, $\bar{Q}_1$ has $e_2(x)$, and $\bar{Q}_2$ has $e_1(x)$. Rule (vi) does not apply, and should be replaced with (vi') $Q_1^{\perp} = \bar{Q}_2$ and $Q_2^{\perp} = \bar{Q}_1$.
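The dimension rules (iii) and (v) can be checked numerically for p = 7 with a short Python sketch. It is only an illustration under stated assumptions: idempotents are sets of exponents, the dimension of the code with idempotent e(x) is computed as p minus the degree of g.c.d.(e(x), $x^p - 1$), binary polynomials are packed into integers (bit i is the coefficient of $x^i$), and the function names are hypothetical.

```python
def poly_mod(a, b):
    """Remainder of a divided by b, both GF(2) polynomials packed in ints."""
    while a and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def poly_gcd(a, b):
    """Euclidean algorithm for GF(2) polynomials packed in ints."""
    while b:
        a, b = b, poly_mod(a, b)
    return a

def quadratic_residues(p):
    """Nonzero squares modulo the odd prime p."""
    return sorted({(g * g) % p for g in range(1, p)})

def dimension(exponents, p):
    """Dimension of the cyclic code of length p generated by the idempotent."""
    e = sum(1 << i for i in exponents)
    g = poly_gcd(e, (1 << p) | 1)       # generator polynomial gcd(e, x^p - 1)
    return p - (g.bit_length() - 1)

p = 7
Q = quadratic_residues(p)
N = [i for i in range(1, p) if i not in Q]
idempotent_sets = {
    "e1": set(Q),
    "e2": set(N),
    "1 + e2": {0} | set(N),
    "1 + e1": {0} | set(Q),
}
dims = {name: dimension(e, p) for name, e in idempotent_sets.items()}
print(Q, N, dims)
```

For p = 7 the codes with idempotents $e_1$ and $e_2$ have dimension 4 = (p + 1)/2, and those with idempotents $1 + e_2$ and $1 + e_1$ have dimension 3 = (p - 1)/2.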
We can extend $Q_1$ and $Q_2$ by adding an overall parity check coordinate, labeled $\infty$. The coordinates are now labeled $\infty, 0, 1, \dots, p - 1$. I will call these new extended codes $\hat{Q}_1$ and $\hat{Q}_2$. $\hat{Q}_1$ and $\hat{Q}_2$ are equivalent [p + 1, (p + 1)/2] codes. If $p \equiv 1 \pmod 8$, $\hat{Q}_1^{\perp} = \hat{Q}_2$ and $\hat{Q}_2^{\perp} = \hat{Q}_1$. $\hat{Q}_1$ and $\hat{Q}_2$ contain only even weight vectors. Define $PSL_2(p)$ to be the group of permutations on $\infty, 0, 1, \dots, p - 1$, for a prime p, that is generated by the following permutations of elements i of GF(p):
s: $i \to i + 1 \pmod p$; $m_a$: $i \to ai \pmod p$ for $a \in Q$; r: $i \to -1/i \pmod p$, $i \ne 0$.

Note that s and $m_a$ map $\infty$ to $\infty$, and r maps 0 to $\infty$ and $\infty$ to 0. It can be shown that $PSL_2(p)$ has order $(p - 1)p(p + 1)/2$. $PSL_2(p)$ is contained in the group of $\hat{Q}_1$ or of $\hat{Q}_2$ (the group of a code C is the set of all permutations that send C onto itself).

The minimum weight, d, of a QR [p, (p + 1)/2] code is odd. If $p \equiv -1 \pmod 8$, the number $A_i$ of vectors of weight i in a QR code of length p is 0 unless $i \equiv 0$ or $3 \pmod 4$. If d is the minimum weight of a QR [p, (p + 1)/2] code of length p, then $d^2 \ge p$. If $p \equiv -1 \pmod 8$, then $d^2 - d + 1 \ge p$.

Now we will examine how to decode QR codes. Let y be the received vector of length p. Let I be an information set. Let $S = \{s_1, \dots, s_r\}$ be the smallest set of permutations in G(C), chosen from combinations of s, $m_a$, and r as defined above, such that any given set of t coordinate positions is disjoint from at least one image of I under a permutation in S. Let P be a set of parity check equations derived from idempotent generators of C. Compute $w_1$ by encoding the positions I using P. If $d(w_1, y) \le t$, decode y to $w_1$. Otherwise, compute $P_2 = P s_1$, giving a new set of parity check equations. Next compute $w_2$ by encoding the information positions in $I s_1$ with the parity checks in $P_2$. If $d(w_2, y) \le t$, decode y to $w_2$. Otherwise continue the above steps until either y is decoded or all the permutations in S have been used. It is possible that y will not be decoded.

Perhaps this decoding algorithm can best be explained through an example. We will use the [7,4,3] QR code with idempotent $e_1 = (0, 0, 0, 1, 0, 1, 1)$. We know $1 + e_2 = (1, 1, 1, 0, 1, 0, 0)$ and that its cyclic shifts generate the orthogonal code. We can assume
the first 4 positions {0, 1, 2, 3} are the information set. Each redundancy position is then determined by the cyclic shift of $1 + e_2$ that is 0 on the other two redundancy positions, namely (1, 1, 1, 0, 1, 0, 0), (0, 1, 1, 1, 0, 1, 0), and (1, 1, 0, 1, 0, 0, 1). So we can derive the parity check equations: $a_4 = a_0 + a_1 + a_2$, $a_5 = a_1 + a_2 + a_3$, $a_6 = a_0 + a_1 + a_3$.

Suppose that the sent vector is $x = e_1$ and that the received vector y has an error in position 1, so y = (0, 1, 0, 1, 0, 1, 1). We compute $w_1$ by taking the same information set from y and using the parity check equations to find the redundancy positions. So $w_1$ = (0, 1, 0, 1, 1, 0, 0) and $d(w_1, y) = 3 > t$. So we use the permutation $s^2$ to get the new information set {2, 3, 4, 5}. Applying $s^2$ to the old parity check equations we get $a_6 = a_2 + a_3 + a_4$, $a_0 = a_3 + a_4 + a_5$, $a_1 = a_2 + a_3 + a_5$.
So $w_2$ = (0, 0, 0, 1, 0, 1, 1) and $d(w_2, y) = 1 \le t$. So we decode y to $w_2$. Note that $w_2 = x = e_1$, so y was decoded properly.
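The worked example can be replayed with a short Python sketch of permutation decoding that uses only the cyclic shifts $s^0, s^1, \dots, s^6$ as the set S. Several things here are assumptions of the sketch rather than part of the paper: the function names are hypothetical, parity checks are read off from cyclic shifts of $1 + e_2$ (valid for this code because those seven shifts are all the nonzero words of the orthogonal code), and t = 1 because d = 3.

```python
def cyclic_shifts(word):
    """All cyclic shifts of a vector, shift s moving position j to j + s."""
    n = len(word)
    return [tuple(word[(j - s) % n] for j in range(n)) for s in range(n)]

def parity_checks(dual_word, info):
    """Map each redundancy position to the information positions it sums.

    A shift of dual_word with exactly one nonzero entry outside `info`
    expresses that entry as a sum of information positions.
    """
    n = len(dual_word)
    checks = {}
    for w in cyclic_shifts(dual_word):
        outside = [j for j in range(n) if w[j] and j not in info]
        if len(outside) == 1:
            checks[outside[0]] = [j for j in info if w[j]]
    return checks

def permutation_decode(y, dual_word, info, t):
    """Try the information sets I, Is, Is^2, ... until a word within t is found."""
    n = len(y)
    for s in range(n):
        shifted_info = [(i + s) % n for i in info]
        w = list(y)
        for position, support in parity_checks(dual_word, shifted_info).items():
            w[position] = sum(y[j] for j in support) % 2   # re-encode from y
        if sum(a != b for a, b in zip(w, y)) <= t:
            return tuple(w), s
    return None

one_plus_e2 = (1, 1, 1, 0, 1, 0, 0)
y = (0, 1, 0, 1, 0, 1, 1)
result = permutation_decode(y, one_plus_e2, [0, 1, 2, 3], t=1)
print(result)
```

The sketch returns the codeword (0, 0, 0, 1, 0, 1, 1) together with the shift 2, matching the use of $s^2$ in the example. It tries $s^1$ first, which fails because the shifted information set still contains the error position.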
One way coding theory has been applied recently is in the hat color problem. In this problem, n players sit in a circle and each is given either a red or a blue hat. Each player can see all the other players' hats, but not her own. No communication is allowed. The object of the game is for at least one player to guess her hat color correctly and for none to guess incorrectly (any player may refrain from guessing if she wishes). This may seem like complete chance, but E. R. Berlekamp has given a solution where the game can be won n/(n+1) of the time. The idea of the solution for small numbers is not that difficult. Let n = 3. There is a 75% probability that there will be 2 hats of one color and 1 of the other. If a player sees that the other 2 have different color hats, she passes. If she sees that the other 2 players have the same color hat, though, she guesses that her hat is the opposite color. This plan will work 75% (that is, n/(n+1) = 3/4) of the time.
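The n = 3 strategy can be verified by brute force over all 8 hat assignments. The Python sketch below assumes the hats are assigned uniformly at random and encodes the two colors as 0 and 1; the function name `guess` is hypothetical.

```python
from itertools import product

def guess(seen):
    """Strategy for n = 3: pass on mixed colors, else guess the opposite color."""
    first, second = seen
    if first != second:
        return None               # pass
    return 1 - first              # opposite of the color seen twice

wins = 0
for hats in product((0, 1), repeat=3):
    guesses = [guess(tuple(hats[j] for j in range(3) if j != i))
               for i in range(3)]
    made = [(g, hats[i]) for i, g in enumerate(guesses) if g is not None]
    # The team wins if someone guesses and every guess made is correct.
    if made and all(g == h for g, h in made):
        wins += 1
print(wins, "of 8")
```

The team wins in 6 of the 8 assignments, that is, 75% of the time. The two losing assignments are the ones in which all three hats have the same color, where every player guesses and every guess is wrong.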
The solution for bigger values of n requires coding theory. Berlekamp used BCH codes and other properties of coding theory. I plan to study this problem and its solution more next semester.

Error-correcting code theory is essential to our modern life. The rapid growth in the amount of information that needs to be transmitted makes it very important to continue our study of this subject. Codes that are more efficient to transmit, correct more errors, and are more efficient to decode are always needed. For example, a more efficient way to decode QR codes would be very helpful. With increasing demands for information transfer, in addition to new uses for the subject in other areas, the importance of research in error-correcting code theory will only increase as time goes on.