Coding Theory and Applications PDF
Enes Pasalic
University of Primorska
Koper, 2013
Contents
1 Preface
2 Problems
Preface
This is a collection of solved exercises and problems on linear codes for students who have a
working knowledge of coding theory. Its aim is to achieve a balance among the computational
skills, theory, and applications of cyclic codes, while keeping the level suitable for beginning
students. The contents are arranged to permit enough flexibility to allow the presentation of a
traditional introduction to the subject, or to allow a more applied course.
Enes Pasalic
[email protected]
Problems
In these exercises we consider some basic concepts of coding theory; that is, we introduce
redundancy into the information bits and examine the resulting improvement in the
probability of correct decoding.
1. (Error probability): Consider a code of length $n = 6$ defined by
\[ (a_1, a_2, a_3, a_2 + a_3, a_1 + a_3, a_1 + a_2), \]
where $a_i \in \{0, 1\}$. Here $a_1, a_2, a_3$ are information bits and the remaining bits are redundancy
(parity) bits. Compute the probability that the decoder makes an incorrect decision if the
bit error probability is $p = 0.001$. The decoder computes the following quantities from the received bits $b_1, \dots, b_6$:
\[ b_2 + b_3 + b_4 = s_1, \qquad b_1 + b_3 + b_5 = s_2, \qquad b_1 + b_2 + b_6 = s_3. \]
Therefore, based on the knowledge of $b$, the receiver computes linear combinations of the error bits. Given $s_1, s_2, s_3$, the decoder must choose the most likely
error pattern $e$ that satisfies the three equations. Notice that there is a unique
minimum-weight solution for $e$ for any $(s_1, s_2, s_3) \ne (1, 1, 1)$.
E.g. assume that $s = (0, 0, 1)$; then obviously $(e_1, \dots, e_6) = (0, 0, \dots, 1)$ satisfies
the above equations. Can some other $e$ of weight one give $s = (0, 0, 1)$?
Let us try $(e_1, \dots, e_6) = (1, 0, \dots, 0)$. Then $s_2 = 1$, etc. Therefore, an error pattern
of weight 1 is decoded correctly.
Now if (s1 , s2 , s3 ) = (1, 1, 1) the decoder must choose one of the possibilities,
(1, 0, 0, 1, 0, 0)
(0, 1, 0, 0, 1, 0)
(0, 0, 1, 0, 0, 1)
Remark: We will see later (when we introduce the standard array and syndrome
decoding) that we make a correct decision only if the coset leader corresponding to the
syndrome is the actual error pattern. Among all other error
patterns there is one with two errors that is decoded correctly. Do not forget that
we only consider the solutions of minimum weight for $e$.
What about the rest? First, one can check that there are only 4 essentially different
3-tuples $(s_1, s_2, s_3)$ (all of which lead to a similar computation). Nevertheless, one may examine all 7 nonzero syndromes and compute the probability
that at least one information symbol is incorrect.
As an example of the calculation, consider $(s_1, s_2, s_3) = (1, 1, 0)$. The decoder
equations give the following solution space for $e$:
(101011), (011101)
(110000), (010011)
(100101), (000110)
(111110), (001000)
The weight-one entry (001000) is the most likely pattern, and the receiver decides
that $e_3 = 1$ and $e_i = 0$ for $i \ne 3$. But what if more than one error occurred during the
transmission? Then
\[ P_b(\text{two correct information symbols}) = p^2(1-p)^4 + 2p^4(1-p)^2. \]
This corresponds to the vectors (000110) and (101011), (011101) respectively. The
first one occurs with probability $p^2(1-p)^4$. If the actual error was (000110) and
we used (001000) in the decoding, then the third information bit $a_3$ will be incorrect. The same applies to (101011) and (011101), where the correction agrees with
the actual error in the third position, so that the first and the second information bit is incorrect,
respectively.
Two information bits are incorrect if the actual error vector is (111110), (010011) or
(100101). This occurs with probability
\[ P_b(\text{one correct information symbol}) = p^5(1-p) + 2p^3(1-p)^3. \]
Finally, all the information symbols are incorrect if we correct
$r$ using $e = (001000)$ while the actual error vector was (110000). This occurs with
probability $P_b = p^2(1-p)^4$, since (110000) has weight two. One can proceed in this way by examining the other
possibilities for $(s_1, s_2, s_3)$ and calculating the corresponding probabilities. After some
tedious counting one arrives at something like
\[ P_S = \frac{1}{3}\left(22p^2(1-p)^4 + 36p^3(1-p)^3 + 24p^4(1-p)^2 + 12p^5(1-p) + 2p^6\right). \]
For $p = 0.001$ this gives $P_S \approx 0.000007$!
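The tedious counting can be checked by brute force. The following is a minimal sketch, not the author's method: it assumes the parity checks implied by the codeword definition (b2 + b3 + b4, b1 + b3 + b5, b1 + b2 + b6), decodes every one of the 64 error patterns to a minimum-weight coset leader, and counts the wrong information bits per error weight. The names `H`, `leader`, `coeff` and `info_bit_error_rate` are illustrative only.

```python
from itertools import product

# Parity checks implied by the codeword (a1, a2, a3, a2+a3, a1+a3, a1+a2).
H = [
    (0, 1, 1, 1, 0, 0),  # b2 + b3 + b4 = s1
    (1, 0, 1, 0, 1, 0),  # b1 + b3 + b5 = s2
    (1, 1, 0, 0, 0, 1),  # b1 + b2 + b6 = s3
]

def syndrome(e):
    return tuple(sum(h * x for h, x in zip(row, e)) % 2 for row in H)

patterns = list(product((0, 1), repeat=6))

# Coset leader: one minimum-weight error pattern for each syndrome.
leader = {}
for e in sorted(patterns, key=sum):
    leader.setdefault(syndrome(e), e)

# coeff[w] = total number of wrong information bits over all weight-w errors.
coeff = {w: 0 for w in range(7)}
for e in patterns:
    guess = leader[syndrome(e)]
    residual = [x ^ y for x, y in zip(e, guess)]
    coeff[sum(e)] += sum(residual[:3])

def info_bit_error_rate(p):
    # Average fraction of wrong information bits per decoded word.
    return sum(c * p**w * (1 - p) ** (6 - w) for w, c in coeff.items()) / 3

print(coeff, info_bit_error_rate(0.001))
```

The counts in `coeff` play the role of the coefficients 22, 36, 24, 12 and 2 in the expression for $P_S$; the tie between the three weight-two patterns with syndrome (1, 1, 1) does not affect them.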
3. (Gilbert-Elliott channel model) This exercise introduces a useful channel model for
burst errors, i.e. we assume that the errors occur consecutively and that the channel
remains in that state for some time. The channel later returns to a good
state with some probability. This is presented in the following figure.
[Figure: two-state channel with states GOOD and BAD; each state has a self-loop labelled 1-z.]
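The error burst length has a geometric probability mass function, so the average error
burst length is
\[
\begin{aligned}
\sum_{k \ge 1} k\,P(\text{error burst of length } k) &= \sum_{k \ge 1} k(1-z_2)^{k-1} z_2 \\
&= z_2\Big[\sum_{k \ge 1}(1-z_2)^{k-1} + \sum_{k \ge 2}(1-z_2)^{k-1} + \sum_{k \ge 3}(1-z_2)^{k-1} + \cdots\Big] \\
&= z_2\Big[\frac{1}{z_2} + \frac{1-z_2}{z_2} + \frac{(1-z_2)^2}{z_2} + \cdots\Big] \\
&= 1 + (1-z_2) + (1-z_2)^2 + \cdots = \frac{1}{z_2} = 10.
\end{aligned}
\]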
b) Find the average length of an error-free sequence of bits.
Similarly to the previous part, the average length of an error-free sequence of bits is
the average time spent in the good state, which means that the average error-free
sequence length is
\[ \sum_{k \ge 1} k\,P(\text{error-free sequence of length } k) = \sum_{k \ge 1} k(1-z_1)^{k-1} z_1 = \frac{1}{z_1} = 10^6. \]
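The closed form $1/z$ for the mean of a geometric run length can be checked numerically. This is a small sketch under the assumption $z_2 = 0.1$, which is the value implied by the answer $1/z_2 = 10$; the function name `mean_run_length` and the truncation length are illustrative.

```python
def mean_run_length(z, terms=5000):
    """Truncated sum of k * (1 - z)**(k - 1) * z over k = 1..terms."""
    return sum(k * (1 - z) ** (k - 1) * z for k in range(1, terms + 1))

# z2 = 0.1 is assumed here because the text states 1/z2 = 10.
burst = mean_run_length(0.1)
print(burst)
```

The same sum with $z_1 = 10^{-6}$ converges far more slowly, so for the error-free run length the closed form is the practical route.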
\[ H = \begin{pmatrix} 1 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{pmatrix} \]
(a) Find the generator matrix of $C$.
(b) The parity check matrix $H$ does not allow the presence of codewords of weight < 3
(apart from the all-zero codeword). Explain why.
(c) Suppose that the code is used for error detection only over a binary symmetric channel
with error rate $p = 10^{-3}$. Find the probability of undetected error.
Hint: W.l.o.g. assume that the all-zero codeword was transmitted. Be careful with the
interpretation of undetected error.
5. (Code design) Suppose we have a binary channel with bit error rate $10^{-5}$. Packets (codewords) transmitted over this channel consist of 1000 bits including check bits. We want
to specify codes that satisfy the reliability requirements of several scenarios. Assume that
the number of check bits needed to correct $t$ errors in the received packet is $10t$ for $t \le 10$.
a) Suppose that no coding is used. What is the probability of receiving a packet with
$k$ bit errors?
The number of bit errors has a binomial probability distribution. For this channel,
\[
\begin{aligned}
P(k \text{ errors in 1000 bits}) &= \binom{1000}{k}(10^{-5})^k (1-10^{-5})^{1000-k} \\
&= \frac{1000 \cdot 999 \cdots (1000-k+1)}{k!}\, 10^{-5k} (1-10^{-5})^{1000-k}.
\end{aligned}
\]
For small $k$, use $(1-10^{-5})^{1000-k} \approx 0.99 \approx 1$ and $1000 \cdot 999 \cdots (1000-k+1) \approx 1000^k$.
Therefore we can approximate the above probability by $(10^{-2})^k/k!$.
b) Suppose that the bit error rate after packet decoding is required to be less than $10^{-9}$. In
your code design, how many bit errors must be correctable in each packet in order to satisfy
this requirement?
The bit error rate can be expressed as the average number of incorrect bits per
packet after decoding, divided by the number of bits per packet. If a code can
correct up to $t$ errors, then when $t + 1$ errors occur the worst that can happen is
that the decoder flips $t$ bits to arrive at an incorrect codeword. Thus a conservative
assumption is that $t + 1$ errors in a packet result in $2t + 1$ bit errors after decoding,
and the contribution to the bit error rate is
\[ P(t + 1 \text{ errors in packet}) \cdot \frac{2t + 1}{1000}. \]
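The contribution above can be evaluated for successive values of $t$. The sketch below is only an illustration under the text's conservative assumption, and it keeps only the dominant term ($t + 1$ channel errors), ignoring packets with more errors; the names `prob_errors`, `residual_ber` and `smallest_t` are hypothetical.

```python
from math import comb

N, P_BIT, TARGET = 1000, 1e-5, 1e-9

def prob_errors(k):
    """Exact binomial probability of k bit errors in a packet of N bits."""
    return comb(N, k) * P_BIT**k * (1 - P_BIT) ** (N - k)

def residual_ber(t):
    """Dominant term only: t + 1 channel errors leave 2t + 1 wrong bits."""
    return prob_errors(t + 1) * (2 * t + 1) / N

# Smallest t in the stated range t <= 10 that meets the target.
smallest_t = next(t for t in range(0, 11) if residual_ber(t) < TARGET)
print(smallest_t, residual_ber(smallest_t))
```

Under these assumptions the dominant term first falls below $10^{-9}$ at $t = 2$; a full treatment would also add the (much smaller) terms for more than $t + 1$ errors.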
6. (Error probability): Let $C$ be a ternary repetition code of length 4 over the alphabet $\{0, 1, 2\}$.
(a) List the vectors which will be uniquely decoded as 1111 using nearest neighbour decoding.
(b) If the probability of each symbol being wrongly received is $t$ and each wrong symbol is equally
likely, find the word error probability; that is, the probability $P_e$ of a word being
incorrectly decoded.
(c) What is $P_e$ when $t = 0.05$?
Solution: The code is $C = \{0000, 1111, 2222\}$. Hence $C$ corrects $\lfloor (4-1)/2 \rfloor = 1$
error. However, some words at distance 2 from a codeword can also be corrected.
(a) The received words decoded as 1111 are as follows:
1111;
0111, 2111, 1011, 1211, 1101, 1121, 1110, 1112;
0211, 2011, 0121, 2101, 0112, 2110, 1021, 1201, 1012, 1210, 1102, 1120.
(b) First, note that
\[ P(1 \text{ being received}) = 1 - t \]
and
\[ P(0 \text{ or } 2 \text{ being received}) = t, \]
thus
\[ P(0 \text{ being received}) = P(2 \text{ being received}) = t/2. \]
Hence, the probability of correct decoding of the word 1111 is
\[
\begin{aligned}
P_c &= (1-t)^4 + 8(t/2)(1-t)^3 + 12(t/2)^2(1-t)^2 \\
&= (1-t)^2\{(1-t)^2 + 4t(1-t) + 3t^2\} \\
&= (1-t)^2(1+2t).
\end{aligned}
\]
Hence the probability of a word being incorrectly decoded is
\[ P_e = 1 - (1-t)^2(1+2t) = t^2(3-2t). \]
(c) When $t = 0.05$, $P_e \approx 3t^2 = 0.0075$.
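The list in (a) and the closed form in (b) can be cross-checked by enumerating all 81 ternary words of length 4. This is a minimal sketch; the helper names `dist`, `decodes_uniquely_to` and `p_correct` are illustrative, and the channel model is the one stated in the solution (a symbol survives with probability $1 - t$ and turns into each other symbol with probability $t/2$).

```python
from itertools import product

CODE = ["0000", "1111", "2222"]

def dist(u, v):
    return sum(a != b for a, b in zip(u, v))

def decodes_uniquely_to(word, target):
    d = {c: dist(word, c) for c in CODE}
    return all(d[target] < d[c] for c in CODE if c != target)

decoded = ["".join(w) for w in product("012", repeat=4)
           if decodes_uniquely_to("".join(w), "1111")]

def p_correct(t):
    # Probability that 1111 is received as one of the words in `decoded`.
    return sum((1 - t) ** w.count("1") * (t / 2) ** (4 - w.count("1"))
               for w in decoded)

t = 0.05
p_error = 1 - p_correct(t)
print(len(decoded), p_error)
```

The enumeration gives the 1 + 8 + 12 words listed in (a), and `p_error` agrees with $t^2(3 - 2t)$, of which $3t^2$ is the leading term.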
7. (Error probability II): Let C be the binary repetition code of length 5. List the vectors which
will be uniquely decoded as 11111 using nearest neighbour decoding. Do parts (b) and (c)
of the previous question for this code.
Here $C = \{00000, 11111\}$. So the words decoded as 11111 and their probabilities
are:
11111: $(1-t)^5$;
01111 (5 like this): $5t(1-t)^4$;
00111 (10 like this): $10t^2(1-t)^3$.
Therefore,
\[
\begin{aligned}
P_c &= (1-t)^5 + 5t(1-t)^4 + 10t^2(1-t)^3 \\
&= (1-t)^3\{(1-t)^2 + 5t(1-t) + 10t^2\} \\
&= (1-t)^3(1 + 3t + 6t^2).
\end{aligned}
\]
Solution: For raw bit error rate $p = 10^{-4}$, the probability of two errors in a
codeword is given by the binomial distribution:
\[ P(2 \text{ errors}) = \binom{15}{2}(10^{-4})^2(1-10^{-4})^{13} = 1.05 \times 10^{-6}. \]
One can compute the probability of three or more errors, which is $4.6 \times 10^{-10}$
and hence negligible.
When two errors occur in a received codeword of length 15, the decoder miscorrects
by changing one more bit to arrive at a codeword that differs from the transmitted
codeword in three bits. The probability that any particular bit within a codeword
(information bit or check bit) is incorrect is the average number of wrong bits per
codeword divided by the number of bits per codeword. Therefore the cooked bit
error rate is
\[ \frac{3}{15}\, P(2 \text{ errors}) \approx 2.1 \times 10^{-7}. \]
For raw error rate $p < 2 \times 10^{-3}$, an accurate approximation to the cooked error
rate is
\[ \frac{3}{15}\binom{15}{2} p^2 = 21p^2. \]
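The numbers in this solution follow directly from the binomial distribution and can be reproduced as follows. This is an illustrative sketch only; the variable names are not from the original.

```python
from math import comb

n, p = 15, 1e-4

def prob_errors(k):
    """Probability of exactly k bit errors in a codeword of length n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_two = prob_errors(2)
# Summing the tail directly avoids cancellation in 1 - P(0) - P(1) - P(2).
p_three_or_more = sum(prob_errors(k) for k in range(3, n + 1))
cooked_ber = 3 / n * p_two          # three wrong bits out of 15 after miscorrection
approx = 21 * p**2                  # small-p approximation from the text
print(p_two, p_three_or_more, cooked_ber, approx)
```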
10. (Linear independence): Is it true that if $x$, $y$, and $z$ are linearly independent vectors over
$GF(q)$, then so are $x + y$, $y + z$, and $z + x$?
Solution: The vectors are not linearly independent over fields of characteristic 2,
since over $GF(2^m)$
\[ (x + y) + (y + z) + (z + x) = (x + x) + (y + y) + (z + z) = 0 + 0 + 0 = 0. \]
On the other hand, when $q$ is odd, $x$, $y$, and $z$ can be expressed as linear combinations of the three vectors, which therefore span a subspace of dimension 3. For
example,
\[ x = 2^{-1}\big((x + y) + (z + x) - (y + z)\big). \]
11. (Subspaces): Given that $S$ and $T$ are distinct two-dimensional subspaces of a three-dimensional vector space, show that the intersection of $S$ and $T$ is a one-dimensional subspace.
Solution: There are two reasonable approaches. Both start from the fact that
the intersection of $S$ and $T$ is a linear subspace whose dimension is at most 2. We
must show that the dimension is not 0 or 2, and is therefore exactly 1.
Proof 1 (Direct): $S \cap T$ cannot have dimension 2, because in this case any basis
for $S \cap T$ would also span $S$ and $T$, contradicting the hypothesis that $S$ and $T$ are
distinct.
$S \cap T$ cannot have dimension 0, because in this case the set of four vectors consisting of a basis for $S$ and a basis for $T$ would be linearly independent, which
contradicts the hypothesis that $S$ and $T$ are subspaces of a 3-dimensional vector
space.
Proof 2 (Using orthogonal complements): The orthogonal complements of $S$ and
$T$ have dimension $3 - 2 = 1$. The orthogonal complements are distinct because $S$
and $T$ are distinct. The linear subspace generated by the orthogonal complements
has dimension exactly 2; the dimension is greater than 1 because the orthogonal
complements are distinct and is at most 2 because their dimensions are each 1.
But the linear combinations of the orthogonal complements of $S$ and $T$ form the
orthogonal complement of $S \cap T$. So the dimension of $S \cap T$ is $3 - 2 = 1$.
12. (Standard form): Let C be a binary (5, 3) code
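\[ G = \begin{pmatrix} 1 & 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 \end{pmatrix}. \]
Row operations bring $G$ into standard form:
\[
G \sim
\begin{pmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 \end{pmatrix} \sim
\begin{pmatrix} 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 \end{pmatrix} \sim
\begin{pmatrix} 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 \end{pmatrix} \sim
\begin{pmatrix} 1 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 \end{pmatrix} = [I_k \mid A].
\]
The corresponding parity-check matrix is
\[ H = [A^T \mid I_{n-k}] = \begin{pmatrix} 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 1 \end{pmatrix}. \]
(c) Write out the elements of the dual code $C^\perp$. The elements of $C^\perp$ are linear
combinations of the rows of $H$:
\[ C^\perp = \{00000, 10010, 11101, 01111\}. \]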
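The dual code can be listed mechanically as the span of the rows of the parity-check matrix. The sketch below assumes the generator rows 10110, 11010, 01001 and the parity-check rows 10010, 11101 of this problem; `dot` and `span` are illustrative helper names.

```python
from itertools import product

G = ["10110", "11010", "01001"]   # generator rows of the (5, 3) code
H = ["10010", "11101"]            # parity-check rows

def dot(u, v):
    """Inner product over GF(2) of two bit strings."""
    return sum(int(a) & int(b) for a, b in zip(u, v)) % 2

def span(rows):
    """All GF(2) linear combinations of the given rows."""
    words = set()
    for coeffs in product((0, 1), repeat=len(rows)):
        word = [0] * len(rows[0])
        for c, row in zip(coeffs, rows):
            if c:
                word = [w ^ int(b) for w, b in zip(word, row)]
        words.add("".join(map(str, word)))
    return words

orthogonal = all(dot(g, h) == 0 for g in G for h in H)
dual = span(H)
print(orthogonal, sorted(dual))
```

Checking that every row of $G$ is orthogonal to every row of $H$ confirms that the span of $H$ really is the dual code.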
13. Modified block codes. Some of the following operations on rows or columns of the generator
matrix $G$ or the parity-check matrix $H$ may decrease the minimum Hamming weight of a
linear block code. Which of the operations below can cause a reduction in the minimum
weight?
(a) Exchanging two rows of G.
(b) Exchanging two rows of H.
(c) Exchanging two columns of G.
(d) Exchanging two columns of H.
15. (Counting codewords): Prove that it is not possible to find 32 binary words, each of length
8 bits, such that each word differs from every other word in at least 3 places.
Solution: The volume of the space of 8-bit binary words is the number of points,
$2^8 = 256$. If any two codewords are at distance at least 3, then the spheres of radius 1.5
about any two codewords must be disjoint. The volume of a sphere is the number
of points in the sphere. Within distance 1.5 of any 8-bit word are 9 words: the
word itself and those words that differ from it in exactly one bit. Since $256/9 =
28.444$, there can be at most 28 disjoint spheres of radius 1.5 (the same as spheres of
radius 1), and so there can be at most 28 codewords that are pairwise at distance at least 3
from each other.
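The sphere size used in this argument can be confirmed by direct enumeration. A minimal sketch; the function name `ball` is illustrative.

```python
from itertools import product

def ball(center, radius):
    """All binary words within Hamming distance `radius` of `center`."""
    return [w for w in product((0, 1), repeat=len(center))
            if sum(a != b for a, b in zip(w, center)) <= radius]

size = len(ball((0,) * 8, 1))      # the word itself plus 8 single-bit changes
max_codewords = 2**8 // size       # disjoint spheres cannot exceed this count
print(size, max_codewords)
```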
16. (Codeword weight ): The purpose of this exercise is to demonstrate how we can get the
information about the codewords only based on the parity check matrix (without finding
G).
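Let $C$ be the binary linear block code whose parity-check matrix $H$ has all
columns of weight 3:
\[
H = \begin{pmatrix}
1 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}
\]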
(a) Find a codeword of minimum weight.
Solution: Note that any codeword $c$ in the code satisfies $Hc^T = 0$. The
rightmost six columns of $H$ are the complements of the leftmost six columns, taken in reverse order.
So the sum of the first and last columns is the all-ones vector, as is the sum
of the second and next-to-last columns. The sum of the first two and last two
columns is zero, which corresponds to the following weight 4 codeword:
(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1).
Another weight 4 codeword, which corresponds to a linear dependence among the
first six columns of $H$, is
(0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0).
It is easy to see that there cannot be codewords of smaller weight (apart from $c = 0$): the columns of $H$ are nonzero and distinct, which excludes weights 1 and 2, and the sum of three columns of odd weight has odd weight and is therefore nonzero, which excludes weight 3.
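Since the code has length 12, the claim about the minimum weight can also be verified exhaustively from the parity-check matrix alone. The following sketch assumes the six rows of $H$ as given in this problem; `in_code` is an illustrative name.

```python
from itertools import product

H = [
    "111011001000",
    "110110100100",
    "101101010010",
    "011100110001",
    "000011001111",
    "000000111111",
]

def in_code(c):
    """True if H c^T = 0 over GF(2)."""
    return all(sum(int(h) & x for h, x in zip(row, c)) % 2 == 0 for row in H)

codewords = [c for c in product((0, 1), repeat=12) if in_code(c)]
min_weight = min(sum(c) for c in codewords if any(c))
print(len(codewords), min_weight)
```

Only $2^{12} = 4096$ vectors need to be tested, so no generator matrix is required for a code of this size.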
18. (Systematic codes): A code $C$ is called systematic on $k$ positions (and the symbols on these
positions are called information symbols) if $|C| = q^k$ and there is exactly one codeword for
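\[
H = \begin{pmatrix}
0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 2 & 0 & 1 & 0 \\
0 & 1 & 2 & 2 & 0 & 0 \\
1 & 0 & 1 & 2 & 0 & 0
\end{pmatrix},
\qquad
H' = \begin{pmatrix}
0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 & 2 & 0 \\
0 & 1 & 0 & 0 & 2 & 2 \\
1 & 0 & 0 & 0 & 1 & 2
\end{pmatrix}
\]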
Then it is clear that no three or fewer columns of $H$ are linearly dependent.
The code is a (6,2,4) code over GF(3), and we essentially have no improvement over the binary
alphabet; that is, there is a (6,2,4) code over GF(2).
Can we find a (6,3,4) code over GF(3)? No: try to remove one row of $H$ and to
preserve the independence of 4 vectors for any entries of $H$. This is not possible.
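The parameters (6,2,4) can be confirmed by listing the null space over GF(3). This sketch uses the second matrix of this problem (the one with rows 000101, 001020, 010022, 100012) and is only an illustration; the variable names are not from the original.

```python
from itertools import product

H = [
    (0, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 2, 0),
    (0, 1, 0, 0, 2, 2),
    (1, 0, 0, 0, 1, 2),
]

# Codewords are the ternary vectors c with H c^T = 0 (mod 3).
code = [c for c in product(range(3), repeat=6)
        if all(sum(h * x for h, x in zip(row, c)) % 3 == 0 for row in H)]
min_distance = min(sum(x != 0 for x in c) for c in code if any(c))
print(len(code), min_distance)
```

For a linear code the minimum distance equals the minimum nonzero weight, which is why only weights are inspected here.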
20. (Generator matrices and error correction): Let
matrix,
\[ H = \begin{pmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 & 1 \end{pmatrix} \]
(b) Use syndrome decoding to decode the following received vectors: (i) 110000;
(ii) 000011.
23. Let $C$ and $D$ be linear codes over $F_q$ of the same length. Define
\[ C + D = \{c + d \mid c \in C,\ d \in D\}. \]
Show that $C + D$ is a linear code and that $(C + D)^\perp = C^\perp \cap D^\perp$.
First we need to prove that $C + D$ is a linear code. The proof is elementary. Since
$0 \in C, D$ then clearly $0 \in C + D$. Furthermore, if $c_1 + d_1$ and $c_2 + d_2$ are in $C + D$
we have to show that
\[ (c_1 + d_1) + (c_2 + d_2) \in C + D. \]
But
\[ (c_1 + d_1) + (c_2 + d_2) = (c_1 + c_2) + (d_1 + d_2) = c + d \]
for some $c \in C$ and $d \in D$, as $C$ and $D$ are linear.
We now prove that
\[ (C + D)^\perp = C^\perp \cap D^\perp. \]
Claim $(C + D)^\perp \subseteq C^\perp \cap D^\perp$: Let $x \in (C + D)^\perp$. Then for all $v \in C + D$ it
holds that $x \cdot v = 0$.
We show that $x \in C^\perp$ and $x \in D^\perp$. Let $c \in C$. Now,
\[ c = c + 0 \in C + D \]
and therefore $x \cdot c = 0$, and we conclude that $x \in C^\perp$. Similarly, for all $d \in D$,
\[ d = 0 + d \in C + D \]
and therefore $x \cdot d = 0$ and $x \in D^\perp$.
Claim $C^\perp \cap D^\perp \subseteq (C + D)^\perp$: For all $v \in C^\perp \cap D^\perp$ we have $v \cdot c = 0$
and $v \cdot d = 0$ for all $c \in C$ and $d \in D$. Therefore $v \cdot (c + d)$ equals zero as well,
and $v \in (C + D)^\perp$.
24. (Counting the codes): Determine the number of binary linear codes with parameters $(n, n-1, 2)$.
The number of binary linear codes with parameters $(n, n-1, 2)$ is 1!
To prove this we begin by showing that there exists at most one binary linear
code with the given parameters and conclude by showing the existence of such a
code.
Assume that there exist two binary linear codes $C_1, C_2$ with parameters $(n, n-1, 2)$. Note that for every code with these parameters, if we delete the last coordinate, we obtain all strings of length $n-1$ (since all codewords differ in at least 2
coordinates, deleting the last coordinate results in a set of different strings of the
same size as the original code).
Now, if there exist two different codes $C_1, C_2$ with parameters $(n, n-1, 2)$, then
there exists some $v \in \{0, 1\}^{n-1}$ such that (w.l.o.g.) $v\|0 \in C_1$ and $v\|1 \in C_2$. If
$v = 0^{n-1}$, then $C_2$ is not a linear code since it does not contain the all-zero vector
(recall that all codewords must differ in at least 2 coordinates). Let $i$ be the first
index in $v$ such that $v_i = 1$ (where $v_i$ denotes the $i$-th bit of $v$). Let $v' \in \{0, 1\}^{n-1}$
be the binary string that is identical to $v$ in all coordinates except for the $i$-th
coordinate, where $v'_i = 0$. It must hold that $v'\|1 \in C_1$ and $v'\|0 \in C_2$ (why?). Let $i'$ be
the first index in $v'$ such that $v'_{i'} = 1$. Let $v'' \in \{0, 1\}^{n-1}$ be the binary string that
is identical to $v'$ in all coordinates except for the $i'$-th coordinate, which is equal
to 0. It must hold that $v''\|0 \in C_1$ and $v''\|1 \in C_2$.
We continue with this procedure until we obtain the all-zero string of length $n-1$.
It must hold that either $C_1$ contains $0^{n-1}\|1$ or $C_2$ contains $0^{n-1}\|1$, and therefore
either $C_1$ is not a linear code or $C_2$ is not a linear code.
We now show that the set of binary strings with even weight is a linear $(n, n-1, 2)$ code.
It is easy to verify that this set constitutes a linear code with minimum distance 2.
Finally we prove that exactly $2^{n-1}$ strings from the set $F_2^n$ have an even weight.
This is true because the set of strings in $F_2^n$ with even weight can be obtained by
adding a parity bit to each string of length $n-1$, and there are exactly $2^{n-1}$ such
strings.
25. (Parity of codewords): Show that in a binary linear code, either all codewords have even
Hamming weight or exactly half of the codewords have even Hamming weight.
Observation 1: Let $x$ and $y$ be two vectors in $F_2^n$. The Hamming weight of $x + y$
is even if and only if the Hamming weights of both vectors are even or the Hamming
weights of both vectors are odd.
Let $C$ be a binary linear code and let $v_1, \dots, v_k$ be a basis for $C$ (that is, every codeword in $C$ is a linear combination of $v_1, \dots, v_k$). Now, if the Hamming
weight of all $v_1, \dots, v_k$ is even, then by the observation, all codewords in $C$ have
even Hamming weight. Assume there exist $t \ge 1$ vectors in the basis with odd
Hamming weight. W.l.o.g., we assume that $v_1, \dots, v_t$ are the vectors with odd
Hamming weight and $v_{t+1}, \dots, v_k$ are the vectors with even Hamming weight. Every codeword in $C$ is a linear combination of $v_1, \dots, v_k$ and therefore can be viewed
as a binary string of length $k$ (where we have 1 in the $i$-th coordinate if and only
if $v_i$ appears in the linear combination). Now, each codeword has even Hamming
weight if and only if the number of vectors from $v_1, \dots, v_t$ in the combination
is even (why?). That is, a codeword $c$ has even Hamming weight if and only if
the number of 1s in coordinates 1 to $t$ in the binary string that represents the
combination is even. Therefore, in order to find the number of codewords with
even Hamming weight, we count the number of binary strings of length $k$ where
the number of 1s in coordinates 1 to $t$ is even. This equals the number of binary
strings of length $t$ with an even number of 1s times the number of binary strings of length
$k - t$ (recall that we may have every string in the coordinates $t + 1$ to $k$). This
equals
\[ \frac{2^t}{2} \cdot 2^{k-t} = 2^{k-t+t-1} = 2^{k-1}. \]
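The dichotomy can be observed on small examples. The sketch below counts even-weight codewords for two bases chosen purely for illustration: one containing odd-weight vectors (the rows 10110, 11010, 01001) and one consisting of even-weight vectors only; `even_weight_count` is a hypothetical helper.

```python
from itertools import product

def even_weight_count(basis):
    """Number of even-weight words in the GF(2) span of `basis`."""
    n = len(basis[0])
    count = 0
    for coeffs in product((0, 1), repeat=len(basis)):
        word = [0] * n
        for c, row in zip(coeffs, basis):
            if c:
                word = [w ^ int(b) for w, b in zip(word, row)]
        count += sum(word) % 2 == 0
    return count

mixed = even_weight_count(["10110", "11010", "01001"])   # k = 3, two odd-weight rows
all_even = even_weight_count(["1100", "0011"])           # k = 2, all rows even
print(mixed, all_even)
```

In the first case exactly half of the $2^3$ codewords have even weight; in the second case all $2^2$ do.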
26. (MacWilliams identity): Given is a generator matrix of a (4,2) code $C$ with $d = 2$,
\[ G = \begin{pmatrix} 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \end{pmatrix}. \]
Find the weight distribution of the dual code $C^\perp$.
The weight distribution of $C$ is obviously given by $(1, 0, 1, 2, 0)$. We need to compute
\[ W_C(x + y, x - y) = \sum_{i=0}^{4} A_i (x + y)^{n-i}(x - y)^i. \]
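Expanding $W_C(x + y, x - y)$ and dividing by $|C|$ gives the dual weight distribution; coefficient by coefficient this is a sum over Krawtchouk-type terms. The sketch below implements that expansion directly. It is one possible way to organise the computation, not necessarily the one intended in the original solution, and `macwilliams_dual` is an illustrative name.

```python
from math import comb

def macwilliams_dual(A):
    """Weight distribution of the dual of a binary linear code with distribution A."""
    n = len(A) - 1
    size = sum(A)                       # number of codewords |C|
    B = []
    for j in range(n + 1):
        total = 0
        for i, a in enumerate(A):
            # Coefficient of x^(n-j) y^j in (x + y)^(n-i) (x - y)^i.
            k = sum((-1) ** s * comb(i, s) * comb(n - i, j - s)
                    for s in range(max(0, j - (n - i)), min(i, j) + 1))
            total += a * k
        B.append(total // size)
    return B

dual_distribution = macwilliams_dual([1, 0, 1, 2, 0])
print(dual_distribution)
```

For this particular code the dual turns out to have the same weight distribution as the code itself.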
29. (Combinatorial bounds I): Prove that $A_q(n, d) \le q A_q(n-1, d)$, where $A_q(n, d)$ is the maximal
number of codewords of length $n$ with minimum distance $d$ over an alphabet of $q$ symbols.
Let $C$ be a code with $M = A_q(n, d)$ codewords. We divide the codewords in $C$ into disjoint
sets by the value of the first location in the codeword (that is, all codewords in
the same subset have the same value in their first location).
Now, note that the number of codewords in each subset is at most
\[ A_q(n-1, d). \]
(If we remove the first location of all codewords in a subset, we obtain a code
with parameters $n-1$ and $d$.) Now, there are at most $q$ subsets and therefore
\[ M = A_q(n, d) \le q A_q(n-1, d). \]
30. (Combinatorial bounds II): Show that the minimum distance of a perfect code must be odd.
Assume $C$ is a perfect code with even minimum distance $d$. Recall that a perfect code is defined as an $e$-error-correcting $[n, M]$ code over an alphabet $A$ for which every $n$-tuple
over $A$ is in the sphere of radius $e$ about some codeword.
We first prove that there exists a word $x \in F_q^n$ such that
\[ d(x, c) \ge d/2 \quad \forall c \in C. \]
We conclude that $d(x, c) \ge d/2$ for all $c \in C$. However, this implies that the
spheres of radius $\lfloor (d-1)/2 \rfloor$
around the codewords of $C$ do not contain $x$, and therefore
\[ M\, V_q\!\left(n, \left\lfloor \frac{d-1}{2} \right\rfloor\right) < q^n. \]
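\[
c_1^T c_2 =
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}
(0\,1\,0\,0\,1\,1\,0) =
\begin{pmatrix}
0 & 1 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix} \in V
\]
Note that if $c \in V$, $c \ne 0$, then there must be a nonzero row, and its weight is 3 since it is a
codeword of a (7,4,3) code. But once there is a single one in some row, the number of ones
in the corresponding column must also be 3 (it is the weight of $c_1$).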
We want to devise an ad hoc error-correction method that can correct up to four bit errors
for at least some nice distributions of errors in a $7 \times 7$ array of received bits. Consider the
various numbers of rows that can be affected by up to four errors.
The following ad hoc error-correction procedure is one method for correcting up
to four bit errors in a $7 \times 7$ codeword array. Let $R$ be the set of rows in which
errors are detected and let $C$ be the set of columns in which errors are detected
(simply check whether the received rows/columns are codewords of the Hamming
(7,4,3) code). Apply the following correction procedure:
(a) Case 1: $|R| \ge |C|$. Run through the rows in $R$. For each row $r \in R$, correct $r$
using the nearest-neighbor error correction for the (7,4) Hamming code.
Then apply error detection to each column of the received array. If any
columns not in $C$ show errors, undo the correction performed on $r$.
Example: the codeword array (left) and the received array (right).
0 1 0 0 1 1 0      1 1 0 0 1 1 0
0 0 0 0 0 0 0      1 0 0 0 0 0 0
0 0 0 0 0 0 0      1 0 0 0 0 0 0
0 1 0 0 1 1 0      0 1 0 0 1 1 0
0 1 0 0 1 1 0      0 1 0 0 1 1 0
0 0 0 0 0 0 0      0 0 0 0 0 0 0
0 0 0 0 0 0 0      0 1 0 0 0 0 0
In this case $|R| = 4$ and $|C| = 2$. The errors are detected, and then any single
error in each row is corrected using Hamming code correction, so that the received array (left) becomes the corrected array (right):
1 1 0 0 1 1 0      0 1 0 0 1 1 0
1 0 0 0 0 0 0      0 0 0 0 0 0 0
1 0 0 0 0 0 0      0 0 0 0 0 0 0
0 1 0 0 1 1 0      0 1 0 0 1 1 0
0 1 0 0 1 1 0      0 1 0 0 1 1 0
0 0 0 0 0 0 0      0 0 0 0 0 0 0
0 1 0 0 0 0 0      0 0 0 0 0 0 0
Case 2: $|R| < |C|$. Proceed as above, with the roles of rows and columns
interchanged.
(b) Update $R$ and $C$. If both are now empty, stop. (In the above example we are
done: there are no more errors either in rows or in columns.)
(c) It might happen, however, that some rows or columns are miscorrected, so that we are
not done after the first step. We then need to undo the correction of such a row.
Example: the received array (left) and the array after row correction (right).
1 1 0 0 1 1 0      0 1 0 0 1 1 0
1 0 0 0 0 0 0      0 0 0 0 0 0 0
1 1 0 0 0 0 0      1 1 0 1 0 0 0
0 1 0 0 1 1 0      0 1 0 0 1 1 0
0 1 0 0 1 1 0      0 1 0 0 1 1 0
0 0 0 0 0 0 0      0 0 0 0 0 0 0
0 0 0 0 0 0 0      0 0 0 0 0 0 0
The new error introduced by the error correction (marked in green in the original figure) is the entry in row 3, column 4. Since
this column was not originally detected to have errors, we must undo the error
correction of this row. Now we proceed with error correction of the columns (array before, left; array after, right):
0 1 0 0 1 1 0      0 1 0 0 1 1 0
0 0 0 0 0 0 0      0 0 0 0 0 0 0
1 1 0 0 0 0 0      0 0 0 0 0 0 0
0 1 0 0 1 1 0      0 1 0 0 1 1 0
0 1 0 0 1 1 0      0 1 0 0 1 1 0
0 0 0 0 0 0 0      0 0 0 0 0 0 0
0 0 0 0 0 0 0      0 0 0 0 0 0 0
There are a few more complicated patterns that might be discussed but we
stop here.
33. Show that a perfect binary $[n, M, 7]$ code has $n = 7$ or $n = 23$.
The Sphere Packing Bound for a perfect binary $[n, M, 7]$ code says that
\[ M\left\{\binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \binom{n}{3}\right\} = 2^n. \]
Thus, for some integer $r$,
\[ \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \binom{n}{3} = 2^r. \]
Rewriting this we have
\[
\begin{aligned}
1 + n + \tfrac{1}{2}n(n-1) + \tfrac{1}{6}n(n-1)(n-2) &= 2^r, \\
6(n+1) + 3n(n-1) + n(n-1)(n-2) &= 3 \cdot 2^{r+1}, \\
6(n+1) + n(n-1)\{3 + (n-2)\} &= 3 \cdot 2^{r+1}, \\
(n+1)\{n^2 - n + 6\} &= 3 \cdot 2^{r+1}, \\
(n+1)\{(n+1)^2 - 3(n+1) + 8\} &= 3 \cdot 2^{r+1}. \qquad (*)
\end{aligned}
\]
A bit of number theory implies that $n + 1$ must be divisible by some power of 2.
First note that writing (*) modulo 8 we have
\[ (n+1)\{(n+1)^2 - 3(n+1)\} \equiv 0 \pmod 8, \]
... 16. In general the second term on the LHS is then of the form $(2k+1)8$ for $k \ge 0$.
If it is 8 ($k = 0$), then
\[ (n+1)^2 - 3(n+1) = 0, \]
which is impossible, since $n \ge 7$. If it is 24 ($k = 1$), then
\[ (n+1)^2 - 3(n+1) - 16 = 0, \]
which is also impossible, as the discriminant is 73.
Therefore, $n + 1$ divides 24 (as we have 3 on the RHS), so that $n = 7, 11, 23$.
Now, $n = 11$ does not satisfy Equation (*). So $n = 7$ or 23. In fact, perfect
codes of these lengths exist: the repetition code of length 7 and the Golay code,
respectively.
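The necessary condition that the sphere volume be a power of two can be scanned numerically. The sketch below checks lengths up to an arbitrarily chosen limit of 2000, so it illustrates the result rather than proving it; `is_power_of_two` and `candidates` are illustrative names.

```python
from math import comb

def is_power_of_two(m):
    return m > 0 and m & (m - 1) == 0

# Lengths n >= 7 for which 1 + n + C(n,2) + C(n,3) is a power of two.
candidates = [n for n in range(7, 2001)
              if is_power_of_two(sum(comb(n, i) for i in range(4)))]
print(candidates)
```

For $n = 7$ the sphere volume is $64 = 2^6$ and for $n = 23$ it is $2048 = 2^{11}$, whereas $n = 11$ gives 232.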
34. (Bounds): Use the Sphere Packing Bound to find an upper bound for $M$ of a binary $[5, M, 3]$
code.
For $q = 2$, the Sphere Packing Bound for an $[n, M, 2e + 1]$ code is
\[ M\left\{\binom{n}{0} + \binom{n}{1} + \dots + \binom{n}{e}\right\} \le 2^n. \]
For $n = 5$, $d = 3$, $e = 1$ we get
\[ M\left\{1 + \binom{5}{1}\right\} \le 32, \]
therefore $M \le 5$.
35. (Bounds II): The previous exercise gives an upper bound for $A_2(5, 3)$. Now, show by construction that $A_2(5, 3) = 4$.
The idea is to exhaustively find the maximal number of codewords. Choose
two words in $C$ as $a_1 = 00000$ and, w.l.o.g., $a_2 = 11100$. Since $d(x, a_1) \ge 3$ for
any $x$ in $C$, the only other possible elements of $C$ are the 9 words with three 1s
apart from $a_2$, the 5 words with four 1s, and $u = 11111$. As $d(u, a_2) = 2$, we have
$u \notin C$.
wt(c) = 3: 11010, 11001, 10110, 10101, 10011, 01110, 01101, 01011, 00111;
wt(c) = 4: 11110, 11101, 11011, 10111, 01111.
The words with three 1s and two of the first three coordinates equal to 1 are at distance 2
from $a_2$. This leaves
b1 = 10011,
b2 = 01011,
b3 = 00111.
Similarly, the first two words with four 1s are at distance 1 from $a_2$. This leaves
c1 = 11011,
c2 = 10111,
c3 = 01111.
Now, $d(b_i, b_j) = 2$ and $d(c_i, c_j) = 2$ for $i \ne j$. So there can only be one $b_i$ and one $c_j$
in $C$. Hence $|C| \le 4$.
In fact, taking $b_1 \in C$, the only possibility is $c_3$. This gives $C = \{a_1, a_2, b_1, c_3\}$ as
a $[5, 4, 3]$ code.
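The exhaustive argument can be mirrored by a small backtracking search over all 5-bit words. This is an illustrative sketch, not the construction used in the text; fixing 00000 as a codeword is the same normalisation as choosing $a_1$, and `largest_code` is a hypothetical helper.

```python
from itertools import product

WORDS = ["".join(w) for w in product("01", repeat=5)]

def dist(u, v):
    return sum(a != b for a, b in zip(u, v))

def largest_code(chosen, candidates, d=3):
    """Extend `chosen` with words from `candidates`, keeping pairwise distance >= d."""
    best = list(chosen)
    for i, w in enumerate(candidates):
        rest = [v for v in candidates[i + 1:] if dist(v, w) >= d]
        found = largest_code(chosen + [w], rest, d)
        if len(found) > len(best):
            best = found
    return best

# Fixing 00000 loses no generality: translating a code preserves distances.
start = "00000"
best = largest_code([start], [w for w in WORDS if dist(w, start) >= 3])
print(len(best), best)
```

The search returns a code with four words, one fewer than the sphere-packing bound of the previous exercise allows.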
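36. (Even weight codes): Show that if there is a binary $[n, M, d]$ code with $d$ even then there is
an $[n, M, d]$ code in which all codewords have even weight.
Let $C$ be an $[n, M, d]$ code with $d$ even. Then we can puncture $C$ to get an $[n-1, M, d-1]$ code $C'$ (assuming no coordinate is zero for all codewords).
From this code we can get the extended code by calculating $c_n = \sum_{i=1}^{n-1} c_i \pmod 2$.
The extended code is again an $[n, M, d]$ code, but all the codewords have even weight.
As an example consider the code $C$, its punctured code $C'$ and the extension of $C'$:
\[
C = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad
C' = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
\text{extended code} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}.
\]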
37. (Extended code): Let $C$ be a binary code of length $n$. Form a binary code $C'$ of length $n + 1$
as follows:
\[ x = x_1 x_2 \dots x_n \in C \;\longmapsto\; x' = x_1 x_2 \dots x_n x_{n+1} \in C', \]
where
\[ x_{n+1} = \begin{cases} 1 & \text{if } w(x) \text{ is odd}, \\ 0 & \text{if } w(x) \text{ is even}. \end{cases} \]
Show that, if $C$ is linear, then $C'$ is also linear; it is called the extended code.
Let
\[ C_0 = \{x_1 x_2 \dots x_{n+1} \in V(n+1, 2) \mid x_1 x_2 \dots x_n \in C;\ x_{n+1} \in F_2\}, \]
\[ C_1 = \{x_1 x_2 \dots x_{n+1} \in V(n+1, 2) \mid x_1 + x_2 + \dots + x_n + x_{n+1} = 0\}. \]
Then $C_0$ and $C_1$ are both subspaces of $V(n+1, 2)$, and $C' = C_0 \cap C_1$. Hence $C'$
is a subspace.
Now for $x, y \in V(n, 2)$,
\[ w(x + y) = \sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i = w(x) + w(y) \pmod 2. \]
Notice that
\[ d(x', y') \ge d(x, y). \]
Hence, it suffices to consider the case
\[ d(x, y) = 2t + 1. \]
No, we show the following counter-example: Let C be the binary linear code
containing all vectors of length 3 with even Hamming weight. This is a (3, 2, 2)
code. If the claim is true, then there must exist a (4, 2, 3) code. Take simply
any two vectors of weight 3 in $F_2^4$, say (1110) and (0111). Clearly we cannot get
distance 3 for any choice of the basis.
40. (Concatenation of codes):
41. Let $C_1$ and $C_2$ be two linear codes over $F_q$. Show that $C = \{(c_1\|c_2) : c_1 \in C_1, c_2 \in C_2\}$
(where $\|$ stands for concatenation) is a linear code with $d = \min\{d_1, d_2\}$.
We first show that $C$ is a linear code. Let $x, y \in C$ and $\alpha, \beta \in F_q$. We show
that $\alpha x + \beta y \in C$. From the definition of $C$: $x = x_1\|x_2$ and $y = y_1\|y_2$ where
$x_1, y_1 \in C_1$ and $x_2, y_2 \in C_2$. Now,
\[ \alpha x + \beta y = \alpha(x_1\|x_2) + \beta(y_1\|y_2) = \alpha x_1 + \beta y_1 \,\|\, \alpha x_2 + \beta y_2, \]
where by the linearity of $C_1$ and $C_2$, $\alpha x_1 + \beta y_1 \in C_1$ and $\alpha x_2 + \beta y_2 \in C_2$, and
therefore $\alpha x + \beta y \in C$.
We now show that $d = \min\{d_1, d_2\}$. Since every codeword in $C$ is a concatenation
of a codeword from $C_1$ and a codeword from $C_2$, it is clear that $d \ge \min\{d_1, d_2\}$.
Now, assume w.l.o.g. that $d_1 \le d_2$. The codeword $(c_1\|0)$, where $c_1 \in C_1$ is such that
$wt(c_1) = d_1$ and 0 is the all-zero codeword in $C_2$, has weight $d_1$, and therefore
$d = \min\{d_1, d_2\}$.
42. Hadamard matrices: Recall that an $n \times n$ matrix $H$ all of whose entries are from $\{+1, -1\}$
is a Hadamard matrix if $H \cdot H^T = n \cdot I$, where the matrix product is over the reals and $I$ is
the $n \times n$ identity matrix.
(a) Show that if there is an $n \times n$ Hadamard matrix then $n$ is either 1 or 2 or a multiple
of 4.
Let $a, b, c$ be three distinct rows of a Hadamard matrix. (So we are assuming
$n \ge 3$.) For $i, j \in \{1, -1\}$, let
\[ S_{i,j} = \{k \mid a_k = i \cdot b_k \text{ and } a_k = j \cdot c_k\}. \]
Let $\alpha = |S_{1,1}|$, $\beta = |S_{1,-1}|$, $\gamma = |S_{-1,1}|$, and $\delta = |S_{-1,-1}|$. Then $\alpha + \beta$ counts
the number of coordinates where $a$ equals $b$ and so $\alpha + \beta = n/2$. Similarly $\alpha + \gamma$
counts the number of coordinates where $a$ equals $c$ and so $\alpha + \gamma = n/2$. Finally,
$\alpha + \delta$ counts the number of coordinates where $b$ equals $c$ and so $\alpha + \delta = n/2$.
And of course, $\alpha + \beta + \gamma + \delta = n$. Solving the $4 \times 4$ linear system above, we get
$\alpha = \beta = \gamma = \delta = n/4$. Since each is an integer, $n$ must be a multiple
of 4.
(b) Given an $n \times n$ Hadamard matrix $H_n$ and an $m \times m$ Hadamard matrix $H_m$, construct
an $(nm) \times (nm)$ Hadamard matrix.
Let $F$ be any field (say the rationals, for this problem). For vectors $a \in F^n$ and
$b \in F^m$, let $a \otimes b \in F^{nm}$ denote their outer product (aka tensor product),
We show how to use tensor products to build a big Hadamard matrix from
two smaller ones.
Let $u_1, \dots, u_n$ be the rows of $H_n$ and let $v_1, \dots, v_m$ be the rows of $H_m$. By the
condition $H_n H_n^T = nI$, we have $\langle u_i, u_j \rangle = 0$ if $i \ne j$. (Similarly for the $v_i$'s.)
Let $H_{nm}$ be the matrix whose rows are $u_i \otimes v_j$ for all $i \in [n]$, $j \in [m]$. As noted
above, this is a $+1/-1$ matrix. Thus the diagonal entries of $H_{nm} H_{nm}^T$ are
all $nm$ as required. Now consider the off-diagonal entry
\[ (H_{nm} H_{nm}^T)_{(ij),(kl)} = \langle u_i \otimes v_j, u_k \otimes v_l \rangle = \langle u_i, u_k \rangle \cdot \langle v_j, v_l \rangle. \]
Since at least one of the conditions
$i \ne k$ or $j \ne l$ holds, the above inner product is zero. This proves that the
off-diagonal entries are zero as required.
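The tensor-product construction can be tried out on small matrices. The sketch below builds the rows $u_i \otimes v_j$ with plain lists and checks the defining condition $H H^T = nI$; the $2 \times 2$ starting matrix and the names `kron` and `is_hadamard` are assumptions made for illustration.

```python
def kron(A, B):
    """Rows of the result are the tensor products of a row of A with a row of B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def is_hadamard(H):
    """Check that all entries are +1 or -1 and that H H^T = n I over the integers."""
    n = len(H)
    entries_ok = all(x in (1, -1) for row in H for x in row)
    return entries_ok and all(
        sum(x * y for x, y in zip(H[i], H[j])) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )

H2 = [[1, 1], [1, -1]]
H4 = kron(H2, H2)
H8 = kron(H4, H2)
print(is_hadamard(H4), is_hadamard(H8))
```

Repeating the construction starting from the $2 \times 2$ matrix produces Hadamard matrices of every order $2^k$.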
43. Suppose C is a code of length n over the q-ary alphabet A. Let w, x C, w 6= x, and
v An . In terms of these vectors answer the following (and generalize):
(a) What does it mean to say that C is t-error-detecting? What does it mean to say that C
is t-error-correcting? Prove that if C is 2t-error-detecting, then C is t-error-correcting.
Hint: Use the triangle inequality and show that if C is not t-error correcting then it is
not 2t-error detecting.
$C$ is $t$-error-detecting if there do not exist distinct words $w, x \in C$ with $d(w, x) \le t$. $C$
is $t$-error-correcting if there do not exist words $v \in A^n$ and $w, x \in C$ such that
$w \ne x$ and
$$d(v, w) \le t, \quad d(v, x) \le t.$$
Suppose $C$ fails to be $t$-error-correcting. Then there are $v, w, x$ as above. By
the triangle inequality, we have
$$d(w, x) \le d(v, w) + d(v, x) \le t + t = 2t,$$
and so $C$ is not $2t$-error-detecting. Hence if $C$ is $2t$-error-detecting, then it is
$t$-error-correcting.
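(b) Show that if $C$ is $t$-error-correcting, then
$$|C| \le \frac{q^n}{\binom{n}{0} + (q-1)\binom{n}{1} + (q-1)^2\binom{n}{2} + \ldots + (q-1)^t\binom{n}{t}}.$$
Since $C$ is $t$-error-correcting, the balls of radius $t$ centred at the codewords are pairwise disjoint, so counting the words of $A^n$ that they cover gives
$$q^n \ge \sum_{x \in C}\left[\binom{n}{0} + \binom{n}{1}(q-1) + \ldots + \binom{n}{t}(q-1)^t\right] = |C|\left[\binom{n}{0} + \binom{n}{1}(q-1) + \ldots + \binom{n}{t}(q-1)^t\right].$$
Thus,
$$2^k \le \frac{2^n}{1 + n + \binom{n}{2}} = \frac{2^n}{1 + n + \frac{n(n-1)}{2}} = \frac{2^n}{\frac{1}{2}(2 + n + n^2)} < \frac{2^n}{\frac{1}{2}(2^l)} = 2^{n-l+1}.$$
Thus, $k < n - l + 1$.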
44. This problem concerns the bounds on codes.
(a) Let $B$ be an alphabet of size $q$ and $C \subseteq B^n$ be a $q$-ary block code of length $n$.
(i) Define (mathematically) the Hamming distance $d$ on $B^n$.
For $x, y \in B^n$ we define,
$$d(x, y) = \#\{i : x_i \ne y_i,\ i = 1, \ldots, n\}.$$
(ii) Define the minimum distance $d(C)$ of the code $C$.
$$d(C) = \min_{x, y \in C;\ x \ne y} d(x, y).$$
(b) Let the parameter $A_q(n, d)$ denote the maximum number of codewords of length $n$ over
$B$ such that the Hamming distance between any two distinct codewords is at least $d$. State and prove
the sphere-packing bound for $A_q(n, d)$.
Hint: How many disjoint spheres of suitable radius can be packed in the space?
Since $d(C) = d$ we know that $C$ is a $t = \lfloor (d-1)/2 \rfloor$ error-correcting code. The bound
is,
$$A_q(n, d) \le q^n / |S(x, t)|.$$
In the class a different notation was used, $A_q(n, d) \le q^n / V_q(n, t)$, so obviously
$V_q(n, t) = |S(x, t)|$. Note that
$$V_q(n, t) = \binom{n}{0} + (q-1)\binom{n}{1} + (q-1)^2\binom{n}{2} + \ldots + (q-1)^t\binom{n}{t},$$
hence the bound is the same as in Problem 1.
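As a small numerical illustration of the sphere-packing bound, the sketch below evaluates $V_q(n, t)$ and the resulting upper bound on $A_q(n, d)$. The function names `ball_volume` and `sphere_packing_bound` are assumptions made for this example only.

```python
from math import comb


def ball_volume(q, n, t):
    """V_q(n, t): number of words within Hamming distance t of a fixed word."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))


def sphere_packing_bound(q, n, d):
    """Upper bound on A_q(n, d) from disjoint balls of radius t = floor((d - 1) / 2)."""
    t = (d - 1) // 2
    return q ** n // ball_volume(q, n, t)


print(ball_volume(2, 6, 1))           # 1 + 6 = 7
print(sphere_packing_bound(2, 6, 3))  # floor(64 / 7) = 9
print(sphere_packing_bound(2, 7, 3))  # 128 / 8 = 16
```

The integer division reflects that $A_q(n, d)$ is an integer, so the bound $q^n / V_q(n, t)$ can be rounded down.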
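45. Let $C$ be the binary linear code with generator matrix
$$G = \begin{pmatrix} 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 \end{pmatrix}.$$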
(a) Write down a parity check matrix H for C. Explain how the minimum distance of C
may be deduced from H. Find d(C).
$$H = \begin{pmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 0 & 1 \end{pmatrix}$$
Coset leader   Syndrome
000000         000
100000         101
010000         110
001000         011
000100         100
000010         010
000001         001
100010         111
$S(100110) = 011$ so 100110 lies in $001000 + C$. Hence we correct it by subtracting 1 from its 3rd digit: $100110 \to 101110$.
$S(011101) = 000$ so 011101 is a codeword and needs no correction.
$S(101001) = 111$ so 101001 lies in $100010 + C$. Hence we correct it: $101001 \to 101001 - 100010 = 001011$. Note that your answer will be different if you chose
a different weight 2 coset leader.
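The syndrome decoding steps above can be sketched in Python as follows. The parity check matrix is the one from part (a); the function names are illustrative assumptions, and the sketch picks the first minimum-weight pattern it finds for each syndrome, so its weight-2 coset leader for syndrome 111 need not be 100010.

```python
from itertools import product

# Parity check matrix H from part (a).
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]


def syndrome(word):
    """Syndrome H * word^T over GF(2), returned as a tuple of bits."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)


def coset_leaders():
    """Map each syndrome to a minimum-weight error pattern (the first one found)."""
    leaders = {}
    for e in sorted(product((0, 1), repeat=6), key=sum):
        leaders.setdefault(syndrome(e), e)
    return leaders


def decode(bits):
    """Subtract (XOR) the coset leader matching the syndrome of the received word."""
    word = tuple(int(b) for b in bits)
    leader = coset_leaders()[syndrome(word)]
    return "".join(str(w ^ e) for w, e in zip(word, leader))


print(decode("100110"))
print(decode("011101"))
print(decode("101001"))
```

For the first two received words the coset leader is unique, so the result agrees with the hand computation. For 101001 the sketch may return a different codeword than 001011, in line with the remark about the choice of the weight 2 coset leader.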
(d) Puncturing the code means deleting some coordinates of the code. Discuss the effect
of puncturing on the minimum distance and the rate of the code.
Puncturing may or may not decrease the minimum distance, hence $d' \le d$.
Since $n' < n$ and the dimension is the same, the rate $R = k/n'$ is increased.
46. A (6, 3) linear block code $C$ over GF(2) is defined by the following parity check matrix,
$$H = \begin{pmatrix} 1 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{pmatrix}$$
(a) Find the generator matrix of $C$.
Writing $H = [I_3 \; A]$, the generator matrix is simply obtained from $H$ as,
$$G = [A^T \; I_3] = \begin{pmatrix} 0 & 1 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 0 & 1 \end{pmatrix}.$$
(b) The parity check matrix $H$ does not allow the presence of codewords of weight $< 3$
(apart from the all-zero codeword). Explain why.
A codeword of weight 1 would require a zero column of $H$, and a codeword of weight 2 would require two equal columns. Neither occurs, since no 2 columns of $H$ are linearly dependent, so $Hc^T = 0$ cannot be satisfied by a word $c$ of weight 1 or 2.
(c) Suppose that the code is used for error detection only over a binary symmetric channel
with error rate $p = 10^{-3}$. Find the probability of undetected error.
Hint: W.l.o.g. assume that the all-zero codeword was transmitted. Be careful with the
interpretation of undetected error (for Pavel only: of course the error must be a
codeword).
An undetected error occurs if and only if the error pattern is a nonzero codeword.
The weight distribution of the code is $(1, 0, 0, 4, 3, 0, 0)$, so there are 4
error patterns of weight 3 and 3 error patterns of weight
4 that go undetected. Therefore,
$$P_e = 4p^3(1-p)^3 + 3p^4(1-p)^2 = 4 \cdot 10^{-9} (0.999)^3 + 3 \cdot 10^{-12} (0.999)^2 \approx 4 \cdot 10^{-9}.$$
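The weight distribution and the undetected error probability can be verified by enumerating the 8 codewords. This is a minimal sketch using the generator matrix from part (a); the function names are assumptions introduced for illustration.

```python
from itertools import product

# Generator matrix G = [A^T | I_3] from part (a).
G = [
    [0, 1, 1, 1, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 1],
]


def codewords(G):
    """All GF(2) linear combinations of the rows of G."""
    n = len(G[0])
    return [
        tuple(sum(c * row[j] for c, row in zip(coeffs, G)) % 2 for j in range(n))
        for coeffs in product((0, 1), repeat=len(G))
    ]


def weight_distribution(G):
    """A[w] = number of codewords of Hamming weight w."""
    A = [0] * (len(G[0]) + 1)
    for c in codewords(G):
        A[sum(c)] += 1
    return A


def undetected_error_probability(G, p):
    """Probability that the error pattern on a BSC(p) is a nonzero codeword."""
    n = len(G[0])
    A = weight_distribution(G)
    return sum(A[w] * p**w * (1 - p) ** (n - w) for w in range(1, n + 1))


print(weight_distribution(G))
print(undetected_error_probability(G, 1e-3))
```

The sum starts at weight 1 because the all-zero error pattern is not an error at all; only the nonzero codewords contribute.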
(d) Suppose that the code is used for erasure correction over a binary erasure channel with
erasure probability $\epsilon = 10^{-2}$. How many erasures can the code always correct?
Since $d = 3$ the code can always correct $d - 1 = 2$ erasures.
(e) Decode the received word (? ? ? 1 1 0), where ? denotes an erasure.
Determining the erasure values corresponds to solving a system of 3 equations.
Thus we need linear independence of these equations, which comes from the
independence of the corresponding columns of $H$. Denoting the first 3 positions by $r_1, r_2, r_3$,
from $Hc^T = 0$ we get,
$$r_1 = 1, \quad r_2 = 1, \quad r_3 = 0.$$
Thus, $r \to c = (110110)$.
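Erasure decoding for such a small code can also be sketched by brute force: list every codeword that agrees with the received word on the unerased positions. The representation of an erasure as the character `?` and the function names below are assumptions made for this example.

```python
from itertools import product

# Generator matrix G = [A^T | I_3] of the (6, 3) code from part (a).
G = [
    [0, 1, 1, 1, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 1],
]


def codewords():
    """All 8 codewords as bit strings (GF(2) combinations of the rows of G)."""
    words = []
    for coeffs in product((0, 1), repeat=len(G)):
        bits = [sum(c * row[j] for c, row in zip(coeffs, G)) % 2 for j in range(6)]
        words.append("".join(map(str, bits)))
    return words


def decode_erasures(received):
    """Codewords that agree with `received` on every position not marked '?'."""
    return [
        c for c in codewords()
        if all(r == "?" or r == b for r, b in zip(received, c))
    ]


print(decode_erasures("???110"))  # a single candidate: the erasures are resolved
print(decode_erasures("0???00"))  # two candidates: this weight-3 pattern fails
```

Decoding succeeds exactly when one candidate remains. The second call erases the support of the weight-3 codeword 011100, which is one of the erasure patterns that cannot be resolved.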
(f) For the erasure probability $\epsilon = 10^{-2}$ find the probability of decoder failure, that is, the
probability that the transmitted codeword cannot be determined from the unerased
bits (for sufficiently many erasures).
Hint: One erasure weight needs to be carefully investigated.
Since any 4 or more columns of $H$ are linearly dependent we cannot correct 4 or
more erasures. However, some erasure patterns of weight 3 can be corrected as in (e); the patterns that cannot be corrected are
those whose $3 \times 3$ submatrix of $H$ has linearly dependent columns. The number of such submatrices
is 4 (by inspection of $H$) and therefore,
$$P_{fail} = 4\epsilon^3(1-\epsilon)^3 + \binom{6}{4}\epsilon^4(1-\epsilon)^2 + \binom{6}{5}\epsilon^5(1-\epsilon) + \binom{6}{6}\epsilon^6$$
$$= 4 \cdot 10^{-6}(0.99)^3 + 15 \cdot 10^{-8}(0.99)^2 + 6 \cdot 10^{-10}(0.99) + 10^{-12}$$
$$\approx 4 \cdot 10^{-6}(0.99)^3 + 15 \cdot 10^{-8}(0.99)^2 \approx 4.03 \cdot 10^{-6}.$$
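The count of uncorrectable erasure patterns can be checked by enumeration. The sketch below assumes the criterion used in the solution, namely that an erasure pattern fails exactly when some nonzero codeword is supported inside the erased positions (equivalently, the corresponding columns of $H$ are linearly dependent). All function names are illustrative.

```python
from itertools import combinations, product

# Parity check matrix H of the (6, 3) code.
H = [
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 0],
]
N = 6

# Nonzero codewords: the nonzero vectors c with H c^T = 0 over GF(2).
CODEWORDS = [
    c for c in product((0, 1), repeat=N)
    if any(c) and all(sum(h * x for h, x in zip(row, c)) % 2 == 0 for row in H)
]


def uncorrectable(erased):
    """True if some nonzero codeword has its support inside the erased positions."""
    return any(all(i in erased for i in range(N) if c[i]) for c in CODEWORDS)


def failure_counts():
    """counts[w] = number of uncorrectable erasure patterns of weight w."""
    return [
        sum(uncorrectable(set(pos)) for pos in combinations(range(N), w))
        for w in range(N + 1)
    ]


def failure_probability(eps):
    """Sum over uncorrectable patterns of eps^w * (1 - eps)^(N - w)."""
    return sum(
        count * eps**w * (1 - eps) ** (N - w)
        for w, count in enumerate(failure_counts())
    )


print(failure_counts())
print(failure_probability(1e-2))
```

The enumeration treats weight 3 as the only weight where some patterns fail and others do not, which is the case singled out in the hint.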
47. This question considers the bounds on codes. Actually, the results in this problem establish
the so-called Plotkin bound.
(a) What is meant by a binary $[n, M, d]$-code, i.e. explain the notation?
An $[n, M, d]$-code $C$ is a code with $M$ codewords over a binary alphabet, all of
length $n$, such that $d(u, v) \ge d$ for all distinct $u, v \in C$.
(b) Suppose $C$ is a binary $[n, M, d]$-code. Regard the codewords as vectors over GF(2),
and define an $\binom{M}{2} \times n$ binary matrix $D$ as follows:
The rows of $D$ correspond to all (unordered) pairs of codewords in $C$. The row corresponding to codewords $u$ and $v$ is simply the vector (modulo 2 bitwise) sum of $u$ and
$v$. (The ordering of the rows of $D$ is not significant.) Write down the array $D$ for the
particular code
$$C = \{000000, 001111, 111001, 110110\}.$$
$$D = \begin{pmatrix} 0 & 0 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 & 1 \\ 1 & 1 & 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 & 1 \end{pmatrix}$$
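The array $D$ can be generated mechanically from the code. A minimal sketch follows, with the hypothetical helper name `pairwise_sum_matrix`; the rows appear in the order in which the unordered pairs are enumerated, which the problem says is not significant.

```python
from itertools import combinations


def pairwise_sum_matrix(code):
    """Rows are the bitwise mod-2 sums of all unordered pairs of codewords."""
    return [
        "".join(str(int(a) ^ int(b)) for a, b in zip(u, v))
        for u, v in combinations(code, 2)
    ]


C = ["000000", "001111", "111001", "110110"]
D = pairwise_sum_matrix(C)
for row in D:
    print(row)
```

Each row of $D$ has weight equal to the Hamming distance between the corresponding pair of codewords, so every row here has weight at least $d = 4$.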