CHAPTER 0
INTRODUCTION
Clearly, the rise of the internet (among other technologies) has made information widely available. This can be, and has been, compared to the introduction of the printing press in the Middle Ages. With its advent, the massive distribution of books and ideas became possible, and the printing press has certainly played a significant role since its invention. Even though it is still much too early to tell, the rise of the internet seems to be of a similar scale for the first time in history,
made nearly zero for all communication rates below channel capacity. The
capacity can be computed simply from the noise characteristics of the channel.
Shannon further argued that random processes such as music and speech have an irreducible complexity below which they cannot be compressed; this he named entropy, in deference to the parallel use of the word in thermodynamics, and argued that if the entropy of the source is less than the capacity of the channel, asymptotically error-free communication can be achieved.
Information theory today marks out the extreme points of the set of all possible compression and transmission schemes. At one extreme is the data compression minimum, the entropy; all data compression schemes require description rates at least equal to this minimum. At the other extreme is the data transmission maximum I(X; Y), known as the channel capacity. Thus, all modulation schemes and data compression schemes lie between these limits. Information theory also suggests means of achieving these ultimate limits; in practice, however, simple modulation and demodulation schemes are often used rather than the optimal schemes promised by the channel capacity theorem. Progress in integrated circuits and code design has enabled us to reap some of the promised gains, even in the presence of interference and noise. Some of the trade-offs of rates between senders and
Kolmogorov (1956), Chaitin (1966) and Solomonoff (1964) put forth the idea that the complexity of a string of data can be defined by the length of the shortest binary computer program for computing the string; the shortest such description is thus of fundamental importance. Kolmogorov complexity lays the foundation for the theory of descriptive complexity: it gives the ultimate data compression and leads to a logically consistent procedure for inference.
and computational complexity focuses on minimizing along the first axis. Little
turn, they characterize the behavior of long sequences of random variables and
allow us to estimate the probabilities of rare events (large deviation theory) and
Coding theory originated with the 1948 publication of the paper "A mathematical
theory of communication" by Claude Shannon. For the past half century, coding
Modern information and communication systems are based on the reliable and efficient transmission of information. Channel codes have to be employed so that errors within the transmitted information can be detected and corrected. This requires suitable coding schemes for error detection and error correction. Besides good code characteristics with respect to the number of errors that can be detected or
The subject of coding is the detection and correction of errors in digital information. Such errors almost inevitably occur during the transmission, storage or processing of digital information. Encoding digital information with a suitable error-control code enables the efficient detection
Error-control codes are now used in almost the entire range of information
and optical devices and systems have enabled the implementation of very
new types of code, and new decoding methods, have recently been developed
In the 50 years since Shannon's seminal papers of 1948 and 1949, coding
theory has progressed rather fitfully through periods of euphoric highs with
coding application would never move beyond hard decision Hamming and
Golay codes. It now appears safe to say that coding is maturing into an
Certain of the reasons for the increasing applications of coding are external: the
Also, there are many cases in which (a form of) confidentiality is required. An
But there are also other concerns: for instance, one could want to be sure of
whether a certain (electronic) letter actually comes from the mentioned author.
In daily life, the author can achieve this by writing his signature on the letter.
But how can one do that in an email? Another, more mundane example is
getting money from an ATM. Before handing you money, the bank wants to be
sure that there is enough money in your account. On the other hand, you would
like to be the only person able to withdraw money from your account. Again,
going to a bank teller and using handwritten signatures has been the solution
for centuries. But this is very difficult, if not impossible, for an automated machine, and so other, intrinsically digital methods must be adopted. The tools
secrets. Both coding theory and cryptography have already proven to be essential in our information age. While they may seem to achieve opposite goals at first sight, they also share a great deal. The aim of cryptography is
persons can communicate in a way that guarantees that the desired subset of the
(ii) Data integrity. This service will provide a means to check if the
transmitted information was altered in any way, including but not limited
(iii) Authentication. This service will establish some identity pertaining to the
message. Thus, this primitive can (among others) be used to guarantee the
sense that (up to a certain number of) errors that occurred during the
The initial questions treated by information theory lay in the areas of data
compression and transmission. The answers are quantities such as entropy and
communication. His measure was essentially the logarithm of the alphabet size.
Shannon (1948) was the first to define entropy and mutual information.
Relative entropy was first defined by Kullback and Leibler (1951). It is known
Consider two random variables X and Y with a joint probability mass function p(x, y) and marginal probability mass functions p(x) and p(y). The mutual information I(X; Y) is the relative entropy between the joint distribution and the product distribution p(x)p(y) [Kullback (1959)], given as

$$I(X;Y) = \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)},$$

and it is symmetric in X and Y:

$$I(X;Y) = I(Y;X) \qquad (0.2.8)$$
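As a check on these definitions, the following sketch (with a hypothetical toy pmf; all names are ours, not from the text) computes I(X; Y) directly from a joint probability mass function and verifies the symmetry (0.2.8):

```python
import math

def mutual_information(joint):
    """I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )."""
    # Marginals p(x) and p(y) from the joint pmf.
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        joint[i][j] * math.log2(joint[i][j] / (px[i] * py[j]))
        for i in range(len(joint))
        for j in range(len(joint[0]))
        if joint[i][j] > 0
    )

# A toy joint pmf p(x, y); rows index x, columns index y.
p = [[0.125, 0.375],
     [0.375, 0.125]]

I_xy = mutual_information(p)
# Transposing the joint pmf swaps the roles of X and Y: I(X;Y) = I(Y;X).
I_yx = mutual_information([list(c) for c in zip(*p)])
assert abs(I_xy - I_yx) < 1e-12
```

Since X and Y are dependent here, I(X; Y) comes out strictly positive, in agreement with the interpretation of mutual information as a relative entropy.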
It has been shown by Kannappan and Rathie (1973) that Kullback's (1959)
small, the P-probabilities of events are usually close to the Q-probabilities of the same events, while if I(P : Q) is large, the P-probabilities and the Q-probabilities of the same events are quite different. Thus I(P : Q) acquires a natural interpretation in
distribution.
their generalizations and some measures involving more than two distributions,
different definition and some of the properties are pointed out and characterization
improvement as a measure I_β(P; Q; R) of degree β (β ≠ 1), defined for P, Q, R ∈ Δ_n with normalizing factor 1/(2^{β−1} − 1), given by (0.3.3). Aczel and Nath (1972) proposed two further measures of generalized directed divergence in information, of order α (α ≠ 1) and defined for P, Q, R ∈ Δ_n, given by (0.3.4) and (0.3.5).
Vinocha and Goyal (1980, 1981, 1982) defined new generalizations of Theil's information improvement for a scheme of m distributions; their measures are given by (0.3.6) and (0.3.7).
A further generalization of degree β (β ≠ 1) is given by (0.3.8), where the right-hand side of (0.3.7) is the information improvement of three distributions. Generalized measures of order α and degree β are given by (0.3.9) and (0.3.10), and a generalized mean-value measure by (0.3.11).
where the measures (0.3.13) and (0.3.15) are defined in terms of a function φ. Vinocha and Faruq (2000) have further generalized Taneja and Tuteja (1984) through the measure (0.3.16).
pace in the 20th century, especially in the second half. A significant part of this
reliably transmit information? Shannon also gave a basic answer: coding can
do it. Since that time the problem of finding practical coding schemes that
approach the fundamental limits established by Shannon has been at the heart
have taken place that bring us close to answering this question. Perhaps, at least
The advance came with a fundamental paradigm shift in the area of coding that took place in the early 1990s. In modern coding theory, codes are viewed as
The local interactions of the code bits are simple but the overall code is
These are exciting times for coding theorists and practitioners. Despite all the
progress made, many fundamental questions are still open. Sparse graphical
models and message-passing algorithms, to name just two of the notions that
other fields as well. This is not a coincidence. Many of the innovations were brought
Modern coding will not displace classical coding anytime soon. At any point in time hundreds of millions of Reed-Solomon codes work hard to make your life
less error prone. This is unlikely to change substantially in the near future. But
modern coding.
we want to transmit a message across a noisy channel so that the receiver can
determine this message with high probability despite the imperfections of the
and allow reliable transmission close to the ultimate limit, the Shannon capacity.
A source transmits its information (speech, audio, data, etc.) via a noisy
channel (phone line, optical link, wireless, storage medium, etc.) to a sink. We
information with as little distortion (number of wrong bits, mean squared error
Shannon (1948) formalized the communications problem and showed that the
shown in Figure (0.4.2). First, a source encoder transforms the source into a bit
stream. Ideally, the source encoder removes all redundancy from the source so
that the resulting bit stream has the smallest possible number of bits while still
representing the source with enough accuracy. The channel encoder then
Figure 0.4.2: Block diagram of a communication system: Source → Source Encoder → Channel Encoder → Channel → Channel Decoder → Source Decoder → Sink.
distortion measure reflects the "cost" of deviating from the original source
output. If the source emits points in R" it might be natural to consider the
squared Euclidean distance, whereas if the source emits binary strings a more
natural measure might be to count the number of positions in which the source
output and the word that can be reconstructed from the encoded source differ.
Shannon's source coding theorem asserts that, for a given source and distortion
measure, there exists a minimum rate R = R(d) (bits per emitted source
distortion not exceeding d. The plot of this rate R as a function of the distortion
amount of redundancy is added to these source bits to protect them against the
errors in the channel. This process is called channel coding. Richardson and
Urbanke (2008) modeled the channel as a probabilistic mapping and they are
typically interested in the average performance, where the average is taken over
existence of a maximum rate (bits per channel use) at which information can be
channel. This maximum rate is called the capacity of the channel and is denoted by C. At the receiver, the received bits are first decoded to determine the transmitted information. The decoded bits are then used to reconstruct
receiver if R(d) < C, i.e., if the rate required to represent the given source with
the allowed distortion is smaller than the capacity of the channel. Conversely,
no scheme can do better. One great benefit of the separation theorem is that a
communications link can be used for a large variety of sources: one good
channel coding solution can be used with any source. Virtually all systems in
use today are based on this principle. It is important though to be aware of the
terms of the achievable distortion when large blocks of data are encoded together.
Joint schemes can be substantially better in terms of complexity or delay. Also, the
We will not be concerned with the source coding problem or, equivalently, we
assume that the source coding problem has been solved. For us, the source emits a
sequence of independent identically distributed (iid) bits which are equally likely to
be zero or one. Under this assumption, we will see how to accomplish the channel
positive rate? At some level we have already given the answer: add redundancy
to the message that can be exploited to combat the distortion introduced by the
channel. By starting with a special case, the following are the key concepts.
flipped, the latter occurring with probability ε, and different bits are flipped or
generality.
Figure 0.4.3: The binary symmetric channel BSC(ε): each input bit is flipped with probability ε and received correctly with probability 1 − ε.
hard decisions are made at the front end of the receiver, i.e., where the received
First Trial: Suppose that the transmitted bits are independent and
over the BSC(ε). Thus, we send the source bits across the channel as is, without
the insertion of redundant bits. At the receiver we estimate the transmitted bit X
based on the observation Y. The decision rule that minimizes the bit-error
Bayes's rule shows that this is equivalent to maximizing P_{Y|X}(y | x) for the given y.
probability that the estimate differs from the true value, i.e., P_b = P{x̂(Y) ≠ X}, is equal to ε. Since for every information bit we want to convey we send exactly one bit over the channel, we say that this scheme has rate 1. We conclude that with
Second Trial: If the error probability ε is too high for our application, what
transmission strategy can we use to lower it? The simplest strategy is repetition
coding. Assume we repeat each bit k times. To keep things simple, assume that
k is odd. So if X, the bit to be transmitted, has value x then the input to the
channel is x repeated k times; denote the corresponding channel outputs by Y_1, Y_2, ..., Y_k. It is intuitive, and not hard to prove, that the estimator that
$$P_b = \sum_{i > k/2} \binom{k}{i} \epsilon^i (1-\epsilon)^{k-i}. \qquad (0.4.2)$$
Since for every information bit we want to convey we send k bits over the channel, we say that such a scheme has rate 1/k. So with repetition codes we can achieve the (rate, P_b)-pairs $\left(\frac{1}{k},\; \sum_{i>k/2}\binom{k}{i}\epsilon^i(1-\epsilon)^{k-i}\right)$. For P_b to
approach zero we have to choose k larger and larger and as a consequence the
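The behaviour just described can be seen numerically. The following sketch evaluates (0.4.2) for a few odd values of k over a BSC with ε = 0.1 (the function name is ours, not from the text): P_b decreases with k, but only at the price of a rate 1/k tending to zero.

```python
from math import comb

def repetition_error_prob(k, eps):
    """P_b for a length-k repetition code over BSC(eps) with majority decoding.

    Implements (0.4.2): P_b = sum_{i > k/2} C(k, i) eps^i (1 - eps)^(k - i).
    k is assumed odd so the majority vote is never tied.
    """
    return sum(comb(k, i) * eps**i * (1 - eps)**(k - i)
               for i in range(k // 2 + 1, k + 1))

eps = 0.1
for k in (1, 3, 5, 7):
    pb = repetition_error_prob(k, eps)
    print(f"rate = 1/{k}, P_b = {pb:.6f}")
```

For ε = 0.1 this gives P_b = 0.1 at rate 1, 0.028 at rate 1/3, and so on downward, illustrating the rate-versus-reliability trade-off of repetition coding.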
fields to represent it. The most important instance for us is the binary field
of) bits, a natural representation and convenient for the purpose of processing.
If you are not familiar with finite fields, very little is lost if you replace any
M elements from F^n, i.e.,
The elements of the code are called codewords. The parameter n is called the
block length.
In the preceding definition we have introduced binary codes, i.e., codes whose
think of the two field elements as {±1} instead (see the definition of the BSC).
make no distinction between these two cases and talk about binary codes and
symbols per transmitted symbol. For example, in the repetition code let F = GF(2).
information symbol.
Support Set: The support set of a codeword x ∈ C is the set of locations
say that a codeword x ∈ C is minimal if its support set does not contain the
The Hamming distance introduced in the following definition and the derived
Hamming Weight and Hamming Distance: Let u,veF". The Hamming weight
in u, i.e., the cardinality of the support set. The Hamming distance of a pair (u, v),
which we denote by d(u, v), is the number of positions in which u differs from v.
d(u, v) ≥ 0, with equality if and only if u = v. Also, d(u, v) satisfies the triangle inequality d(u, v) ≤ d(u, w) + d(w, v), and so the Hamming distance is a metric in the mathematical sense.
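These definitions translate directly into code; a minimal sketch (names are ours):

```python
def hamming_weight(u):
    """Number of nonzero positions (the cardinality of the support set)."""
    return sum(1 for s in u if s != 0)

def hamming_distance(u, v):
    """Number of positions in which u and v differ."""
    assert len(u) == len(v)
    return sum(1 for a, b in zip(u, v) if a != b)

u, v, w = (0, 1, 1, 0, 1), (1, 1, 0, 0, 1), (1, 0, 0, 1, 1)
assert hamming_distance(u, u) == 0                       # d(u, u) = 0
assert hamming_distance(u, v) == hamming_distance(v, u)  # symmetry
# Triangle inequality: d(u, v) <= d(u, w) + d(w, v).
assert hamming_distance(u, v) <= hamming_distance(u, w) + hamming_distance(w, v)
```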
defined as
called an (n, k) linear block code if and only if its 2^k codewords form a k-dimensional subspace of the vector space V of all the n-tuples over GF(2).
vector space of all the n-tuples over GF(2), there exist k linearly independent
The codeword v = (v_0, v_1, ..., v_{n-1}) for this message is given by the following linear combination. Writing the k linearly independent vectors as the rows of the matrix

$$G = \begin{bmatrix} g_0 \\ g_1 \\ \vdots \\ g_{k-1} \end{bmatrix} = \begin{bmatrix} g_{0,0} & g_{0,1} & \cdots & g_{0,n-1} \\ g_{1,0} & g_{1,1} & \cdots & g_{1,n-1} \\ \vdots & & & \vdots \\ g_{k-1,0} & g_{k-1,1} & \cdots & g_{k-1,n-1} \end{bmatrix}, \qquad (0.5.2)$$

we have

$$v = u \cdot G, \qquad (0.5.3)$$

i.e., v is a linear combination of the rows of matrix G with the information bits in the message u as the coefficients.
linear block code has more than one basis. Consequently, a generator matrix of
a given (n, k) linear block code is not unique. Any choice of a basis of C gives
Since a binary (n, k) linear block code C is a k-dimensional subspace of the vector space V of all the n-tuples over GF(2), its null (or dual) space, denoted C_d, is an (n − k)-dimensional subspace of V; it is thus a binary (n, n − k) linear block code and is called the dual code of C. Let B_d be a basis of C_d, with vectors h_0, h_1, ..., h_{n-k-1}, and form the matrix

$$H = \begin{bmatrix} h_0 \\ h_1 \\ \vdots \\ h_{n-k-1} \end{bmatrix} = \begin{bmatrix} h_{0,0} & h_{0,1} & \cdots & h_{0,n-1} \\ \vdots & & & \vdots \\ h_{n-k-1,0} & h_{n-k-1,1} & \cdots & h_{n-k-1,n-1} \end{bmatrix} \qquad (0.5.5)$$

Then H is a generator matrix of the dual code C_d of the binary (n, k) linear block code C. It follows from (0.5.2), (0.5.4), and (0.5.5) that GH^T = 0, where
$$C = \{v \in \mathrm{GF}(2)^n : vH^T = 0\} \qquad (0.5.6)$$
block code is based on a generator matrix of the code using (0.5.3) and
rank matrix if its rank is equal to the number of rows of H. However, in many
cases, a parity-check matrix of an (n, k) linear block code is not given as a full-
rank matrix, i.e., the number of its rows is greater than its row rank, n - k. In
this case, some rows of the given parity-check matrix H are linear
A widely used class of linear block codes is the Hamming code family
[Hamming (1950)]. For any positive integer m > 3, there exists a Hamming
Length n = 2^m − 1
The parity check matrix H of these codes is formed of the non-zero columns of

$$H = [I_m \mid Q] \qquad (0.5.7)$$

where Q consists of the columns of weight two or more. For m = 3:

n = 2^3 − 1 = 7
k = 2^3 − 3 − 1 = 4
n − k = m = 3

$$H = \begin{bmatrix} 1 & 0 & 0 & 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 & 1 \end{bmatrix}$$
which is the linear block code C(7, 4) that has been analyzed previously. The
generator matrix can be constructed using the following expression for linear
In the parity check matrix H, the sum of three columns can result in the all-zero
vector, and it is not possible for the sum of two columns to give the same
result, and so the minimum distance of the code is d_min = 3. This means that these codes can be used for correcting any error pattern of one error, or detecting any error pattern of up to two errors. In this case there are 2^m − 1 correctable single-error patterns and, on the other hand, there exist 2^m cosets, so that the number of possible correctable error patterns (including the all-zero pattern) is the same as the number of different cosets (syndrome vectors). The codes with this characteristic are called perfect codes.
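A small sketch of syndrome decoding for the (7, 4) Hamming code with the parity check matrix H given above (helper names are ours): since the columns of H are exactly the 2^3 − 1 nonzero syndromes, every single-bit error is corrected.

```python
# Parity-check matrix H of the binary (7, 4) Hamming code from the text.
H = [(1, 0, 0, 1, 0, 1, 1),
     (0, 1, 0, 1, 1, 1, 0),
     (0, 0, 1, 0, 1, 1, 1)]

def syndrome(r):
    """s = r H^T over GF(2)."""
    return tuple(sum(ri * hi for ri, hi in zip(r, row)) % 2 for row in H)

# The syndrome of a weight-one error in position j is column j of H,
# so mapping columns to positions decodes single errors.
col = {tuple(H[i][j] for i in range(3)): j for j in range(7)}

def correct(r):
    """Correct up to one bit error (d_min = 3 implies t = 1)."""
    s = syndrome(r)
    if s == (0, 0, 0):
        return tuple(r)
    r = list(r)
    r[col[s]] ^= 1
    return tuple(r)

# Every single-bit corruption of the all-zero codeword is corrected.
for j in range(7):
    e = [0] * 7
    e[j] = 1
    assert correct(e) == (0,) * 7
```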
$$v^{(1)} = (v_{n-1}, v_0, v_1, \ldots, v_{n-2}) \qquad (0.5.9)$$
Cyclic codes form a very special type of linear block codes. They have encoding
advantage over many other types of linear block codes. Encoding of this type of
Many classes of cyclic codes with large minimum distances have been constructed;
these classes of cyclic codes have been developed. To analyze the structural
coefficients as follows:
$$v(X) = v_0 + v_1 X + \cdots + v_{n-1} X^{n-1} \qquad (0.5.10)$$
corresponding to the all-zero codeword is the zero polynomial. All the other
following without proofs. Berlekamp (1984), Blahut (2003), Blake and Mullin (1975), Clark and Cain (1981), Lin and Costello (2004), MacWilliams and Sloane (1977), Peterson and Weldon (1972) contain good and extensive
In an (n, k) cyclic code C, every nonzero code polynomial has degree at least n
- k but not greater than n - 1. There exists one and only one code polynomial
is called the generator polynomial of the (n, k) cyclic code C. The degree of
g(X) is simply the number of parity-check bits of the code. Since each code
$$G = \begin{bmatrix} 1 & g_1 & g_2 & \cdots & g_{n-k-1} & 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & g_1 & \cdots & g_{n-k-2} & g_{n-k-1} & 1 & 0 & \cdots & 0 & 0 \\ \vdots & & & & & & & & & & \vdots \\ 0 & 0 & 0 & \cdots & & 1 & g_1 & g_2 & \cdots & g_{n-k-1} & 1 \end{bmatrix} \qquad (0.5.13)$$

whose rows are the generator polynomial g(X) as the first row and its k − 1 right cyclic-shifts as the other k − 1 rows. G is not in systematic form but can be put into systematic
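The construction of G from g(X) and its shifts can be sketched as follows, using the standard generator g(X) = 1 + X + X^3 of the (7, 4) cyclic Hamming code as a worked example (function names are ours):

```python
def cyclic_generator_matrix(g, n):
    """Rows are the coefficients of g(X) and its right shifts, as in (0.5.13)."""
    k = n - (len(g) - 1)           # k = n - deg(g) information bits
    rows = []
    for i in range(k):
        row = [0] * n
        for j, gj in enumerate(g):
            row[i + j] = gj        # coefficients of X^i * g(X)
        rows.append(row)
    return rows

# g(X) = 1 + X + X^3 divides X^7 - 1 and generates a (7, 4) cyclic code.
g = [1, 1, 0, 1]
G = cyclic_generator_matrix(g, 7)
assert len(G) == 4 and all(len(r) == 7 for r in G)

# Each codeword u*G is a polynomial multiple of g(X), and cyclic shifts of
# codewords stay in the code; check the cyclic shift of the first row.
codewords = set()
for u in range(16):
    bits = [(u >> t) & 1 for t in range(4)]
    cw = tuple(sum(b * G[t][j] for t, b in enumerate(bits)) % 2 for j in range(7))
    codewords.add(cw)
first = G[0]
shift = tuple([first[-1]] + first[:-1])
assert shift in codewords
```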
The binary Reed-Muller codes were first constructed and explored by Muller
(1954), and a majority logic decoding algorithm for them was described by
Reed (1954). Although their minimum distance is relatively small, they are of
practical importance because of the ease with which they can be implemented
with finite affine and projective geometries [Assmus and Key (1998)]. These
Let m be a positive integer and r a nonnegative integer with r ≤ m. The binary codes we construct will have length 2^m. For each length there will be m + 1 linear codes, denoted R(r, m) and called the r-th order Reed-Muller, or RM, code of length 2^m. The codes R(0, m) and R(m, m) are trivial codes: the 0th order RM code R(0, m) is the binary repetition code of length 2^m with basis {1}, and the m-th order RM code R(m, m) is the entire space GF(2)^{2^m}. For 1 ≤ r < m, define
Let G(0, m) = [1 1 · · · 1] and G(m, m) = I_{2^m}. From the above description, these are generator matrices for R(0, m) and R(m, m), respectively. For 1 ≤ r < m, a generator matrix is

$$G(r,m) = \begin{bmatrix} G(r,m-1) & G(r,m-1) \\ 0 & G(r-1,m-1) \end{bmatrix} \qquad (0.5.15)$$
The generator matrices for R(r, m) with 1 ≤ r < m ≤ 3 are constructed as follows:

$$G(1,2) = \begin{bmatrix} 1&0&1&0\\ 0&1&0&1\\ 0&0&1&1 \end{bmatrix}, \qquad G(1,3) = \begin{bmatrix} 1&0&1&0&1&0&1&0\\ 0&1&0&1&0&1&0&1\\ 0&0&1&1&0&0&1&1\\ 0&0&0&0&1&1&1&1 \end{bmatrix},$$

$$\text{and} \quad G(2,3) = \begin{bmatrix} 1&0&0&0&1&0&0&0\\ 0&1&0&0&0&1&0&0\\ 0&0&1&0&0&0&1&0\\ 0&0&0&1&0&0&0&1\\ 0&0&0&0&1&0&1&0\\ 0&0&0&0&0&1&0&1\\ 0&0&0&0&0&0&1&1 \end{bmatrix}.$$

From these matrices, notice that R(1, 2) and R(2, 3) are both the set of all even weight vectors in GF(2)^4 and GF(2)^8, respectively. Notice also that R(1, 3) is
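The recursion (0.5.15) is straightforward to implement. The sketch below builds G(r, m) recursively and checks the dimension formula (the function name is ours):

```python
def rm_generator(r, m):
    """Generator matrix of the Reed-Muller code R(r, m) via recursion (0.5.15)."""
    if r == 0:
        return [[1] * (2 ** m)]                       # repetition code
    if r == m:
        return [[int(i == j) for j in range(2 ** m)]  # identity: whole space
                for i in range(2 ** m)]
    top = rm_generator(r, m - 1)
    bot = rm_generator(r - 1, m - 1)
    n2 = 2 ** (m - 1)
    # [[G(r,m-1) G(r,m-1)], [0 G(r-1,m-1)]]
    return ([row + row for row in top] +
            [[0] * n2 + row for row in bot])

from math import comb
for r, m in [(1, 2), (1, 3), (2, 3), (2, 4)]:
    G = rm_generator(r, m)
    # dimension = C(m,0) + C(m,1) + ... + C(m,r)
    assert len(G) == sum(comb(m, i) for i in range(r + 1))
    assert all(len(row) == 2 ** m for row in G)
```

For instance, rm_generator(1, 2) reproduces the matrix G(1, 2) displayed above.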
The dimension, minimum weight, and duals of the binary Reed-Muller codes are as follows.
Properties: Let r be an integer with 0 ≤ r ≤ m. Then the following hold:
(ii) The dimension of R(r, m) equals $\binom{m}{0} + \binom{m}{1} + \cdots + \binom{m}{r}$.
0.6.1 The Binary Golay Code: The binary form of the Golay code is one of the most important error correcting codes. A t-error correcting code can correct a maximum of t errors. A perfect t-error correcting code has the property that every word lies within a distance of t to exactly one codeword. Equivalently, the code has d_min = 2t + 1 and covering radius t, where the covering radius r is the smallest number such that every word lies within a distance of r to a codeword.
The inequality in (0.6.1) is known as the Hamming bound. Clearly, a code is perfect precisely when it attains equality in the Hamming bound. Two Golay codes do attain equality, making them perfect codes: the [23, 12] binary code with d_min = 7, and the [11, 6] ternary code with d_min = 5. Both codes have the largest minimum distance for any known code with the same values of n and k.
$$\binom{23}{0} + \binom{23}{1} + \binom{23}{2} + \binom{23}{3} = 2^{11} = 2^{23-12} \qquad (0.6.2)$$

which indicated the possible existence of a [23, 12] perfect binary code that could correct up to three errors. Golay (1949) discovered such a perfect code, and it is the only one known capable of correcting any combination of three or fewer random errors in a block of 23 elements. This [23, 12] Golay code can be
factors of x^23 − 1 over GF(2) [MacWilliams and Sloane (1977)]. Let g_1(x) and g_2(x) be the two irreducible degree-11 factors, so that

$$x^{23} - 1 = (x - 1)\,g_1(x)\,g_2(x).$$

Both the polynomials g_1(x) and g_2(x) generate the [23, 12, 7] Golay code. Let C_23 be the cyclic code generated by g_1(x). It can be easily observed that both the polynomials g_1(x) and g_2(x)
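The sphere-packing equality (0.6.2) behind the existence of the code can be checked directly:

```python
from math import comb

# Left side of (0.6.2): the number of words in a Hamming sphere of radius 3
# around a codeword of length 23.
sphere = sum(comb(23, i) for i in range(4))
assert sphere == 2 ** 11 == 2 ** (23 - 12)

# The 2^12 disjoint spheres of radius 3 therefore tile all of GF(2)^23,
# which is exactly the perfect-code condition.
assert 2 ** 12 * sphere == 2 ** 23
```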
There are several different ways to decode the [23, 12] binary Golay code, including two refined error trapping schemes: the Kasami decoder and the systematic search decoder. Both are explained by Lin and Costello (1983). There are also other systems of decoding, but they are not as good because some of the error-
Let C be any [n, k] code whose minimum distance is odd. We can obtain a new [n + 1, k] code C′ with minimum distance d′_min = d_min + 1 by adding a 0 at the end of each codeword of even weight and a 1 at the end of each codeword of odd weight. This process is called adding an overall parity check, or extension of the code. The [23, 12] Golay code can be extended by adding an overall parity check to each codeword to form the [24, 12] extended Golay
$$A = \begin{bmatrix}
0&1&1&1&1&1&1&1&1&1&1&1\\
1&1&1&0&1&1&1&0&0&0&1&0\\
1&1&0&1&1&1&0&0&0&1&0&1\\
1&0&1&1&1&0&0&0&1&0&1&1\\
1&1&1&1&0&0&0&1&0&1&1&0\\
1&1&1&0&0&0&1&0&1&1&0&1\\
1&1&0&0&0&1&0&1&1&0&1&1\\
1&0&0&0&1&0&1&1&0&1&1&1\\
1&0&0&1&0&1&1&0&1&1&1&0\\
1&0&1&0&1&1&0&1&1&1&0&0\\
1&1&0&1&1&0&1&1&1&0&0&0\\
1&0&1&1&0&1&1&1&0&0&0&1
\end{bmatrix} \qquad (0.6.3)$$
In addition, the 12 × 24 matrix G′ = [A | I_12] is also a generator for the code.
This [24, 12] extended Golay code C_24 has minimum distance d_min = 8 and has
Property 1: The extended binary Golay code C_24 is a doubly-even code, i.e.
Property 3: Unlike the [23, 12] code, the [24, 12] extended Golay code is not perfect, only quasi-perfect: all spheres of radius t are disjoint, but they do not cover the whole space. (A perfect code is defined to be a code for which, for some t, every vector lies in exactly one sphere of radius t about a codeword.)
Property 4: There are 2^12, or 4096, possible codewords in the extended Golay code and, like the non-extended [23, 12] code, it can be used to correct at
The possible weights of codewords in the [24, 12] extended Golay code C_24 are 0, 4, 8, 12, 16, 20 and 24, since C_24 is a doubly-even linear code. But it can be proved that C_24 has neither codewords of weight 4 nor of weight 20 [MacWilliams and Sloane (1977)]. Thus the weights are 0, 8, 12, 16 and 24. For each left side L of a codeword in C_24 there are two possible right sides, R and R̄. If wt(L) = 0, then wt(R) ≠ 4 and wt(R) ≠ 8 (or else wt(R̄) = 4), which is
A counting argument over the left and right halves of the codewords gives

$$A_8 = 759. \qquad (0.6.4)$$

Since 1 + A_8 + A_12 + A_16 + 1 = 2^12 = 4096 and A_16 = A_8 = 759 (complements of octads are codewords), it follows that A_12 = 2576. Hence the weight distribution of the extended binary Golay code C_24 is
i :   0    8     12     16    24
A_i : 1    759   2576   759   1          (0.6.5)
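The weight distribution (0.6.5) can be verified by brute force from the generator matrix G = [I_12 | A], with A the bordered circulant of (0.6.3); a sketch:

```python
from itertools import product

# Bordered-circulant matrix A from (0.6.3); G = [I_12 | A] generates C_24.
top = [0] + [1] * 11
p = [1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]          # cyclically shifted row pattern
A = [top] + [[1] + p[i:] + p[:i] for i in range(11)]
G = [[int(i == j) for j in range(12)] + A[i] for i in range(12)]

# Enumerate all 2^12 codewords and tally their Hamming weights.
dist = {}
for u in product((0, 1), repeat=12):
    cw = [sum(ub * gb for ub, gb in zip(u, col)) % 2 for col in zip(*G)]
    w = sum(cw)
    dist[w] = dist.get(w, 0) + 1

assert dist == {0: 1, 8: 759, 12: 2576, 16: 759, 24: 1}
```

The enumeration confirms both the minimum distance d_min = 8 and the counts A_8 = A_16 = 759, A_12 = 2576.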
Any binary vector of weight 5 and length 24 is covered by exactly one codeword of
the extended Golay code: counting weight-5 vectors in two ways gives $\binom{24}{5} = 759\binom{8}{5}$. Thus the codewords of weight 8 in the extended binary Golay code C_24 form a Steiner system S(5, 8, 24). The codewords of weight 8
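The counting identity behind the Steiner system can be checked in a couple of lines:

```python
from math import comb

# Each of the 759 octads covers C(8, 5) weight-5 vectors, and every weight-5
# vector of length 24 lies in exactly one octad, so the two counts agree:
assert 759 * comb(8, 5) == comb(24, 5)        # 759 * 56 == 42504

# Equivalently, the number of blocks of S(5, 8, 24) is C(24,5) / C(8,5) = 759.
assert comb(24, 5) // comb(8, 5) == 759
```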
Witt (1938) proved an important theorem which states that the Steiner system
S(5, 8, 24) is unique. Other proofs of the theorem are given by Curtis (1976),
Jonsson (1972). Since there is a generator matrix for C_24 all of whose rows have weight 8, it follows that the octads of S(5, 8, 24) generate C_24. This result is used to prove an important property of the extended Golay code: that C_24 is unique in terms of its parameters [Delsarte and Goethals (1975), Pless (1968)].
Since the [23, 12, 7] perfect code C_23 may be obtained by deleting any coordinate of C_24, it too is unique.
Property 7: The Golay code C_24 is a self-dual code, i.e. C_24 = C_24^⊥.
It is well known that the length of any binary doubly-even self-dual code is divisible by eight. Mallows and Sloane (1973) proved the following celebrated bound on the minimum distance of a doubly-even self-dual code of length n:

$$d \le 4\left\lfloor \frac{n}{24} \right\rfloor + 4 \qquad (0.6.6)$$
Property 8: The block intersection numbers λ_ij for the Steiner system formed by the octads are:

0 | 759
1 | 506 253
2 | 330 176 77
3 | 210 120 56 21
4 | 130 80 40 16 5
5 | 78 52 28 12 4 1
6 | 46 32 20 8 4 0 1
7 | 30 16 16 4 4 0 0 1
8 | 30 0 16 0 4 0 0 0 1

Figure 0.6.1: Block intersection numbers λ_ij for the octads in the extended Golay code.
0 | 2576
1 | 1288 1288
2 | 616 672 616
3 | 280 336 336 280
4 | 120 160 176 160 120
5 | 48 72 88 88 72 48
6 | 16 32 40 48 40 32 16
7 | 0 16 16 24 24 16 16 0
8 | 0 0 16 0 24 0 16 0 0

Figure 0.6.2: Generalized block intersection numbers λ_ij for the dodecads in the extended Golay code.
Conway (1971) showed that the Mathieu group M24 preserves the extended binary Golay code C_24, i.e. M24 is the full automorphism group of C_24. The Mathieu
In addition to the binary Golay codes discussed previously, there are also [11, 6, 5] and [12, 6, 6] ternary Golay codes, denoted by C_11 and C_12. The ternary [11, 6, 5] Golay code is the only known perfect nonbinary code. Note that a Hamming sphere with radius 2 over GF(3) contains 243 vectors because

$$1 + 2\binom{11}{1} + 2^2\binom{11}{2} = 243.$$
Since 243 is equal to 3^5, there may be a perfect packing with 3^6 spheres (codewords) of radius t = 2, which was discovered by Golay. The [11, 6] Golay code over the Galois field with three elements GF(3) has minimum distance 5, and can correct up to two errors. As stated previously, like the [23, 12, 7] binary Golay code, the [11, 6, 5] ternary Golay code has the largest minimum distance d_min of any known code with the same values of n and k.
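The sphere-packing arithmetic for the ternary code can be checked directly:

```python
from math import comb

# Volume of a Hamming sphere of radius 2 over GF(3) in length 11:
# 1 center + 2*C(11,1) at distance 1 + 2^2*C(11,2) at distance 2.
sphere = 1 + 2 * comb(11, 1) + 2 ** 2 * comb(11, 2)
assert sphere == 243 == 3 ** 5

# 3^6 codewords, each owning one sphere, tile GF(3)^11 exactly.
assert 3 ** 6 * sphere == 3 ** 11
```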
Like the [23, 12, 7] binary Golay code, the [11, 6, 5] ternary Golay code C_11 may be constructed from the factorization

$$x^{11} - 1 = (x - 1)\,g_1(x)\,g_2(x)$$
Note that g_1(x) = −x^5 g_2(x^{−1}), and so ⟨g_1(x)⟩ and ⟨g_2(x)⟩ are equivalent [11, 6] codes.
Also, like the [23, 12, 7] binary code, the [11, 6, 5] ternary code can be extended by an overall parity check to give the [12, 6, 6] extended ternary Golay code C_12. The generator matrix of C_12 can be given as:
$$G = [I_6 \mid A] = \left[\, I_6 \;\middle|\; \begin{matrix} 0&1&1&1&1&1\\ 1&0&1&2&2&1\\ 1&1&0&1&2&2\\ 1&2&1&0&1&2\\ 1&2&2&1&0&1\\ 1&1&2&2&1&0 \end{matrix} \,\right] \qquad (0.6.7)$$
The [11, 6, 5] ternary code C_11 is a perfect code as it attains the sphere packing bound, i.e. $3^6\left(1 + 2\binom{11}{1} + 2^2\binom{11}{2}\right) = 3^{11}$. The supports of the codewords of
weight 5 of C_11 form the blocks of the Steiner system S(4, 5, 11). The [12, 6, 6] code C_12 is self-dual, so has all weights divisible by 3. The supports of the codewords of weight 6 form the 132 blocks of the Steiner system S(5, 6, 12).
Delsarte and Goethals (1975) and Pless (1968) proved that C_11 with the Steiner system S(4, 5, 11), and C_12 with S(5, 6, 12), are unique. The automorphism group of C_12 is
0.7.1 Definitions
Euclidean N-space R^N that forms a group under ordinary vector addition, i.e.,
dimensions; however, this will not be the case for any lattice considered here,
real lattice, up to scaling, and the prototype of all lattices. The set Z^N of all
lattice is a group; this property leads to the study of subgroups (sublattices) and
such as the Euclidean distance metric and the notion of volume in R^N. The following two sections are concerned with these two aspects of lattice structure.
of A.
3) Cartesian Product: The M-fold Cartesian product of Λ with itself, i.e., the set of all MN-tuples (λ_1, λ_2, ..., λ_M) where each λ_k is in Λ, is an MN-dimensional lattice, denoted Λ^M.
For example, Z^N is the N-fold Cartesian product of Z with itself, and rZ^N is a
Figure 0.7.1: The lattice Z^2 as the union of two cosets of RZ^2 (black and white points).

$$R = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \qquad (0.7.1)$$
and is also illustrated in Fig. (0.7.1). The points in RZ^2 are a subset of the points in Z^2, meaning that RZ^2 is a sublattice of Z^2. Note that R^2 = 2I, where I is the identity operator (in two dimensions), so that R^2 Z^2 = 2Z^2. We can define a
$$R = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & -1 \end{bmatrix} \qquad (0.7.2)$$
Note that R^2 = 2I for any N, where I is the identity operator in 2N dimensions,
difference is a point in Λ. Thus the coset Λ + c is the set of all points equivalent to c modulo Λ.
equivalence classes modulo Λ′ (the equivalence classes may be added modulo Λ′ and form the quotient group Λ/Λ′). We shall say that the order of the partition (or quotient group) Λ/Λ′ is the number |Λ/Λ′| of such equivalence classes (in the mathematical literature, |Λ/Λ′| is usually called the index of Λ′ in Λ). Each equivalence class is a coset of Λ′ (one being Λ′ itself). For example, the order |Z^2/RZ^2| = 2, and Fig. 0.7.1 illustrates Z^2 as the union of two cosets of
If we take one element from each equivalence class, we obtain a system of coset representatives for the partition Λ/Λ′, denoted by [Λ/Λ′]. (In general, there are many ways of selecting such a system [Λ/Λ′], so the notation does not entirely specify the system.) Then every element of Λ can be written uniquely as a sum

$$\Lambda = \Lambda' + [\Lambda/\Lambda'] \qquad (0.7.3)$$
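For the running example Λ = Z^2, Λ′ = RZ^2, the decomposition (0.7.3) can be illustrated as follows (helper names are ours): a point x lies in RZ^2 exactly when x_1 + x_2 is even, so every integer point matches exactly one of the two coset representatives.

```python
# The rotation operator R from (0.7.1).
def apply_R(x):
    """R(x1, x2) = (x1 + x2, x1 - x2)."""
    return (x[0] + x[1], x[0] - x[1])

def in_RZ2(x):
    """x is in R Z^2 iff x = R(u) for some integer u, i.e. x1 + x2 is even."""
    return (x[0] + x[1]) % 2 == 0

reps = [(0, 0), (1, 0)]   # a system of coset representatives [Z^2 / R Z^2]

# Every point of Z^2 in a small window decomposes uniquely as
# (coset representative) + (element of R Z^2).
for x1 in range(-3, 4):
    for x2 in range(-3, 4):
        matches = [c for c in reps if in_RZ2((x1 - c[0], x2 - c[1]))]
        assert len(matches) == 1   # unique representative: |Z^2 / R Z^2| = 2
```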
For example, the two 2-tuples (0, 0) and (1, 0) are a system of coset representatives for the partition Z^2/RZ^2, and every element of Z^2 can be written as the sum of one of these two 2-tuples with an element of RZ^2, i.e., Z^2 is the union of RZ^2 + (0, 0) = RZ^2 and RZ^2 + (1, 0) (the black dots and
m equivalence classes modulo mZ (modulo m), and the order of the partition is m. The integers {0, 1, ..., m − 1} form a system of coset representatives for the partition Z/mZ. For example, Z/2Z has order 2 and divides the integers into two subsets: 2Z (the even
More generally, for any m ∈ Z, the lattice mZ^N of N-tuples of integer multiples of m is a sublattice of Z^N.

$$\Lambda + c = \Lambda' + [\Lambda/\Lambda'] + c \qquad (0.7.4)$$

sublattice of the previous one (in other words, Λ ⊇ Λ′ ⊇ Λ″ ⊇ · · ·). For example,

$$\Lambda = \Lambda'' + [\Lambda'/\Lambda''] + [\Lambda/\Lambda'] \qquad (0.7.5)$$
an integer m:

$$m = a_0 + 2a_1 + 4a_2 + \cdots \qquad (0.7.6)$$

where a_0, a_1, a_2, ... ∈ {0, 1}, and a_0 specifies the coset in the partition Z/2Z, 2a_1 specifies the coset in the partition 2Z/4Z, and so forth. That is,

$$Z = [Z/2Z] + [2Z/4Z] + [4Z/8Z] + \cdots \qquad (0.7.7)$$
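The chain (0.7.6)-(0.7.7) is just the binary expansion read coset by coset; a sketch for nonnegative m (function name is ours):

```python
def coset_chain_digits(m, levels=8):
    """Peel off the coset representative at each level of Z / 2Z / 4Z / ...

    At level t, 2^t * a_t in {0, 2^t} is the representative of the coset of
    2^(t+1) Z in 2^t Z containing what is left of m, as in (0.7.6)-(0.7.7).
    Assumes m is a nonnegative integer.
    """
    digits = []
    for t in range(levels):
        a_t = (m >> t) & 1       # m mod 2^(t+1) determines the coset at level t
        digits.append(a_t)
    return digits

m = 45                            # 45 = 1 + 4 + 8 + 32
digits = coset_chain_digits(m)
assert digits == [1, 0, 1, 1, 0, 1, 0, 0]
# Reassembling the representatives recovers m, as (0.7.6) requires.
assert sum(a * 2 ** t for t, a in enumerate(digits)) == m
```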
The geometry of a real lattice A arises from the geometry of a real Euclidean
N-space R^. The two principal geometrical parameters of A are the minimum
The norm ‖x‖² of a vector x in R^N is the sum of the squares of its coordinates. Norms are nonnegative, and in fact nonzero unless x = 0. The squared distance
Because a lattice A consists of discrete points, the norms of all lattice points
are an infinite set of discrete values that can be enumerated in ascending order.
We call this the weight distribution of the lattice (theta series, in the lattice
By the group property, the set of norms of the differences between any given
point in the lattice and all other points is the same as the set of norms of
the lattice points themselves, since any point λ in Λ may be taken as the
origin. The minimum nonzero norm is thus the minimum squared distance d_min²(Λ)
between any two points in Λ. The number of elements of Λ with this norm is
the number of nearest neighbors of any lattice point (also called the kissing
number). For example, for any N, the integer lattice Z^N has d_min²(Z^N) = 1.
The set of all integer N-tuples of norm 1 is the set of all permutations and
sign changes of the N-tuple (1, 0, ..., 0), so the kissing number of Z^N is 2N.
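The norm-1 count can be confirmed by direct enumeration (an illustrative sketch):

```python
# Enumerate the points of Z^N with norm (squared length) 1: they are the
# permutations and sign changes of (1, 0, ..., 0), giving kissing number 2N.
from itertools import product

def min_vectors(N):
    # the box {-1, 0, 1}^N contains every integer vector of norm 1
    return [v for v in product((-1, 0, 1), repeat=N) if sum(x * x for x in v) == 1]

for N in (1, 2, 3, 4):
    assert len(min_vectors(N)) == 2 * N
print("kissing number of Z^N is 2N")
```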
Loosely, the fundamental volume V(Λ) is the volume of N-space per lattice
point, or the reciprocal of the number of lattice points per unit volume. More
precisely, if N-space can be partitioned into regions of equal volume, one
associated with each lattice point, then V(Λ) is the volume of each such region.
For example, it is easy to see that we may partition N-space into N-cubes of
unit volume, one centered on each point of Z^N, so that V(Z^N) = 1.
To treat the general case, note that R^N is itself a group under ordinary vector
addition (but not a lattice, because its points are not discrete). Any real N-
dimensional lattice Λ is a subgroup of R^N, so it induces a partition R^N/Λ of
N-space into equivalence classes modulo Λ. A fundamental region of Λ is a
region of N-space that contains one and only one point from each such
equivalence class modulo Λ. Although a fundamental region can be chosen in many
ways, every fundamental region must have the same volume, namely the
fundamental volume V(Λ).
From the two geometrical parameters d_min²(Λ) and V(Λ) we define the normalized
parameter

γ(Λ) = d_min²(Λ) / V(Λ)^(2/N)   (0.7.8)

(in the mathematical literature this is called Hermite's parameter, and it is
also known as the coding gain of the lattice).
Again, we stipulate that the only such lattices to be considered here will be
of even real dimension 2N. Any 2N-dimensional real lattice Λr has a
corresponding N-dimensional complex lattice Λc, formed by taking each pair of
coordinates of Λr to specify the real and imaginary parts of each coordinate of
Λc, or vice versa. Addition of two points gives the same result in either case.
Sublattices, cosets, and all such group properties carry over. Even the norm of
two corresponding vectors is the same, so distances are not affected by whether
the lattice is viewed as real or complex. For all parameters previously defined
(e.g., d_min²(Λ), V(Λ), γ(Λ)), we may define the values for a complex lattice
to be the same as those for the corresponding real lattice. A complex lattice
Λc may be scaled by either a real number r or a complex number α, the latter
combining a scaling with a rotation.
The inner product (x, y) of two real vectors x and y is the sum of the products
of their coordinates, and is always real; the (Hermitian) inner product (x, y)
of two complex vectors x and y is the sum of the products of the coordinates of
x with the complex conjugates of the corresponding coordinates of y. The
simplest complex lattice is the lattice G corresponding to the two-dimensional
real lattice Z^2. The point (a, b) of Z^2 corresponds to the complex number
a + bi, where a and b are integers. The set G is called the set of Gaussian
integers. The Gaussian integers are closed under complex multiplication: the
product of two elements of G yields another element of G, which cannot be 0
unless one of the two elements is 0 (in fact, their norms multiply as real
integers). Thus G is a ring and, in fact, an integral domain with unique
factorization into primes. Its units (invertible elements) are ±1 and ±i, and
the primes are the elements that are divisible only by units and their own
associates. The first few primes, in order of increasing norm, are 1 ± i,
2 ± i, and 3, with norms 2, 5, 9, .... We denote the prime of least norm by
φ = 1 + i. (Note that φφ* = |φ|² = 2, and thus 2 is not a prime in G.) Any
nonzero element g of G generates a sublattice gG of G. The partition G/gG must
have order |g|² (the norm of g).
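Two of these statements are easy to check numerically; the following sketch (illustrative only) verifies that norms multiply and that the partition G/φG with φ = 1 + i has order |φ|² = 2:

```python
# Gaussian integers: norms multiply, and G/phi*G has order 2 for phi = 1 + 1j.

def norm(z):
    return round(z.real) ** 2 + round(z.imag) ** 2

a, b = 2 + 1j, 1 + 3j
assert norm(a * b) == norm(a) * norm(b)   # norms multiply as real integers

def in_phiG(c, d):
    # (1+1j)*(x+y*1j) = c+d*1j has integer solutions x = (c+d)/2, y = (d-c)/2
    return (c + d) % 2 == 0

labels = {in_phiG(c, d) for c in range(-4, 5) for d in range(-4, 5)}
assert labels == {True, False}            # exactly two cosets in G/phi*G
print("|G/phiG| = 2")
```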
BCH codes are linear and cyclic block codes that can be considered a
generalization of the Hamming codes, as they can be designed for any value of
the error-correction capability t. These codes are defined over the binary
field GF(2), and also, in their non-binary version, over the Galois field
GF(q). BCH codes are able to correct any error pattern of size t or less
[Bose and Chaudhuri (1960), Hocquenghem (1959)]; in this sense they generalize
the codes for t = 1 (Hamming codes) to codes for any desired higher value of t
(BCH codes). The basic existence result is the following: for any positive
integers m ≥ 3 and t < 2^(m−1), there exists a binary BCH code with block
length n = 2^m − 1, number of parity check bits n − k ≤ mt, and minimum
Hamming distance d_min ≥ 2t + 1. These codes are able to correct any error
pattern of size t or less, in a code vector of n bits.
It is also true that g(X) has α^i and its conjugates as roots. On the other
hand, if a polynomial has α, α², ..., α^(2t) and their conjugates as roots,
then it is a code polynomial. However, due to the repetition of conjugate
roots, it can be shown that the generator polynomial g(X) can be formed with
only the odd-index minimal polynomials:

g(X) = LCM{φ1(X), φ3(X), ..., φ_(2t−1)(X)}

Since the degree of each minimal polynomial is m or less, the degree of g(X) is
at most mt. As BCH codes are cyclic codes, this means that the value of n − k
can be at most mt. The Hamming codes are a particular class of BCH codes, for
which the generator polynomial is g(X) = φ1(X). A BCH code for t = 1 is then a
Hamming code, generated by a primitive polynomial of degree m.
φ5(X) = 1 + X + X^2
A BCH code for correcting error patterns of size t = 2 or less, and with block
length n = 15, has the generator polynomial

g(X) = LCM{φ1(X), φ3(X)}
     = φ1(X)φ3(X)
     = (1 + X + X^4)(1 + X + X^2 + X^3 + X^4)
     = 1 + X^4 + X^6 + X^7 + X^8

This is the BCH code C_BCH(15, 7) with minimum Hamming distance d_min ≥ 5. For
t = 3 there is a BCH code C_BCH(15, 5) with minimum Hamming distance
d_min ≥ 7, which can be constructed using the generator polynomial

g(X) = φ1(X)φ3(X)φ5(X)
     = (1 + X + X^4)(1 + X + X^2 + X^3 + X^4)(1 + X + X^2)
     = 1 + X + X^2 + X^4 + X^5 + X^8 + X^10
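These products are simple to recompute; a sketch over GF(2), with polynomials stored as coefficient lists:

```python
# Recompute g(X) for the BCH(15, 7) code: g = phi1 * phi3 over GF(2).

def poly_mul_gf2(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= a & b          # coefficient arithmetic modulo 2
    return out

phi1 = [1, 1, 0, 0, 1]                   # 1 + X + X^4
phi3 = [1, 1, 1, 1, 1]                   # 1 + X + X^2 + X^3 + X^4
g = poly_mul_gf2(phi1, phi3)
assert g == [1, 0, 0, 0, 1, 0, 1, 1, 1]  # 1 + X^4 + X^6 + X^7 + X^8
assert len(g) - 1 == 15 - 7              # deg g = n - k = 8
print("g(X) coefficients:", g)
```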
As a result of the definition of a linear binary block BCH code C_BCH(n, k) for
correcting error patterns of size t or less, and with code length n = 2^m − 1,
it is possible to affirm that any code polynomial

c(X) = c0 + c1 X + c2 X^2 + ... + c_(n−1) X^(n−1)

of C_BCH(n, k) has the powers α^i of the primitive element α, for
i = 1, 2, ..., 2t, as roots:

c(α^i) = c0 + c1 α^i + c2 α^2i + ... + c_(n−1) α^((n−1)i) = 0

In matrix form,

(c0, c1, c2, ..., c_(n−1)) ∘ (1, α^i, α^2i, ..., α^((n−1)i))^T = 0   (0.8.5)

The inner product of the code vector (c0, c1, c2, ..., c_(n−1)) and the vector
of roots is therefore equal to zero. Arranging the root vectors for
i = 1, 2, ..., 2t as the rows of a matrix gives
H = [ 1  α        α^2          α^3          ...  α^(n−1)
      1  α^2      (α^2)^2      (α^2)^3      ...  (α^2)^(n−1)
      1  α^3      (α^3)^2      (α^3)^3      ...  (α^3)^(n−1)
      ...
      1  α^(2t)   (α^(2t))^2   (α^(2t))^3   ...  (α^(2t))^(n−1) ]   (0.8.6)

so that every code vector c = (c0, c1, ..., c_(n−1)) satisfies

c ∘ H^T = 0   (0.8.7)
From this point of view, the linear binary block BCH code C_BCH(n, k) is the
dual of the row space of the matrix H, and this matrix is in turn its parity
check matrix. If for some i and some j, α^j is a conjugate of α^i, then
c(α^j) = 0 whenever c(α^i) = 0. This means that the inner product of
c = (c0, c1, c2, ..., c_(n−1)) with the row of H corresponding to α^j is
automatically zero, so such rows can be omitted in the construction of the
matrix H, which reduces to the rows of odd index:

H = [ 1  α          α^2            ...  α^(n−1)
      1  α^3        (α^3)^2        ...  (α^3)^(n−1)
      1  α^5        (α^5)^2        ...  (α^5)^(n−1)
      ...
      1  α^(2t−1)   (α^(2t−1))^2   ...  (α^(2t−1))^(n−1) ]   (0.8.8)
Example: For the binary BCH code C_BCH(15, 7) of length n = 2^4 − 1 = 15, able
to correct any error pattern of size t = 2 or less, and α being a primitive
element of GF(2^4), the parity check matrix is

H = [ 1  α    α^2  α^3  α^4   α^5  α^6  α^7  α^8  α^9   α^10  α^11  α^12  α^13  α^14
      1  α^3  α^6  α^9  α^12  α^0  α^3  α^6  α^9  α^12  α^0   α^3   α^6   α^9   α^12 ]

Expanding each power of α into its binary representation as a 4-tuple over
GF(2) gives the binary parity check matrix

H = [ 1 0 0 0 1 0 0 1 1 0 1 0 1 1 1
      0 1 0 0 1 1 0 1 0 1 1 1 1 0 0
      0 0 1 0 0 1 1 0 1 0 1 1 1 1 0
      0 0 0 1 0 0 1 1 0 1 0 1 1 1 1
      1 0 0 0 1 1 0 0 0 1 1 0 0 0 1
      0 0 0 1 1 0 0 0 1 1 0 0 0 1 1
      0 0 1 0 1 0 0 1 0 1 0 0 1 0 1
      0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 ]
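The binary expansion of H can be reproduced programmatically. The sketch below builds GF(2^4) from the primitive polynomial 1 + X + X^4 (the polynomial φ1(X) used above) and expands the symbolic rows α^i and α^3i into 4-bit rows:

```python
# Rebuild the 8x15 binary parity check matrix of BCH(15, 7).

def gf16_powers():
    """alpha^0 .. alpha^14 as 4-bit integers, with alpha^4 = alpha + 1."""
    p, powers = 1, []
    for _ in range(15):
        powers.append(p)
        p <<= 1
        if p & 0b10000:
            p ^= 0b10011            # reduce modulo X^4 + X + 1
    return powers

alpha = gf16_powers()
rows = []
for mult in (1, 3):                 # symbolic rows alpha^i and alpha^(3i)
    for bit in range(4):            # each field element becomes 4 binary rows
        rows.append([(alpha[(mult * i) % 15] >> bit) & 1 for i in range(15)])

assert rows[0] == [1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1]
assert rows[4] == [1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1]
print(len(rows), "binary rows")     # 8 = n - k
```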
A binary BCH code designed to correct error patterns of size t or less has
minimum Hamming distance d_min ≥ 2t + 1, so that any 2t or fewer columns of its
corresponding parity check matrix H are linearly independent (no such set of
columns sums to the zero vector). BCH codes are linear block codes, and so the
minimum distance is equal to the minimum Hamming weight of a non-zero code
vector. Should there exist a non-zero code vector of weight p_H ≤ 2t, with
non-zero components c_j1, c_j2, ..., c_jp_H, then evaluating the code
polynomial at the roots α, α^2, ..., α^(2t) would give:
c_j1 α^(i·j1) + c_j2 α^(i·j2) + ... + c_jp_H α^(i·jp_H) = 0,  i = 1, 2, ..., 2t   (0.8.9)

Taking the first p_H of these conditions and writing them in matrix form,

(c_j1, c_j2, ..., c_jp_H) ∘ [ α^j1      (α^j1)^2      ...  (α^j1)^p_H
                              α^j2      (α^j2)^2      ...  (α^j2)^p_H
                              ...
                              α^jp_H    (α^jp_H)^2    ...  (α^jp_H)^p_H ] = 0   (0.8.10)

which involves a p_H x p_H matrix that fits the result indicated in equation
(0.8.9). Since the components c_j1, ..., c_jp_H are all non-zero, the
determinant of this matrix must be zero:

det [ α^j1     (α^j1)^2     ...  (α^j1)^p_H
      ...
      α^jp_H   (α^jp_H)^2   ...  (α^jp_H)^p_H ] = 0   (0.8.11)

Factoring α^jl out of the l-th row gives

α^(j1+j2+...+jp_H) · det [ 1  α^j1    ...  (α^j1)^(p_H−1)
                           1  α^j2    ...  (α^j2)^(p_H−1)
                           ...
                           1  α^jp_H  ...  (α^jp_H)^(p_H−1) ] = 0   (0.8.12)

The remaining matrix is a Vandermonde matrix, which has a non-zero
determinant [Lin and Costello (1983), Blaum (2001)]. Thus, the initial
assumption that p_H ≤ 2t is not valid, and the minimum Hamming distance of a
binary BCH code is at least 2t + 1.
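For a small code the bound can also be confirmed exhaustively. The sketch below enumerates all 2^7 codewords of the C_BCH(15, 7) example as multiples of g(X) = 1 + X^4 + X^6 + X^7 + X^8, packing polynomials into integer bitmasks:

```python
# Brute-force the minimum distance of the cyclic code generated by g(X).

G_POLY = 0b111010001                # bit i = coefficient of X^i in g(X)

def encode(m):
    """Multiply the message polynomial m(X), deg < 7, by g(X) over GF(2)."""
    c = 0
    for i in range(7):
        if (m >> i) & 1:
            c ^= G_POLY << i
    return c

def weight(x):
    return bin(x).count("1")

dmin = min(weight(encode(m)) for m in range(1, 1 << 7))
print("minimum distance:", dmin)    # 5 = 2t + 1 for t = 2
assert dmin == 5
```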
The value 2t + 1 is called the designed distance of a BCH code, but the actual
minimum distance can be higher. Binary BCH codes can also be designed with
block lengths less than 2^m − 1, in a similar way to that described for BCH
codes of length equal to 2^m − 1. If β is an element of GF(2^m) of order n, and
φi(X) is the minimal polynomial of β^i, then let

g(X) = LCM{φ1(X), φ2(X), ..., φ_2t(X)}   (0.8.13)
Since β^n = 1, the elements β, β^2, ..., β^(2t) are roots of X^n + 1. Therefore
the cyclic code generated by g(X) is a code of code length n. It can be shown,
in the same way as for binary BCH codes of code length n = 2^m − 1, that the
number of parity check bits is not greater than mt, and that the minimum
Hamming distance is at least d_min ≥ 2t + 1. The above analysis provides a more
general definition of a binary BCH code: if β is an element of GF(2^m) and l0
is a non-negative integer, then the binary BCH code with a designed minimum
distance d0 is the cyclic code generated by

g(X) = LCM{φ1(X), φ2(X), ..., φ_(d0−1)(X)}   (0.8.14)
Here, φi(X) is the minimal polynomial of β^(l0+i−1) and ni is its order. The
code length is

n = LCM{n1, n2, ..., n_(d0−1)}   (0.8.15)

The designed binary BCH code has minimum distance at least d0, a maximum number
m(d0 − 1) of parity check bits, and is able to correct any error pattern of
size ⌊(d0 − 1)/2⌋ or less.
If l0 = 1 and β is a primitive element of GF(2^m), the code length of the
binary BCH code is n = 2^m − 1. In this case the binary BCH code is said to be
primitive. If β is not a primitive element of GF(2^m), then the code length of
the binary BCH code is not n = 2^m − 1, but is equal to the order of β. In this
case the binary BCH code is said to be non-primitive. The presence of d0 − 1
consecutive powers of β among the roots of the generator polynomial g(X)
ensures that the binary BCH code has a minimum Hamming distance of at least d0.
In a more general way, BCH codes can be defined as follows [Ling and Xing
(2004)]. Let α be a primitive element of GF(q^m), and let M^(i)(X) be the
minimal polynomial of α^i with respect to GF(q). A (primitive) BCH code over
GF(q) of length n = q^m − 1 and designed distance δ has the generator
polynomial
g(X) = LCM{M^(a)(X), M^(a+1)(X), ..., M^(a+δ−2)(X)} = ∏_{i∈S} (X − α^i)   (0.8.17)

where S is the union of the cyclotomic cosets C_a, C_(a+1), ..., C_(a+δ−2).
Hence, the dimension is equal to q^m − 1 − deg(g(X)) = q^m − 1 − |S|. As the
set S is the union of at most δ − 1 cyclotomic cosets, each of cardinality at
most m,

k = q^m − 1 − |S|
  = q^m − 1 − |C_a ∪ C_(a+1) ∪ ... ∪ C_(a+δ−2)|
  ≥ q^m − 1 − (|C_a| + |C_(a+1)| + ... + |C_(a+δ−2)|)
  ≥ q^m − 1 − m(δ − 1)   (0.8.18)
The above result shows that, in order to find the dimension of a q-ary BCH code
of length q^m − 1 and designed distance δ (k = q^m − 1 − |C_a ∪ ... ∪
C_(a+δ−2)|), it is sufficient to check the cardinality of the union of the
cosets C_a, ..., C_(a+δ−2), where C_i is the cyclotomic coset of q modulo
q^m − 1 containing i.

Example: For q = 2 and m = 4, the cyclotomic cosets containing 1, 2, 3, 4 are

C1 = C2 = C4 = {1, 2, 4, 8},  C3 = {3, 6, 12, 9}

Then the dimension of the binary BCH code of length 15 of designed distance 5
is k = 15 − |C1 ∪ C2 ∪ C3 ∪ C4| = 15 − 8 = 7.
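The coset computation is mechanical; a sketch:

```python
# Cyclotomic cosets of 2 modulo 15 and the dimension of the narrow-sense
# binary BCH code of length 15 with designed distance 5.

def cyclotomic_coset(i, q, n):
    coset, x = set(), i % n
    while x not in coset:
        coset.add(x)
        x = (x * q) % n
    return coset

assert cyclotomic_coset(1, 2, 15) == {1, 2, 4, 8}
assert cyclotomic_coset(3, 2, 15) == {3, 6, 9, 12}

S = set().union(*(cyclotomic_coset(i, 2, 15) for i in range(1, 5)))
k = 15 - len(S)
print("dimension:", k)   # 7
assert k == 7
```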
Example: For t ≥ 1, t and 2t belong to the same cyclotomic coset of 2 modulo
2^m − 1. Hence M^(2t)(X) = M^(t)(X), and

{M^(1)(X), M^(2)(X), ..., M^(2t−1)(X)} = {M^(1)(X), M^(2)(X), ..., M^(2t)(X)}

i.e., the narrow-sense binary BCH codes of length 2^m − 1 with designed
distance 2t + 1 are the same as the narrow-sense binary BCH codes of length
2^m − 1 with designed distance 2t. In Table 0.8.1 we list the dimensions of
narrow-sense binary BCH codes of length 2^m − 1 with designed distance 2t + 1,
for 3 ≤ m ≤ 6. Note that the dimension can exceed the lower bound 2^m − 1 − mt.
Table 0.8.1

n    k    t        n    k    t
7    4    1        63   51   2
15   11   1        63   45   3
15   7    2        63   39   4
15   5    3        63   36   5
31   26   1        63   30   6
31   21   2        63   24   7
31   16   3        63   18   10
31   11   5        63   16   11
31   6    7        63   10   13
63   57   1        63   7    15
(i) A narrow-sense q-ary BCH code of length n = q^m − 1 with designed distance
δ has dimension exactly q^m − 1 − m(δ − 1) if q ≠ 2 and gcd(q^m − 1, e) = 1 for
all 1 ≤ e ≤ δ − 1.

From the dimension of a q-ary BCH code, we know that the dimension is equal to
q^m − 1 − |C_1 ∪ C_2 ∪ ... ∪ C_(δ−1)| for a narrow-sense code (a = 1). Hence,
it is sufficient to prove that |C_i| = m for all 1 ≤ i ≤ δ − 1, and that the
cosets C_i are pairwise distinct.
For any integer 1 ≤ t ≤ m − 1, we claim that i ≢ q^t · i (mod q^m − 1) for
1 ≤ i ≤ δ − 1; this shows that |C_i| = m. For any integers 1 ≤ i < j ≤ δ − 1,
we claim that j ≢ q^s · i (mod q^m − 1) for any integer s ≥ 0; this shows that
the cosets C_i and C_j are distinct. Otherwise, we would have
j − i ≡ (q^s − 1)i (mod q^m − 1), which forces a contradiction with the
condition gcd(q^m − 1, e) = 1 for all 1 ≤ e ≤ δ − 1.

(ii) Since a narrow-sense binary BCH code with designed distance 2t is the same
as a narrow-sense binary BCH code with designed distance 2t + 1, it is
sufficient to consider codes of odd designed distance. In the binary case the
dimension satisfies

k = 2^m − 1 − |C_1 ∪ C_2 ∪ ... ∪ C_(2t)|
k = 2^m − 1 − |C_1 ∪ C_3 ∪ ... ∪ C_(2t−1)|
  ≥ 2^m − 1 − (|C_1| + |C_3| + ... + |C_(2t−1)|)
  ≥ 2^m − 1 − tm
  = 2^m − 1 − m(δ − 1)/2
In this subsection we describe how many interesting and important codes arise
by modifying or combining existing codes. Let C be an [n, k, d] code over F_q.
We can puncture C by deleting the same coordinate i in each codeword. The
resulting code is still linear; its length is n − 1, and we denote it by C*. A
generator matrix for C* is obtained from a generator matrix of C by deleting
column i (and omitting a zero or duplicate row that may occur). What are the
dimension and minimum weight of C*? Because C contains q^k codewords, the only
way that C* could contain fewer codewords is if two codewords of C agree in
all but the deleted
coordinate.
Property 0.9.1 Let C be an [n, k, d] code over F_q, and let C* be the code C
punctured on the ith coordinate. If d > 1, then C* is an [n − 1, k, d*] code,
where d* = d − 1 if C has a minimum weight codeword with a non-zero ith
coordinate, and d* = d otherwise.
Example 0.9.1 Let C be the [5, 2, 2] binary code with generator matrix

G = [ 1 1 0 0 0
      0 0 1 1 1 ]

Let C*_1 and C*_5 be the codes obtained by puncturing C on its first and fifth
coordinates, respectively. Their generator matrices are

G*_1 = [ 1 0 0 0          G*_5 = [ 1 1 0 0
         0 1 1 1 ]   and           0 0 1 1 ]

Thus C*_1 is a [4, 2, 1] code, while C*_5 is a [4, 2, 2] code.
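Example 0.9.1 can be reproduced by brute force (a sketch):

```python
# Puncture the [5, 2, 2] binary code of Example 0.9.1 on coordinates 1 and 5.
from itertools import product

G = ((1, 1, 0, 0, 0), (0, 0, 1, 1, 1))

def codewords(gen):
    n = len(gen[0])
    return {tuple(sum(a * row[j] for a, row in zip(coeffs, gen)) % 2
                  for j in range(n))
            for coeffs in product((0, 1), repeat=len(gen))}

def punctured(words, i):
    return {w[:i] + w[i + 1:] for w in words}

def dmin(words):
    return min(sum(w) for w in words if any(w))

C = codewords(G)
print(dmin(punctured(C, 0)), dmin(punctured(C, 4)))   # 1 2
assert (dmin(punctured(C, 0)), dmin(punctured(C, 4))) == (1, 2)
```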
We can create longer codes by adding a coordinate. There are many possible
ways to extend a code, but the most common is to choose the extension so that
the new code has only even-like vectors (vectors whose coordinates sum to
zero). If C is an [n, k, d] code over F_q, the extended code is the code of
length n + 1 obtained by appending to each codeword the overall parity check
coordinate c_n = −(c_0 + c_1 + ... + c_(n−1)). Let G and H be generator and
parity check matrices, respectively, for C. Then a generator matrix for the
extended code can be obtained from G by adding an overall parity check column,
and a parity check matrix for it is

H' = [ 1 ... 1 1
       H       0 ]
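In the binary case the extension just appends an overall parity bit; a minimal sketch:

```python
# Extend a binary code by an overall parity check, making all codewords even-like.

def extend(word):
    return word + (sum(word) % 2,)

# the [5, 2, 2] code of Example 0.9.1
C = {(0, 0, 0, 0, 0), (1, 1, 0, 0, 0), (0, 0, 1, 1, 1), (1, 1, 1, 1, 1)}
C_ext = {extend(c) for c in C}
assert all(sum(c) % 2 == 0 for c in C_ext)   # every extended codeword is even-like
print(sorted(C_ext))
```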
Consider the set C(T) of codewords which are 0 on T; this set is a subcode of
C. Puncturing C(T) on T gives a code over F_q of length n − |T| called the code
shortened on T. Let C be an [n, k, d] code over F_q and let T be a set of t
coordinates. Then:
For i ∈ {1, 2} let C_i be an [n_i, k_i, d_i] code, both over the same finite
field F_q. Then their direct sum is the [n_1 + n_2, k_1 + k_2, min(d_1, d_2)]
code

C_1 ⊕ C_2 = {(c_1, c_2) | c_1 ∈ C_1, c_2 ∈ C_2}
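A small sketch checking the stated parameters of a direct sum:

```python
# Direct sum of a [3, 1, 3] and a [2, 2, 1] binary code: a [5, 3, 1] code.

C1 = {(0, 0, 0), (1, 1, 1)}                    # [3, 1, 3] repetition code
C2 = {(0, 0), (0, 1), (1, 0), (1, 1)}          # [2, 2, 1] full space

direct_sum = {c1 + c2 for c1 in C1 for c2 in C2}
assert len(direct_sum) == len(C1) * len(C2)    # q^(k1 + k2) codewords
dmin = min(sum(w) for w in direct_sum if any(w))
assert dmin == 1                               # min(d1, d2) = min(3, 1)
print("[5, 3, 1] code, as expected")
```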
Two codes of the same length can be combined to form a third code of twice
the length in a way similar to the direct sum construction. Let C_i be an
[n, k_i, d_i] code for i ∈ {1, 2}, both over the same finite field F_q. The
(u | u + v) construction produces the code of length 2n
C = {(u | u + v) | u ∈ C_1, v ∈ C_2}
If C_i has generator matrix G_i and parity check matrix H_i, then generator and
parity check matrices for C are

[ G_1  G_1 ]         [  H_1  0   ]
[ 0    G_2 ]   and   [ −H_1  H_2 ]

Unlike the direct sum, the (u | u + v) construction can produce codes that are
important for more than purely theoretical reasons.
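A sketch of the (u | u + v) construction on two small binary codes; for this construction the minimum distance is known to be min(2d_1, d_2):

```python
# (u | u + v) construction from a [3, 1, 3] and a [3, 2, 2] binary code.

C1 = {(0, 0, 0), (1, 1, 1)}                         # [3, 1, 3]
C2 = {(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1)}   # [3, 2, 2] even-weight code

def xor(u, v):
    return tuple((a + b) % 2 for a, b in zip(u, v))

C = {u + xor(u, v) for u in C1 for v in C2}
assert len(C) == len(C1) * len(C2)                  # k = k1 + k2 = 3
dmin = min(sum(w) for w in C if any(w))
assert dmin == 2                                    # min(2*d1, d2) = min(6, 2)
print("[6, 3, 2] code")
```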