

University of Surrey
Department of Computing
School of Electronics Computing and Mathematics

Group Theory & Error Detecting / Correcting Codes

Sotiris Moschoyiannis

Technical Report SCOMP-TC-02-01

December 2001
Group Theory

and

Error Detecting / Correcting Codes

S. K. MOSCHOYIANNIS

Submitted for the Degree of


Master of Science in Information Systems
from the University of Surrey

Department of Computing
School of Electronics, Computing & Mathematics
University of Surrey
Guildford, Surrey GU2 7XH, UK

September 2001

Supervised by: Dr M. W. SHIELDS

© S. K. Moschoyiannis 2001

ABSTRACT

At the dawn of the 21st century it is more than obvious that the information age is upon us. Technological
developments such as orbiting and geo-stationary satellites, deep-space telescopes, high-speed computers,
compact disks, digital versatile disks, high-definition television and international networks allow massive
amounts of information to be transmitted, stored and retrieved. Efficient communication of
information, in terms of speed, economy, accuracy and reliability, is becoming an essential process. Since
its origins, the field of error detecting / correcting codes arose in response to practical problems in the
reliable communication of digital information.

Natural communication systems such as our eyes or the English language use mechanisms to achieve
reliability. Our eyes, when we are disoriented, use experience to guess the meaning of what they see,
heavily depending on the various independent guessing mechanisms of our brains. The English language
makes use of built-in restrictions to ensure that most sequences of letters do not form words, so as to allow
very few candidates for the correct version of a misspelled word.

However, when processing digital information, the lack of assumptions about the original message provides
no statistic upon which to base a reasonable guess. Therefore, robust communication systems employ error
detecting / correcting codes to combat the noise in transmission and storage systems. These codes obtain
error control capability by adding redundant digits in a systematic fashion to the original message so that
the receiver terminal can reproduce the message if altered during transmission. In order to ensure that
redundancy will allow for error detection / correction, various mathematical methods are invoked in the
design of error control codes.

This study aims to present the algebraic techniques applied in developing error detecting / correcting
codes. Many of these techniques are based on standard abstract algebra, particularly the theory of groups
and other algebraic structures such as rings, vector spaces and matrices over finite fields.

These mathematical concepts are discussed, focusing on their relation to error control coding, as they
provide the basis for constructing efficient error detecting / correcting codes.

ACKNOWLEDGEMENTS

I would like to thank my supervisor Dr. M. W. Shields for his guidance until the completion of this project.
He made me feel I had the appropriate degree of freedom in conducting this study, and at the same time his
valuable comments and suggestions at each phase gave me direction towards the next stage.

I would also like to thank my family for their continuous encouragement throughout this study. Their
financial support enabled me to concentrate solely on this project.

Many thanks to my friend Mary for her constant support and assistance. Her printer facilitated numerous
corrections on pre-drafts of my work.

CONTENTS

Abstract......................................................................................................................................iv

Contents......................................................................................................................................vi

1. Introduction............................................................................................................................1

2. Principles of Error Control.....................................................................................................3


2.1 Basic Binary Codes........................................................................................................4
2.1.1 Repetition Codes.................................................................................................4
2.1.2 Single – Parity-Check Codes..............................................................................4
2.1.3 Observation – Comparison.................................................................................4
2.2 Shannon’s Theorem......................................................................................................5
2.2.1 Observation.........................................................................................................5
2.2.2 Maximum Likelihood Decoding.........................................................................6

3. Mathematical Background I...................................................................................................7


3.1 Groups............................................................................................................................7
3.2 Fields..............................................................................................................................8
3.3 Vector Spaces.................................................................................................................9
3.4 Matrices........................................................................................................................12

4. Linear Block Codes..............................................................................................................13


4.1 Block Coding................................................................................................................13
4.1.1 Vector Representation........................................................................................14
4.2 Definition of Linear Block Codes.................................................................................14
4.3 Matrix Description........................................................................................................14
4.3.1 Generator Matrix...............................................................................................15
4.3.2 Parity-Check Matrix..........................................................................................16
4.3.3 G, H in Systematic Form...................................................................................16
4.4 Minimum Distance.......................................................................................................18
4.4.1 Definition..........................................................................................................18
4.4.2 Minimum Distance and Error Detection / Correction.......................................19
4.5 Error Processing for Linear Codes...............................................................................21
4.5.1 Standard Array...................................................................................................22
4.5.1.1 Limitations of the Standard Array Decoding.......................................23

4.5.2 Syndromes of Words.........................................................................................24
4.5.2.1 Observation........................................................................................26

5. Hamming Codes – Golay Codes – RM Codes.....................................................................27


5.1 Hamming Codes..........................................................................................................27
5.1.1 Programming the [7,4] Hamming Code..........................................................28
5.1.1.1 Perl Program for Systematic Encoding..............................................29
5.1.1.2 Perl Program for Syndrome Decoding...............................................31
5.2 Golay Codes.................................................................................................................35
5.3 Reed-Muller Codes......................................................................................................36
5.3.1 Definition..........................................................................................................36
5.3.2 Properties of RM Codes....................................................................................37
5.4 Hadamard Codes..........................................................................................................37

6. Mathematical Background II...............................................................................................39


6.1 Structure of Finite Fields.............................................................................................39
6.1.1 Basic Properties of Finite Fields.......................................................................39
6.1.2 Primitive Polynomials.......................................................................................42
6.1.3 Finite Fields of Order p^m..................................................................................44
6.2 Polynomials over Galois Fields..................................................................................48
6.2.1 Euclid’s Algorithm...........................................................................................48
6.2.2 Minimal Polynomials.......................................................................................50
6.2.3 Factorisation of x^n – 1......................................................................................51
6.2.4 Ideals................................................................................................................52

7. Cyclic Codes........................................................................................................................54
7.1 Polynomial Representation.........................................................................................54
7.2 Cyclic Codes as Ideals................................................................................................55
7.3 Parity-Check Polynomial – Generator Matrix for Cyclic Codes................................56
7.4 Systematic Encoding for Cyclic Codes.......................................................................59

8. BCH Codes – Reed-Solomon Codes...................................................................................61


8.1 BCH Codes.................................................................................................................61
8.1.1 Parity-Check Matrix for BCH Codes...............................................................62
8.2 Reed-Solomon Codes..................................................................................................63
8.3 Decoding Non-binary BCH and Reed-Solomon Codes..............................................66
8.4 Burst Error Correction and Reed-Solomon Codes......................................................69

9. Performance of Error Detecting / Correcting Codes...........................................................71
9.1 Error Detection Performance......................................................................................71
9.2 Error Correction Performance.....................................................................................72
9.3 Information Rate and Error Control Capacity.............................................................73

10. Error Control Strategies and Applications........................................................................76


10.1 Error Control Strategies...........................................................................................76
10.2 Error Control Applications......................................................................................77

Afterword..................................................................................................................................79

Appendix A...............................................................................................................................81

Appendix B...............................................................................................................................86

Appendix C...............................................................................................................................87

Appendix D...............................................................................................................................89

Appendix E...............................................................................................................................90

Bibliography.............................................................................................................................91


1. INTRODUCTION

Any communication system is engaged in a design trade-off between transmitted power,


bandwidth and data reliability. Error detecting / correcting codes address the issue of data
reliability. In addition, by reducing the undesirable effects of the noisy channel, error control
codes are also interrelated with the required transmitted power and bandwidth. Therefore,
error detecting / correcting codes are central to the realisation of efficient communication
systems.

In some cases, analysis of the design criteria for a communication system may have once
indicated that the desired system is a physical impossibility. Shannon and Hamming laid the
foundation for error control coding, a field which now includes powerful techniques for
achieving reliable reproduction of data that is transmitted in a noisy environment. Shannon’s
existential approach motivated the search for codes by providing the limits for ideal error
control coding while Hamming constructed the first error detecting / correcting code.

The purpose of error detecting / correcting codes is to reduce the chance of receiving
messages which differ from the original message. The main concept behind error control
coding is redundancy. That is, adding further symbols to the original message that do not add
information but serve as check / control symbols. Error detecting / correcting codes insert
redundancy into the message, at the transmitter’s end, in a systematic, analytic manner in
order to enable reconstruction of the original message, at the receiver’s end, if it has been
distorted during transmission.

The ultimate objective is to ensure that the message and its redundancy are interrelated by
some set of algebraic equations. In case the message is disturbed during transmission, it is
reproduced at the receiver terminal by the use of these equations. Explicitly, error control
efficiency is highly associated with applying mathematical theory in the design of error
control schemes. The purpose of this study is to indicate the underlying mathematical
structure of error detecting / correcting codes.

The Chapters are organised as follows.

Chapter 2 highlights the main principles of error detecting / correcting codes. Two examples
of basic binary codes are included and the Chapter concludes with a discussion on Shannon’s
Theorem.


Concepts of elementary linear algebra are introduced in Chapter 3. In particular, the theory
of groups and related algebraic structures such as fields, vector spaces and matrices are
selectively presented.

Chapter 4 covers the basic structure of linear block codes. Based on the vector representation,
these codes are postulated in terms of the mathematical entities introduced in Chapter 3.
Additionally, an extensive section is devoted to the decoding process of linear block codes.

The well-known Hamming codes are presented in Chapter 5, along with other classes of
linear block codes such as Golay, Hadamard and Reed-Muller codes. The construction of
these codes rests on the concepts introduced in Chapters 3 and 4. In this part, we have
included two programs that perform encoding and decoding for the [7,4] Hamming code.

Chapter 6 delves into the structure of Galois fields, aiming to form the necessary
mathematical framework for defining several powerful classes of codes. The main interest is
in the algebraic properties of polynomials over Galois fields.

Chapter 7 proceeds to develop the structure and properties of cyclic codes. The polynomial
representation of a code is used as a link between error detecting / correcting codes and the
mathematical entities introduced in Chapter 6. Emphasis is placed on the application of these
mathematical tools to the construction of cyclic codes.

Chapter 8 is devoted to the presentation of the important classes of BCH and Reed-Solomon
codes for multiple error correction. Attention is confined to their powerful algebraic decoding
algorithm. The last section describes the capability of Reed-Solomon codes to correct error
bursts.

A discussion on basic performance parameters for error detecting / correcting codes is


included in Chapter 9 for both binary and non-binary codes.

Chapter 10 investigates suitable error control strategies for specific applications. Reed-
Solomon codes are emphasised due to their popularity in current error control systems.


2. PRINCIPLES OF ERROR CONTROL

Error detecting / correcting codes are implemented in almost every electronic device which
entails transmission of information, whether this information is transmitted across a
communication channel or stored and retrieved from a storage system such as a compact disk.
The set of symbols (this set always being finite) used to form the information message
constitutes the alphabet of the code. In order to send information, a channel, be it physical or
not, is required. In most of the cases presented, a random symmetric channel is considered.
A channel is a random symmetric error channel if, for each pair of distinct symbols a, b of the
alphabet (a ≠ b), there is a fixed probability p that when a is transmitted, b is received, and
this probability is the same for all such pairs.

The basic operating scheme of error detecting / correcting codes is depicted in Figure 2–1.
Suppose that an information sequence of k message symbols or information digits is to be
transmitted. This sequence m may be referred to as the message word.

[Figure 2–1: the message word m enters the encoder, which outputs the codeword u; the
channel, acting as a random error generator, may corrupt u into the received word v; the
decoder either recovers u or requests retransmission.]

The encoder, at the transmitter's end,

adds r check digits from the alphabet according to a certain rule, referred to as encoding rule.
The encoder outputs a total sequence u of n digits, called codeword, which is the actual
sequence transmitted. The n – k = r additional digits, known as parity-check digits, are the
redundant digits used at the receiver’s end for detection and correction of errors. Errors that
occur during transmission alter codeword u to a word v, which is the received word. The
decoder checks whether the received word v satisfies the encoding rule or not. If the
condition is determined to be false, then error processing is performed, in an attempt to
reproduce the actual transmitted codeword u. If this attempt fails, the received word is
ignored and retransmission is required, else the decoder extracts the original message m from
the reconstructed codeword u.


2.1 Basic Binary Codes


This section reports on two primary examples of codes, employed to transmit an information
sequence of 1s and 0s across the binary symmetric channel. Their simple structure illustrates
the general operating principles of error detecting / correcting codes, but also sheds light on
certain limitations that are addressed in the following chapters.

2.1.1 Repetition Codes


Among the simplest examples of binary codes are the Repetition codes, Berlekamp (1968,
p. 2), in which each bit is repeated r times. For example, we could send each bit three times:
to send 011 we transmit 000111111. If we receive 000111011, an error is detected in the
third block of three; a reasonable action is to assume the minority 0 should be a 1, and the
message is then correctly suggested to be 011. Repetition codes with r = 3, in general, are
able to detect a double error and correct a single error. They are uneconomical, as
transmission requires r times as many bits as the original message. In other words, they have
a low information rate R = k/n, where k is the number of information bits and n is the length
of the total sequence of bits transmitted.
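The three-fold repetition scheme described above can be sketched as follows. This is an illustrative Python fragment, not taken from the report (the report's own programs, in Chapter 5, are in Perl); the function names are my own.

```python
def rep3_encode(bits):
    """Repeat every bit three times (rate R = 1/3)."""
    return [b for b in bits for _ in range(3)]

def rep3_decode(received):
    """Majority-vote each block of three received bits."""
    blocks = [received[i:i + 3] for i in range(0, len(received), 3)]
    return [1 if sum(block) >= 2 else 0 for block in blocks]

codeword = rep3_encode([0, 1, 1])           # 000 111 111
corrupted = codeword[:]
corrupted[6] = 0                            # single error in the third block
assert rep3_decode(corrupted) == [0, 1, 1]  # the error is corrected
```

Majority voting recovers the message exactly when at most one bit per block of three is flipped, matching the single-error-correcting capability noted above.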

2.1.2 Single – Parity-Check Codes


The simplest high rate codes beyond the repetition codes are the Single-Parity-Check codes,
Berlekamp (1968, p. 3), which contain only one check digit. This digit is set to be the sum of
the information digits, where addition is made under the binary rules 0 + 0 = 0, 0 + 1 = 1 + 0 = 1
and 1 + 1 = 0. Thus, the encoding condition is that the total number of 1s, including the check
digit, is even in every codeword. In that way, the received word is checked for the number of
1s; if it is even then the codeword is decoded without change, but if it is odd then an error has
occurred. The weakness of the single-parity-check codes is that retransmission is required,
since there is no way to correct the error. Further, if any even number of errors occur, the
word will be assumed correct even though it is not. Therefore, these codes have high
information rate R – note that R approaches 1 as n increases – but are unable to detect any
even number of errors and require retransmission for any odd number of errors.
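The encoding rule and parity check just described can be sketched as follows (illustrative Python, not from the report; the names are my own).

```python
def spc_encode(bits):
    """Append one check digit making the total number of 1s even."""
    return bits + [sum(bits) % 2]

def spc_check(word):
    """A received word satisfies the encoding rule iff its 1s count is even."""
    return sum(word) % 2 == 0

word = spc_encode([1, 0, 1, 1])  # check digit 1 -> [1, 0, 1, 1, 1]
assert spc_check(word)
word[0] ^= 1                     # a single error is detected
assert not spc_check(word)
word[1] ^= 1                     # a second error goes undetected
assert spc_check(word)
```

The last assertion illustrates the weakness noted above: any even number of errors leaves the parity condition satisfied.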

2.1.3 Observation – Comparison


The class of single-parity-check codes attains a high rate R, but this achievement is outweighed
by the loss of error correction capability. On the other hand, the family of repetition codes uses
the alphabet inefficiently (low information rate R) but gains error correcting capability from
this inefficiency.


2.2 Shannon’s Theorem


In order to interpolate between these two extreme examples of codes, the focus is placed on
codes that have both high information rate and moderate error correction capability or,
equivalently, low error probability PrE. Shannon, in 1948, proved the Fundamental Theorem
of Information Theory, considered to be the starting point for coding theory. The capacity G
of a channel, used in Shannon's Theorem, represents the maximum amount of information
which the channel can transmit and is given by the expression

G = 1 – H(p)

where the function H(p), called the binary entropy of the information source, is defined as

H(p) = –p·log2 p – (1 – p)·log2 (1 – p)

for the binary symmetric channel (BSC), in which each symbol has the same probability p of
incorrect transmission. The precise statement is as follows, Jones and Jones (2000, p. 88).

Theorem 2-1: Let Γ be a binary symmetric channel with p < 1/2, so that Γ has capacity
G = 1 – H(p) > 0, and let δ, ε > 0. Then, for all sufficiently large n, there is a
code C ⊆ Z2^n of rate R satisfying G – ε ≤ R < G, such that nearest
neighbour decoding gives error probability PrE < δ.

For simplicity, the statement is for the BSC, but the theorem is valid for all channels. Thus,
by choosing δ and ε sufficiently small, PrE and R can be made as close as required to 0 and G
respectively. Informally, the theorem states that if long enough codewords are chosen, then
information can be transmitted across a channel as accurately as required, at an information
rate as close as desired to the capacity of the channel. Theorem 2-1 motivates the search for
codes whose information rate R approaches 1 while the length of codewords n increases.
Such codes are often characterised as ‘good’ codes.
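The capacity of the BSC is easy to evaluate numerically. The following illustrative Python sketch (not part of the report) computes G = 1 – H(p) using the binary entropy function:

```python
from math import log2

def binary_entropy(p):
    """H(p) = -p*log2(p) - (1-p)*log2(1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Capacity G = 1 - H(p) of the binary symmetric channel."""
    return 1.0 - binary_entropy(p)

assert bsc_capacity(0.0) == 1.0          # noiseless channel: full capacity
assert abs(bsc_capacity(0.5)) < 1e-12    # pure noise: zero capacity
assert 0.52 < bsc_capacity(0.1) < 0.54   # p = 0.1 gives G of roughly 0.53
```

Note how quickly capacity falls as p approaches 1/2: even modest channel noise sharply limits the achievable rate R in Theorem 2-1.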

2.2.1 Observation
Since a very large value of n is required to achieve R → G and PrE → 0, long
codewords must be transmitted, making encoding and decoding more complex processes.


Furthermore, if n is large then the receiver may experience delays until a codeword is
received, resulting in a sudden burst of information, which may be difficult to handle.

2.2.2 Maximum Likelihood Decoding


Shannon’s Theorem proves the existence of good codes, which are likely to have sufficient
inherent structure enabling the design of efficient encoding and decoding algorithms. The
concept of ‘nearest neighbour decoding’ or ‘maximum likelihood decoding’ mentioned in
Theorem 2-1 deals with the process of matching an erroneous received word to the actual
transmitted codeword.

Assuming that a binary codeword of length n is transmitted, the probability of a particular
received word with errors in i positions is p^i·q^(n–i), where q = 1 – p. Since q > p, the received
word with no errors is more likely than any other, a received word with one error is more
likely than one with two or more errors, and so on. It follows that the best decision at the
receiver’s end is to decode a received word into a codeword which differs from the
received word in the fewest positions.
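Nearest neighbour decoding can therefore be sketched as a minimum-distance search over the codebook (illustrative Python of my own, assuming an exhaustive search over a small code; practical decoders exploit code structure instead):

```python
def hamming_distance(u, v):
    """Number of positions in which two words of equal length differ."""
    return sum(a != b for a, b in zip(u, v))

def nearest_neighbour_decode(received, codewords):
    """Decode to the codeword closest to the received word in Hamming distance."""
    return min(codewords, key=lambda c: hamming_distance(received, c))

# Length-3 repetition code on a single bit: codewords 000 and 111.
code = [(0, 0, 0), (1, 1, 1)]
assert nearest_neighbour_decode((0, 1, 0), code) == (0, 0, 0)
assert nearest_neighbour_decode((1, 1, 0), code) == (1, 1, 1)
```

With q > p this rule picks the most probable transmitted codeword, since each additional disagreement multiplies the likelihood by p/q < 1.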


3. MATHEMATICAL BACKGROUND I

In an attempt to construct effective codes, as promised by Shannon’s Theorem, much work
has been done exploiting linear algebra, as the following discussion illustrates. Basic
theory of groups, fields, vector spaces and matrices are the main algebraic tools that provide
the platform for the development of linear block codes, described in Chapters 4 and 5.
Elementary notions on these topics are included in Appendix A.

3.1 Groups
A fundamental concept used in the study of error detection / correction codes is the structure
of a group, which underlies other algebraic structures such as fields and rings.

A binary operation operates on two elements of a set at a time, yielding a third (not
necessarily distinct) element. When a binary operation, along with certain rules restricting the
results of the operation, is imposed on a set, the resulting structure is a group.

Definition 3-1: A group is a set of elements G with a binary operation ‘·’ defined in such a
way that the following requirements are satisfied:
1. G is closed; that is, α·β is in G whenever α and β are in G
2. α·(β·γ) = (α·β)·γ for all α, β, γ ∈ G (associative law)
3. There exists e ∈ G such that α·e = e·α = α for all α ∈ G (identity)
4. For each α ∈ G, there exists α⁻¹ ∈ G such that α·α⁻¹ = α⁻¹·α = e (inverse)

A group G is said to be a commutative or abelian group if it also satisfies:

5. α·β = β·α for all α, β in G

The order of a group is defined to be the cardinality of the group, which is the number of
elements contained in the group. The order of a group is not sufficient to completely specify
the group. Restriction to a particular operation is necessary. Groups with a finite number of
elements are called finite groups.

For example, the set of integers forms an infinite commutative group under integer addition,
but not under integer multiplication, since the latter does not allow for the required
multiplicative inverses.


The order of a group element g ∈ G, essentially different from the order of the group, is
defined to be the smallest positive integer r such that g^r = e, where e is the identity element of
group G. A simple method for constructing a finite group is based on the application of
modular arithmetic to the set of integers, as stated in the next two theorems, Wicker (1995,
pp. 23–4).

Theorem 3-1: The elements {0, 1, 2, …, m – 1} form a commutative group of order m under
modulo m integer addition for any positive integer m.

As for integer multiplication, m cannot be selected arbitrarily: if the modulus m has
factors other than 1 and m, the set will contain zero divisors. A zero divisor is any
non-zero number a for which there exists a non-zero number b such that a·b = 0 modulo m.
Hence, to construct a finite group under multiplication modulo m, the modulus must
be restricted to prime integers.

Theorem 3-2: The elements {1, 2, 3, …, p – 1} form a commutative group of order (p – 1)


under modulo p multiplication if and only if p is a prime integer.
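The role of the modulus in Theorem 3-2 can be checked directly. The following Python sketch (my own, not from the report) exhibits a zero divisor for a composite modulus and verifies that every non-zero element has a multiplicative inverse when the modulus is prime:

```python
def zero_divisors(m):
    """All pairs of non-zero (a, b) with a*b = 0 modulo m."""
    return [(a, b) for a in range(1, m) for b in range(1, m)
            if (a * b) % m == 0]

def has_all_inverses(m):
    """True iff each of 1, ..., m-1 has a multiplicative inverse mod m."""
    return all(any((a * b) % m == 1 for b in range(1, m))
               for a in range(1, m))

assert (2, 3) in zero_divisors(6)  # 2*3 = 6 = 0 (mod 6): not a group
assert not has_all_inverses(6)
assert zero_divisors(7) == []      # 7 is prime: no zero divisors
assert has_all_inverses(7)         # so {1, ..., 6} is a group of order 6
```

This brute-force check scales poorly, but for small m it makes the "if and only if" of Theorem 3-2 concrete.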

A subset S of G is a subgroup if it exhibits closure and contains all the necessary inverses;
that is, c = a·b⁻¹ ∈ S for all a, b ∈ S. The order of a subgroup is related to the order of the
group according to Lagrange’s Theorem, which states that the order of a subgroup is always
a divisor of the order of the group.

Another important algebraic structure in the study of error control codes is the cyclic group,
defined as follows.

Definition 3-2: A group G is said to be a cyclic group if each of its elements is equal to a
power of some element a in G; the group is then written G = <a>.

Element a is called a generating element of <a>. The element a^0 is by convention the unit
element.
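The notion of a generating element can be illustrated computationally (a Python sketch of my own, using the multiplicative group of integers modulo 7 from Theorem 3-2):

```python
def powers(g, m):
    """Distinct powers g^0, g^1, ... of g modulo m, in order of generation."""
    seen, x = [], 1
    while x not in seen:
        seen.append(x)
        x = (x * g) % m
    return seen

# 3 is a generating element: its powers run through all of {1, ..., 6}.
assert sorted(powers(3, 7)) == [1, 2, 3, 4, 5, 6]
# 2 is not: it generates only the subgroup {1, 2, 4} of order 3.
assert powers(2, 7) == [1, 2, 4]
```

Note that the order of the element 2 (namely 3) divides the order of the group (namely 6), as Lagrange's Theorem requires.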

3.2 Fields
The concept of a field, particularly a finite field , is of great significance in the theory of error
control codes as will be highlighted throughout this study. A common approach to the
construction of error detecting / correcting codes suggests that the symbols of the alphabet
used, are elements of a finite field.


Definition 3-3: A field F is a set of elements with two binary operations ‘+’ and ‘·’ such that:
1. F forms a commutative group under ‘+’ with identity 0
2. F – {0} forms a commutative group under ‘·’ with identity 1
3. The operation ‘·’ distributes over ‘+’:
α·(β + γ) = α·β + α·γ for all α, β, γ ∈ F

For example, the real numbers form an infinite field, as do the rational numbers.

A non-empty subset F´ of a field F is a subfield if and only if F´ constitutes a field under the
same binary operations as F.

If the set F is finite, then F is called a finite field. Finite fields are often known as Galois
fields, in honour of the French mathematician Evariste Galois who provided the fundamental
results on finite fields.

The order of a finite field F is defined to be the number of elements in the field. It is standard
practice to denote a finite field of order q by GF(q). For example, the binary alphabet B is a
finite field of order 2, denoted by GF(2), under the operations of modulo 2 addition and
multiplication.
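A minimal sketch of the GF(2) operation tables, computed as modulo-2 arithmetic:

```python
# GF(2): the binary alphabet {0, 1} under modulo-2 addition and multiplication.
add = [[(x + y) % 2 for y in (0, 1)] for x in (0, 1)]
mul = [[(x * y) % 2 for y in (0, 1)] for x in (0, 1)]
print(add)  # -> [[0, 1], [1, 0]]   (addition behaves like XOR)
print(mul)  # -> [[0, 0], [0, 1]]   (multiplication behaves like AND)
```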

Finite fields and their properties are further discussed in Chapter 6 since they are used in most
of the known classes of error detection / correction codes.

3.3 Vector Spaces


The concept of vector space over finite fields, used for defining the codes presented in the
following Chapters, is introduced in this section.

Consider V to be a set of elements called vectors and F a field of elements called scalars. A
vector space V over a field F is defined by introducing two operations in addition to the two
already defined between field elements:
i. Let ‘+’ be a binary additive operation, called the additive vector operation, which maps
pairs of vectors v_1, v_2 ∈ V onto the vector v = v_1 + v_2 ∈ V
ii. Let ‘·’ be a binary multiplication operation, called the scalar multiplicative operation,
which maps a scalar a ∈ F and a vector v ∈ V onto a vector u = a·v ∈ V

Now, V forms a vector space over F if the following conditions are satisfied:
1. V forms an additive commutative group


2. For any element a ∈ F and v∈ V, a·v = u ∈ V


3. The operations ‘+’ and ‘·’ distribute:
a·(u + v) = a·u + a·v
(a + b)·v = a·v + b·v
4. For all a,b∈ F and all v ∈ V:
(a·b)·v = a·(b·v) (associative law)
5. The multiplicative identity 1 in F acts as a multiplicative identity in scalar multiplication:
1·v = v for all v ∈ V

Let u,v ∈ V where v = (v_0, v_1, …, v_{n-1}) and u = (u_0, u_1, …, u_{n-1}) with {v_i} ∈ F and {u_i} ∈ F.
Then, vector addition can be defined as

v + u = (v_0 + u_0, v_1 + u_1, …, v_{n-1} + u_{n-1})

and scalar multiplication can be defined as

a·v = (a·v_0, a·v_1, …, a·v_{n-1}) for a ∈ F, v ∈ V

For example, the set of binary n-tuples, V_n, forms a vector space over GF(2), with coordinate-wise
addition and scalar multiplication. Note that the operations on the coordinates are
performed under the restrictions imposed by the set they are taken from. Obviously, V_n has
cardinality 2^n, since that is the number of all possible distinct sequences of 1s and 0s of length
n. Since V forms an additive commutative group, for scalars a_0, a_1, …, a_{n-1} in F, the linear
combination v = a_0·v_0 + a_1·v_1 + … + a_{n-1}·v_{n-1} is a vector in V.
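These coordinate-wise operations over GF(2) can be sketched directly (the vectors below are illustrative):

```python
# Coordinate-wise vector addition and scalar multiplication in V_n over GF(2).
def vec_add(v, u):
    return [(vi + ui) % 2 for vi, ui in zip(v, u)]

def scalar_mul(a, v):
    return [(a * vi) % 2 for vi in v]

v = [1, 0, 1, 1]
u = [0, 1, 1, 0]
print(vec_add(v, u))      # -> [1, 1, 0, 1]
print(scalar_mul(1, v))   # -> [1, 0, 1, 1]
```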

A spanning set for V is a set of vectors G = {v_0, v_1, …, v_{n-1}}, the linear combinations of which
include all vectors in the vector space V. Equivalently, we say that G spans V. A spanning
set with minimal cardinality is called a basis for V. If a basis for V has k elements, hence its
cardinality is k, then the vector space V is said to have dimension k. Furthermore, according
to the following theorem, Wicker (1995, p. 31), each vector in V can be written as a linear
combination of the basis elements for some collection of scalars {a_i} in F.

Theorem 3-3: Let {v_i}, i = 0..k – 1, be a basis for a vector space V. For every vector v in V
there is a representation v = a_0·v_0 + … + a_{k-1}·v_{k-1}. This representation is unique.

The notion of vector subspace, defined below, is fundamental to coding theory.


Definition 3-4: A non-empty subset S of a vector space V is called a vector subspace of V if


and only if it satisfies the following two properties:
1. Closure under addition:
x,y∈ S implies x + y ∈ S
2. Closure under scalar multiplication:
a ∈ F, x ∈ S implies a·x ∈ S

Equivalently, S is a vector subspace of the vector space V over the field F if and only if a·v_1 + b·v_2
is in S, for all v_1, v_2 ∈ S and a,b ∈ F.

Definition 3-5: Let u = (u_0, u_1, …, u_{n-1}) and v = (v_0, v_1, …, v_{n-1}) be vectors in the vector space
V over the field F. The inner product u·v is defined as

u·v = Σ_{i=0}^{n-1} u_i·v_i = u_0·v_0 + u_1·v_1 + … + u_{n-1}·v_{n-1}

The inner product defined in a vector space V over F, has the following properties derived
from its definition:
i. commutative:
u·v = v·u, for all u,v ∈ V
ii. associative with scalar multiplication:
a·(v·u) = (a·v)·u
iii. distributive with vector addition:
u·(v + w) = u·v + u·w

If the inner product of two vectors v and u is v·u = 0, then v is said to be orthogonal to u or
equivalently u is orthogonal to v.
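The inner product over GF(2) and the orthogonality test can be sketched as follows (illustrative vectors):

```python
# Inner product over GF(2); two vectors are orthogonal when it equals 0.
def inner(u, v):
    return sum(ui * vi for ui, vi in zip(u, v)) % 2

u = [1, 0, 1, 0]
v = [1, 1, 1, 0]
print(inner(u, v))  # 1·1 + 0·1 + 1·1 + 0·0 = 2 ≡ 0 (mod 2) -> orthogonal
```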

The inner product, which is a binary operation that maps pairs of vectors in the vector space V
over field F onto scalars in F, is used to characterise dual (null) spaces.

Given that a vector space V over F is a vector space with an inner product, the dual space S⊥
of a subspace S is defined as the set of all vectors v in V such that u·v = 0, for all u ∈ S.

Note that S and S⊥ are not disjoint, since they both contain the all-zero vector, denoted by 0.
Additionally, the Dimension Theorem imposes that the sum of the dimensions of S and S⊥
is equal to the dimension of the vector space V.


3.4 Matrices
It is common practice in error control systems to employ matrices in the encoding and
decoding processes.

A k×n matrix G over a Galois field GF(q) is a rectangular array with k rows and n columns,
where each entry g_{i,j} is an element of GF(q) (i = 0..k – 1 and j = 0..n – 1).

If k ≤ n and the k rows of a matrix G are linearly independent, the q^k linear combinations of
these rows form a k-dimensional subspace of the vector space V_n of all n-tuples over GF(q).
Such a subspace is called the row space of the matrix G.

Furthermore, using the notion of dual space introduced previously, an important theorem
implies the existence of an (n – k)×n matrix H for each k×n matrix G with k linearly
independent rows. The precise statement of this theorem, Lin and Costello (1983, p. 47),
which will prove rather useful in section 4.3 concerning the matrix description of a code, is
as follows.

Theorem 3-4: For any k×n matrix G over GF(q) with k linearly independent rows, there exists
an (n – k)×n matrix H over GF(q) with (n – k) linearly independent rows such
that for any row g_i in G and any row h_j in H, g_i·h_j = 0. The row space of G is the
null (dual) space of H, and vice versa.

12
Linear Block Codes

4. LINEAR BLOCK CODES

Most known codes are block codes (or codes of fixed length). These codes divide the data
stream into blocks of fixed length which are then treated independently. There also exist
codes of non-constant length such as the convolutional codes which offer a substantially
different approach to error control. In these codes, redundancy can be introduced into an
information sequence through the use of linear shift registers which convert the entire data
stream, regardless of its length, into a single codeword. In general, encoding and decoding of
convolutional codes depends more on designing the appropriate shift register circuits and less
on mathematical structures. Therefore, in this study, attention is confined to block codes,
which invoke algebraic techniques mainly based on the theory of groups to insert redundancy
into the information sequence.

4.1 Block Coding


In block coding, the information sequence is segmented into message blocks of fixed length k.
These message blocks are encoded independently at the transmitter’s end, decoded in the
same manner at the receiver’s end and then combined to retrieve the original message.

Using an alphabet of q symbols, where the collection of these q symbols is considered a
Galois field of order q, GF(q), there are q^k distinct message blocks. The encoder, according to
certain rules, transforms each message block m of length k into an n-tuple u, which is the
codeword to be transmitted, as depicted in Figure 4–1.

m = m_0 m_1 … m_{k-1} → Encoder → u = u_0 u_1 … u_{n-1}

Figure 4–1

The length n of the codeword is greater than k; the (n – k) additional digits, often
referred to as parity-check digits, constitute the added redundancy.

In order to ensure that the encoding process can be reversed at the receiver’s end to
retrieve the original message, there must be a one-to-one correspondence between a message
block and its corresponding codeword. This implies that there are exactly q^k codewords. The
set of q^k codewords of length n is called an (n,k) block code.


4.1.1 Vector Representation


In the study of block codes it is useful to associate codewords with vectors. Each codeword
of length n can be represented by a vector of dimension n

u = u_0 u_1 … u_{n-1} ↔ u = (u_0, u_1, …, u_{n-1})

A codeword u, which is an n-tuple, is represented by the vector u whose n coordinates are the
components of the codeword.

This representation allows us to exploit the algebraic structures introduced in Chapter 3 and
provides the basis for defining linear block codes.

4.2 Definition of Linear Block Codes


In general, encoding and decoding of q^k codewords of length n may become prohibitively
complex for large n and k: all codewords would need to be stored and searched through
for each received word. The linear algebraic structure of a major class of codes, called linear
block codes, can be exploited in designing their encoding and decoding schemes. The
inherent linearity of these codes gives them mathematical structure,
resulting in a reduction in the complexity of their implementation and analysis.

Based on the definition of linear block codes over GF(2) in Lin and Costello (1983, p. 52),
these codes can also be defined over a general finite field alphabet as follows.

Definition 4-1: A block code of length n with q^k codewords is called a linear (n,k) code C if
and only if its q^k codewords form a k-dimensional subspace of the vector
space of all n-tuples over the field GF(q).

Based on the properties of vector spaces, discussed in section 3.3, it can be seen that the linear
combination of any set of codewords is a codeword. This implies that the sum of any two
codewords in C is a codeword in C. Another consequence of this is that linear codes always
contain the all-zero vector, 0, as a codeword.

4.3 Matrix Description


The vector representation of each codeword combined with Definition 4-1 allows for the
matrix description of an (n,k) code. The encoding and decoding processes can be reduced to


matrix multiplication, as illustrated in the discussion that follows, resulting in a substantial
decrease in complexity.

4.3.1 Generator Matrix


By Definition 4-1, an (n,k) code is a k-dimensional subspace of the vector space of all n-tuples
over GF(q). Thus, it is possible to find k linearly independent codewords g_0, g_1, …, g_{k-1} in C.
The set of these codewords forms a basis for the code C, since it consists of k linearly independent
elements of C. By Theorem 3-3, every codeword c in C is a linear combination of the
basis elements and thus can be written as

c = m_0·g_0 + m_1·g_1 + … + m_{k-1}·g_{k-1}

where {m_i}, i = 0..k – 1, are in GF(q). The above expression is valid for any codeword c in C,
implying that there is a one-to-one correspondence between the set of message blocks of the
form (m_0, m_1, …, m_{k-1}) and the codewords in C.

Therefore, all codewords in C can be formed by linear combinations of k linearly
independent codewords in C. As a result, for any (n,k) linear code there exists a k×n matrix G
whose rows are these k linearly independent codewords

        | g_0     |   | g_{0,0}    g_{0,1}   …   g_{0,n-1}   |
        | g_1     |   | g_{1,0}    g_{1,1}   …   g_{1,n-1}   |
    G = |   ⋮     | = |    ⋮          ⋮              ⋮       |
        | g_{k-1} |   | g_{k-1,0}  g_{k-1,1} …   g_{k-1,n-1} |

Clearly, the rows of G completely specify the code C. The matrix G is called a generator
matrix for the code C and is used for encoding any message m = (m_0, m_1, …, m_{k-1}) as

    c = m·G = [m_0 m_1 … m_{k-1}]·G = m_0·g_0 + m_1·g_1 + … + m_{k-1}·g_{k-1}


The generator matrix is central to the description of linear block codes, since the encoding
process is reduced to matrix multiplication. In addition, only the k rows of G need to be
stored; the encoder merely forms a linear combination of the k rows based on the
input message m = (m_0, m_1, …, m_{k-1}). Likewise, the decoding process can be simplified by the
use of another matrix, the parity-check matrix, introduced next.
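The encoding step c = m·G can be sketched over GF(2); the small (4,2) generator matrix below is chosen for illustration:

```python
# Encoding as matrix multiplication over GF(2): c = m·G.
G = [[1, 0, 1, 0],
     [0, 1, 1, 1]]

def encode(m, G):
    n = len(G[0])
    # j-th codeword symbol: sum over rows of m_i * g_{i,j}, reduced mod 2
    return [sum(m[i] * G[i][j] for i in range(len(G))) % 2 for j in range(n)]

print(encode([1, 1], G))  # g_0 + g_1 = 1010 + 0111 -> [1, 1, 0, 1]
```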

4.3.2 Parity-Check Matrix


A matrix description of the decoding process can be obtained by the use of Theorem 3-4 and
the notion of dual space described in section 3.3.

As stated in Theorem 3-4, for any k×n matrix G with k linearly independent rows there exists
an (n – k)×n matrix H with (n – k) linearly independent rows, such that any vector orthogonal
to the rows of H is in the row space of G and is thus a valid codeword.

        | h_0       |   | h_{0,0}      h_{0,1}     …   h_{0,n-1}     |
        | h_1       |   | h_{1,0}      h_{1,1}     …   h_{1,n-1}     |
    H = |    ⋮      |  = |    ⋮            ⋮                ⋮        |
        | h_{n-k-1} |   | h_{n-k-1,0}  h_{n-k-1,1} …   h_{n-k-1,n-1} |

Such a matrix H is called a parity-check matrix of the code and is used for decoding, since a
received word v is a codeword if and only if v·H^T = 0, where H^T is the transpose¹ of H. In
addition, the (n – k) linearly independent rows of H span an (n – k)-dimensional subspace of
the vector space of all n-tuples over GF(q). It can be seen that this is the dual space of the
vector space formed by the (n,k) code. Thus, H can be regarded as a generator matrix for an
(n, n – k) code. This code, with regard to the notion of dual space, is called the dual code C⊥
of C.

4.3.3 G, H in Systematic Form


The construction process of the generator matrix G and the parity-check matrix H of a linear
code implies that they are not unique. Thus, choosing them to have as simple a form as

¹ The transpose of H is an n×(n – k) matrix whose rows are the columns of H and whose columns are
the rows of H.


possible is a mathematical challenge. In addition, the problem of recovering the message
block from a codeword can be simplified if G and H have a special form.

Consider a linear code C with generator matrix G. By applying elementary row operations²
and column permutations, it is always possible to obtain a generator matrix of the form

    G = [P_{k×(n-k)} | I_k]

where I_k is the k×k identity matrix. The matrix G in the above expression is said to be in
systematic form.

Note that, by permuting the columns of G, the code C may change, but the new code C´ will differ
from C only in the order of symbols within codewords, so the two codes are not essentially
different. In fact, such codes are called equivalent codes.

If the generator matrix G is in systematic form, the message block is embedded, without
modification, in the last k coordinates of the resulting codeword. The first (n – k) coordinates
contain a set of (n – k) symbols that are linear combinations of certain
information symbols. This set is determined by the matrix P_{k×(n-k)}, which can be stored in
read-only memory (ROM).

    c = m·G = [m_0 m_1 … m_{k-1}]·[P_{k×(n-k)} | I_k] = [c_0 c_1 … c_{n-k-1} | m_0 m_1 … m_{k-1}]

Given a generator matrix G in systematic form, a corresponding parity-check matrix in
systematic form can be obtained as

    H = [I_{n-k} | P^T]

The use of G and H in systematic form has implementation advantages. An encoder for binary
systematic linear codes, according to Reed and Chen (1999, p. 82), can be implemented by a
logic circuit, as illustrated in Appendix D. Additionally, the decoding process is simplified,
since a received word contains the information in its last k positions; in the case of correct
transmission, the original message can be reconstructed by simply extracting the last k
coordinates.
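Systematic encoding can be sketched as follows, assuming a hypothetical 2×2 parity part P over GF(2):

```python
# Systematic encoding sketch: G = [P | I_k], so the message appears
# unchanged in the last k coordinates of the codeword (binary example).
P = [[1, 1],        # hypothetical 2x2 parity part P
     [0, 1]]
k, r = 2, 2         # k message symbols, r = n - k parity symbols

def encode_systematic(m):
    # parity symbols are linear combinations of the message symbols via P
    parity = [sum(m[i] * P[i][j] for i in range(k)) % 2 for j in range(r)]
    return parity + list(m)          # [c_0 .. c_{n-k-1} | m_0 .. m_{k-1}]

c = encode_systematic([1, 0])
print(c)        # -> [1, 1, 1, 0]
print(c[-k:])   # message recovered by extracting the last k coordinates
```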

² Elementary row operations are defined to be:
i. multiplication of a row by a non-zero constant
ii. replacement of a row r_i with r_i + a·r_j, where i ≠ j and a ≠ 0
iii. row permutations (row reordering)


4.4 Minimum Distance


The constructs developed in the previous section are mainly concerned with the realisation of
encoding and decoding schemes. In this section, an important parameter that determines the
error detection / correction capacity of an error control code is introduced, called the minimum
distance.

4.4.1 Definition
Effective codes tend to use codewords that are very unlike each other, since such a property is
required for applying maximum likelihood decoding. Clearly, there is a need to measure
how alike or unlike two codewords are.

A notion of distance between two codewords, known as the Hamming distance, is defined as
follows, van Lint (1999, p. 26).

Definition 4-2: If u and v are two n-tuples, then we shall say that their Hamming distance is

d_H(u,v) = | { i | 0 ≤ i ≤ n – 1, u_i ≠ v_i } |

In short, the Hamming distance is the number of places (i = 0..n – 1) in which two codewords
differ. It is a metric; that is, it satisfies the following properties that a distance function must
satisfy on the set of codewords of a code C:
1. d_H(u,v) ≥ 0 for all n-tuples u,v
2. d_H(u,v) = 0 if and only if u = v
3. d_H(u,v) = d_H(v,u)
4. For any three n-tuples u, v, w in C:
d_H(u,w) ≤ d_H(u,v) + d_H(v,w) (triangle inequality)
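A minimal sketch of the Hamming distance, with a spot-check of the metric properties on illustrative words:

```python
# Hamming distance: number of coordinates in which two words differ.
def d_H(u, v):
    return sum(1 for ui, vi in zip(u, v) if ui != vi)

u, v, w = "10110", "11010", "11011"
print(d_H(u, v))                            # -> 2
assert d_H(u, v) == d_H(v, u)               # symmetry
assert d_H(u, w) <= d_H(u, v) + d_H(v, w)   # triangle inequality
```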

The Hamming distance d_H is calculated for all possible pairs of codewords of a code C, and
much attention is given to its minimum value. The minimum distance d among all
codewords is the least Hamming distance, Reed and Chen (1999, p. 85).

Definition 4-3: The minimum distance of a block code C is the minimum Hamming distance
between all distinct pairs of codewords in C

d = min(d_H(u,v)), for all distinct u,v in C


Calculating distances among q^k codewords is quite a tedious task, which may be simplified by
the notion of weight, defined as follows, van Lint (1999, p. 33).

Definition 4-4: The weight w(v) of any vector (codeword) v = v_0 v_1 … v_{n-1} is defined by

w(v) = d_H(v, 0)

where we denote (0, 0, …, 0) by 0.

In other words, the weight w(v) of a codeword v is the number of non-zero coordinates in v.

Now, the minimum distance of a code can be obtained by using the following theorem, Lin
and Costello (1983, p. 63).

Theorem 4-1: The minimum distance of a linear block code is equal to the minimum weight
of its non-zero codewords.

Consequently, finding the minimum distance of a code requires the weight structure of the q^k
codewords rather than the computation of the Hamming distance for all q^{2k} pairs of codewords.
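Theorem 4-1 can be spot-checked by exhaustive enumeration of a small binary code; the (4,2) generator matrix below is illustrative:

```python
# Minimum distance via minimum non-zero weight (Theorem 4-1):
# enumerate all q^k codewords of a small binary (4,2) code.
from itertools import product

G = [[1, 0, 1, 0],
     [0, 1, 1, 1]]

def encode(m):
    return tuple(sum(m[i] * G[i][j] for i in range(len(G))) % 2
                 for j in range(len(G[0])))

codewords = {encode(m) for m in product((0, 1), repeat=len(G))}
d = min(sum(c) for c in codewords if any(c))  # minimum non-zero weight
print(d)  # -> 2
```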

Another way to determine the minimum distance of a code is by using the parity-check matrix
H, described in section 4.3.2. It has been seen that if c is a codeword then c·H^T = 0. Further,
c·H^T can be written as a linear combination of the columns of H (that is, the rows of H^T), with
the coordinates of c as coefficients. Thus, the equation c·H^T = 0 implies that the columns of H
corresponding to the non-zero coordinates of c are linearly dependent. These results lead to
the following theorem, Jones and Jones (2000, p. 131), which provides an alternative way of
determining the minimum distance of a code.

Theorem 4-2: Let C be a linear code of minimum distance d and let H be a parity-check
matrix for C. Then d is the minimum number of linearly dependent columns
of H.

4.4.2 Minimum Distance and Error Detection / Correction


As mentioned earlier, the minimum distance of an (n,k) code determines its error detecting /
correcting capacity.

If d is the minimum distance of a code, any two distinct codewords differ at least in d
coordinates. For such a code, a received word with (d – 1) or fewer errors cannot be matched


to a codeword. Hence, the code is capable of detecting all (d – 1) or fewer errors that may
occur during transmission.

When applying maximum likelihood decoding, incorrect decoding may occur whenever a
received word is closer, in Hamming distance, to an incorrect codeword than to the correct
codeword. For an (n,k) code with minimum distance d, all incorrect codewords are at least
distance d from the transmitted codeword. This implies that incorrect decoding may be the
case only when at least d / 2 errors occur. This is stated in an equivalent fashion in the
following theorem, Pless (1998, p. 11).

Theorem 4-3: If d is the minimum distance of a code C, then C can correct t = ⌊(d – 1) / 2⌋ or
fewer errors³, and conversely.

Equivalently, the above theorem imposes the condition that d and t satisfy the inequality
d ≥ 2t + 1 for a code to be t-error-correcting. That is, a code can correct any word received
with errors in at most t of its symbols.

It is also possible to employ codes, which can simultaneously perform correction and
detection. Such modes of error control, often referred to as hybrid modes of error control, are
commonly used in practice. It can be shown, by use of geometric arguments according to
Reed and Chen (1999, p. 86) that a code with minimum distance d can correct t errors and at
the same time detect l errors where t, l and d satisfy t + l + 1 ≤ d.

A Hamming sphere of radius t contains all possible received words that are at Hamming
distance t or less from a codeword. If a received word falls within a Hamming sphere, it is
decoded as the codeword at the centre of the sphere. The common radius t of these disjoint⁴
spheres should be large in order to achieve good error correcting capacity. Yet, to attain a
good information rate R, the number M of these spheres must also be large, resulting in a
conflict, since the spheres are disjoint. Thus, there is a limit on the number of spheres, and
consequently codewords, that can be used. An upper bound, known as Hamming’s Sphere-
packing Bound, addresses the problem, as stated in the following theorem, Pless (1998, p. 23).

³ ⌊y⌋ denotes the largest integer less than or equal to y (floor function).
⁴ The requirement that the Hamming spheres are disjoint allows only one candidate codeword for each
received word (unambiguous decoding).


Theorem 4-4: If C is a code of length n with dimension k, minimum distance d and M = q^k
codewords, then

( C(n,0) + C(n,1)·(q – 1) + C(n,2)·(q – 1)^2 + … + C(n,t)·(q – 1)^t )·q^k ≤ q^n

where C(n,i) denotes the binomial coefficient ‘n choose i’.

Hence, given n and k, this expression bounds t and so bounds d. An important result that rests
on Theorem 4-4 introduces the notion of t-perfect codes, as presented by Jones and Jones
(2000, p. 108).

Theorem 4-5: Every t-error-correcting linear (n,k) code C over GF(q) satisfies

Σ_{i=0}^{t} C(n,i)·(q – 1)^i ≤ q^{n-k}

A linear code C is t-perfect if it attains equality in the above theorem. Note that the inequality
in Theorem 4-5 can be obtained by dividing the inequality in Theorem 4-4 by the number of
all codewords, q^k, since C is of dimension k. Based on this condition for t-perfect codes,
Golay constructed the two perfect codes, G11 and G23, as will be described in section 5.2.
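The bound of Theorem 4-5 can be checked numerically; as an illustration not taken from the text, the binary (7,4) Hamming code with t = 1 attains equality and is therefore 1-perfect:

```python
# Sphere-packing bound check: sum_{i=0}^{t} C(n,i)(q-1)^i vs q^(n-k).
# For the binary (7,4) Hamming code with t = 1: 1 + 7 = 8 = 2^(7-4).
from math import comb

def sphere_volume(n, t, q):
    return sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))

n, k, t, q = 7, 4, 1, 2
print(sphere_volume(n, t, q))                   # -> 8
print(sphere_volume(n, t, q) == q ** (n - k))   # -> True (1-perfect)
```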

From the discussion in this section, it can be concluded that the minimum distance is
central to the structure of error detecting / correcting codes. Its tight relation to the error
control capacity of a code motivates the search for codes with a large minimum distance for
given length n and dimension k.

4.5 Error Processing for Linear Codes


Effective codes are highly associated with an efficient decoding algorithm. In
general, decoding schemes designed for a specific code tend to be efficient for that
code or class of codes, rather than for any code. In short, there is no ‘one size fits all’ among
decoding algorithms. However, syndrome decoding is a method that achieves reasonable
efficiency for most codes. Therefore, it can be used as a common platform for comparison
with a decoding scheme for a specific code or family of codes. It is a method of error
processing for linear codes that always produces a closest codeword - that is, complete
decoding - by using the parity-check matrix for an efficient implementation of maximum
likelihood decoding.


4.5.1 Standard Array


The first stage in implementing the decoding process for linear codes is the construction of
an array T containing all the words over the finite field GF(q), which is the alphabet of the
code. Put simply, T consists of all n-tuples with components from the field GF(q). This array
is constructed as follows.
is constructed as follows.

Step 1: The first row of T consists of all the codewords in any order with the restriction that
the first word is 0, the all-zero vector, which has all entries equal to zero.

Step 2: The i-th row is formed by choosing an n-tuple in GF(q) that has not appeared yet in T
and placing it in the first column. This word is called the row leader. The rest of the
row is determined by adding the codeword at the head of each column – it is an
element of the first row – to the row leader.

Following the above steps, a q^{n-k} × q^k array T is formed, called a standard array for the linear
(n,k) code: it has q^{n-k} rows (one per row leader) of q^k entries each. Clearly, T contains all
words of length n over GF(q), where q is the number of code symbols and n is the length of the code.
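The two construction steps above can be sketched as follows; in this sketch the row leaders are simply chosen in lexicographic order among the unused n-tuples (not necessarily of minimal weight), and the small binary (4,2) code that reappears as an example in section 4.5.2 is used:

```python
# Standard array construction: rows are cosets, each formed by adding a
# fresh row leader to every codeword in the first row.
from itertools import product

codewords = [(0, 0, 0, 0), (1, 0, 1, 0), (0, 1, 1, 1), (1, 1, 0, 1)]
all_words = list(product((0, 1), repeat=4))

def add(u, v):
    return tuple((ui + vi) % 2 for ui, vi in zip(u, v))

array, seen = [codewords], set(codewords)
while len(seen) < len(all_words):
    leader = next(w for w in all_words if w not in seen)  # unused n-tuple
    row = [add(leader, c) for c in codewords]
    array.append(row)
    seen.update(row)

for row in array:
    print(row)
```

The resulting array has q^{n-k} = 4 rows of q^k = 4 entries, and every 4-tuple appears exactly once.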

From the way in which the standard array is constructed – that is, every entry is the sum
of its row leader and the codeword at the head of its column – it follows that the difference of
two entries in the same row is a codeword. This property enables testing whether two
words u and v lie in the same row of the standard array, according to the following
theorem, Pretzel (1996, p. 50).

Theorem 4-6: Let C be a linear (n,k) code and T a standard array for C. Then two entries u
and v lie in the same row of T if and only if their difference u – v is a
codeword.

Error processing with the standard array is performed by replacing each received word with
the codeword at the head of its column. An important property that rests on the above
theorem is that every word of length n occurs exactly once in a standard array for the code
C. As a result, standard array decoding is complete and unambiguous. By ‘complete’ is
meant that every received word is always matched to a codeword, and ‘unambiguous’ refers to
the fact that the received word uniquely determines the corresponding codeword, not allowing
any arbitrary choice.


The set of elements that appear in the same row is called a coset. Thus, the cosets of a code
can be defined as the rows of the standard array. These cosets are the same for any possible
standard array of a code, because the test whether two entries lie in the same row, is based on
the set of codewords of the code and not on the actual choice of the array. The entries in the
first row may be in different order, but the sets of elements are the same. Consequently, the
rest of the rows in the standard array have the same entries, even though their order may differ
since ordering depends on the choice of row leader for each row. This property of the
standard array is important, as the error processor associates the whole row to its row leader.

Once the error processor using a standard array detects an erroneous received word, instead of
guessing which codeword was sent, it attempts to guess which error occurred. Then, based
on the fact that an incorrect received word v equals the transmitted codeword u plus an
error pattern e, where e = v – u, the word v is corrected to the codeword u. To accomplish
this, the error patterns are chosen to be the row leaders of the standard array. In this way, the
received word v lies in the row led by its error pattern e, whose column head is 0; since every
entry differs from the head of its column by the row leader, e = v – u, where u is the codeword
at the head of the column containing v. Hence, u = v – e and the transmitted codeword u is
determined.

Since a standard array for a linear (n,k) code cannot always have all possible error patterns as
row leaders, decoding with a standard array does not enable correction of all errors. To
ease this deficiency of the standard array, the row leaders can be chosen to be of
minimal weight among all candidates; in this case, the row leaders are called coset leaders.
The choice may be arbitrary in many cases, leading to different versions of the
standard array, but the cosets will consist of the same words whichever version is used. In
this way, each received word is corrected to a closest codeword. The above argument is
justified by the following theorem, Pretzel (1996, p. 54).

Theorem 4-7: Let C be a linear (n,k) code over A with a standard array T in which the row
leaders have the smallest possible weight. Let u be a word of length n and let
v be the codeword at the head of its column. Then for any codeword w, the
distance d(u,w) is at least the distance d(u,v).

4.5.1.1 Limitations of the Standard Array Decoding


Error processing with the standard array is a relatively simple method with a simple practical
implementation, since it provides a direct match between the received word and the
corresponding codeword. However, there are certain limitations. The size of the standard


array often becomes prohibitive. Codes need long block lengths in order to achieve multiple
error correction, resulting in standard arrays with, say, 2^100 entries, which is not feasible on
computers, in terms of storage space and the time required to search for a specific entry.
Additionally, in the standard array decoding technique, the error patterns corrected are the
row (coset) leaders. As a result, if two error patterns lie in the same row (coset), at most one
of them can be corrected.

With regard to the limitation in the error correction capabilities imposed by the
standard array, a set of error patterns is chosen, usually the set of all errors up to some given
weight. The elements of this set are chosen to be the coset leaders; they therefore lie in distinct
rows, allowing the standard array to correct all of them and no more. In short, the
corresponding code can correct all error patterns of a certain weight so long as they appear as
coset leaders. Regarding the size of the standard array, the notion of the syndrome of a received
word reduces the large standard array to a small table of coset leaders and their syndromes, as
discussed next, resulting in saved storage space.

4.5.2 Syndromes of Words


The underlying relationship between the standard array and the check matrix of a code can be
exploited to reduce the size of the standard array and to determine which errors the code can
correct. Recall that the basic property of the check matrix is that a received word v is a
codeword if and only if v·H^T = 0. It follows that two words u and v lie in the same coset if and only if
u·H^T = v·H^T. In the standard array decoding method, the main interest is in specifying the row
that contains the received word; the error pattern is then assumed to be the row leader.
Since v·H^T determines the row of the received word v, there is no need to store all the
elements but only this value, which is the same for all elements of a specific row and its row
leader. Thus, the value v·H^T determines the error pattern and, according to Pretzel (1996,
p. 56), can be defined as follows.

Definition 4-5: Given a linear (n,k) code C and a check matrix H, the syndrome of a word v
of length n is v·H^T.

Consequently, the standard array can be reduced to a table containing the row leaders and
syndromes only, resulting in reduced storage space while the search to locate a received word
is less time-consuming.


The received word and the corresponding error pattern have the same syndrome. Indeed,
supposing that the codeword u is transmitted and v = u + e is received, where e is the error
pattern, the syndrome of v is

v·H^T = (u + e)·H^T = u·H^T + e·H^T = 0 + e·H^T = e·H^T

since u is a codeword and hence u·H^T = 0. The basic property of the check matrix can
be restated using the notion of syndromes: a received word v is a codeword if
and only if its syndrome v·H^T = 0. The syndrome of each received word is computed and
located in the syndrome list of the syndrome table. Then, the corresponding coset leader is
the error pattern to be subtracted from the received word. In this way, the actual codeword
sent is reconstructed.

For example, let G be the generator matrix for a binary code C,

1 0 1 0 
G=  
0 1 1 1 

and H can be taken to be the check matrix,

1 1 1 0 
H=  
0 1 0 1

The first row of a possible standard array consists of the codewords with 0 first,

codewords   0000   1010   0111   1101

cosets      1000   0010   1111   0101
            0100   1110   0011   1001
            0001   1011   0110   1100

The syndrome table consisting of row leaders and their syndromes,

coset leaders   syndromes

0000            00
1000            10
0100            11
0001            01


Supposing that u = 1101 is transmitted and v = 1100 is received, the syndrome of v is
computed as v·HT = [0 1]. Syndrome 01 is in the fourth row of the table and its corresponding
error pattern (coset leader) is e = 0001. The correct codeword is therefore determined by
subtracting the error pattern from the received word: u = v – e = 1101, which is the
actual codeword transmitted.
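The decoding procedure in this example can be sketched in a few lines of code (Python here, purely as an illustration; the programs presented later in this report are written in Perl):

```python
# Syndrome decoding for the example [4,2] binary code.
# The check matrix H and the syndrome table are copied from the text above.
H = [(1, 1, 1, 0),
     (0, 1, 0, 1)]

def syndrome(v):
    # v·HT over GF(2): one bit per row of H
    return tuple(sum(vi * hi for vi, hi in zip(v, row)) % 2 for row in H)

# syndrome -> coset leader (the presumed error pattern)
leaders = {(0, 0): (0, 0, 0, 0),
           (1, 0): (1, 0, 0, 0),
           (1, 1): (0, 1, 0, 0),
           (0, 1): (0, 0, 0, 1)}

def decode(v):
    # subtracting the leader is the same as adding it over GF(2)
    e = leaders[syndrome(v)]
    return tuple((vi + ei) % 2 for vi, ei in zip(v, e))

print(decode((1, 1, 0, 0)))  # the received word 1100 decodes to 1101
```

Running `decode` on the received word 1100 reproduces the result derived above.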

Note that the error pattern 0010 was not chosen as a coset leader, because it had already
appeared in a coset, at the second row, of the standard array. The syndrome of the coset
leader 1000 is identical to the first column of the check matrix. Likewise, the syndrome of
0100 is the second column of H and that of 0001 is the fourth column of the check matrix.
The position of 1 in the error pattern indicates the column and consequently the position of
error in the received word. In the previous example, the error pattern was 0001 and the error
symbol was in the fourth position of the received word.

4.5.2.1 Observation
Syndrome decoding is more efficient than standard array decoding in terms of storage space
and speed of error processing. Considering a binary (100, 60) code, for standard array
decoding there is a need to store 2^100 entries and search through 2^60 vectors to locate a
received word. By implementing syndrome decoding, the requirement is to store and search
through 2^40 coset leaders and their syndromes, resulting in reduced storage space and a less
time-consuming process.


5. HAMMING CODES – GOLAY CODES – RM CODES

5.1 Hamming Codes


Hamming codes – discovered by Hamming shortly after Shannon’s Theorem motivated the
search for error detecting / correcting codes – comprised the first family of codes capable of
correcting errors. These codes have been used for error control in digital communication
systems and computer memories.

As described by Reed and Chen (1999, p. 104), a binary Hamming code can be constructed
for any positive integer m ≥ 2, with the following parameters.

code length           n = 2^m – 1
information symbols   k = 2^m – m – 1
minimum distance      d = 3

Hamming codes are determined by their parity-check matrices. The parity-check matrix H of
a binary Hamming code, consists of all non-zero m-tuples as its columns which are ordered
arbitrarily.
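This construction is easy to express in code. The following sketch (Python, for illustration only) builds a parity-check matrix by listing all non-zero m-tuples as columns, in counting order — one of the many admissible orderings:

```python
from itertools import product

def hamming_check_matrix(m):
    """Parity-check matrix of the binary Hamming code of length 2^m - 1:
    its columns are all non-zero binary m-tuples (here in counting order)."""
    cols = [col for col in product([0, 1], repeat=m) if any(col)]
    # transpose the column list: H has m rows and 2^m - 1 columns
    return [[col[i] for col in cols] for i in range(m)]

H = hamming_check_matrix(3)  # 3 rows, 7 columns: the [7,4] Hamming code
```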

The smallest number of distinct non-zero binary m-tuples that are linearly dependent is three
and this can be justified as follows. Since the columns of H are non-zero and distinct, no two
columns add to zero. In addition, H has all the non-zero m-tuples as its columns and thus the
vector sum of any two columns must be a column of H as well. This implies that
hi + hj + hk = 0, which means that at least three columns in H are linearly dependent. It follows
from Theorem 4-2 that the Hamming codes always have minimum distance three. Further,
Theorem 4-3 implies that Hamming codes can always correct a single error.

Hamming codes can be decoded with syndrome decoding, developed in section 4.5. Suppose
that r is the received word and e is the error pattern with a single error in the j-th position.
When the syndrome of r is computed, the transpose of the j-th column of H is obtained, as
demonstrated in the following expression

s = r·HT = e·HT = hjT

Thus, the decoding process can be performed as follows:


Step 1: Compute the syndrome s


If s = 0 then go to Step 4

Step 2: Determine the position j of the column of H whose transpose is the syndrome

Step 3: Complement the bit in the j-th position in the received word

Step 4: Output the resulting codeword

5.1.1 Programming the [7,4] Hamming Code


It has been seen that an [n,k] Hamming code can be constructed for any positive integer m ≥ 2.
By choosing m = 3 the resulting code with the following parameters,

code length n=7


information symbols k=4
minimum distance d=3

is the [7,4] Hamming code. This code encodes message words of length 4 as codewords of
length 7. A table of all 2^4 = 16 message words and their corresponding codewords, taken
from Lin and Costello (1983, p. 67), was rather useful for testing the programs presented in the
following sections and is included in Appendix E. This table can be constructed by using the
following equations (encoding rule) based on Pretzel (1996, p. 67):

c1 = m1 + m3 + m4
c2 = m1 + m2 + m3 (5.1.1)
c3 = m2 + m3 + m4

where {mi} denote the coordinates of the message word m = (m1 m2 m3 m4). The
corresponding codeword is formed as c = (c1 c2 c3 c4 c5 c6 c7) where c1, c2, c3 are determined by
the above equations and c4 = m1, c5 = m2, c6 = m3 and c7 = m4.

For example, the message word m = (0 1 0 1) is encoded as c = (1 1 0 0 1 0 1) since

c1 = m1 + m3 + m4 = 0 + 0 + 1 = 1
c2 = m1 + m2 + m3 = 0 + 1 + 0 = 1
c3 = m2 + m3 + m4 = 1 + 0 + 1 = 0


Note that the information symbols – that is, the message word coordinates – occupy the last 4
positions of the corresponding codeword and therefore, equations (5.1.1) perform systematic
encoding for the [7,4] Hamming code.

5.1.1.1 Perl Program for Systematic Encoding


This program, written in Perl programming language, performs systematic encoding for the
[7,4] Hamming code. It is based on the encoding rule described above and uses the
corresponding generator matrix G in systematic form,

1 1 0 1 0 0 0
0 1 1 0 1 0 0
G= 
1 1 1 0 0 1 0
 
1 0 1 0 0 0 1

Though comments have been added to the source code for clarity, a brief description of the
program follows.

The program takes a message word as input from the user. This message word, consisting of
4 bits, is encoded through matrix multiplication by the generator matrix G of the code, where
the operations are performed modulo 2. The output of the program is the corresponding
codeword.

#!/usr/bin/perl -w
#HamSysEncoder.pl
#the program performs systematic encoding for the [7,4] Hamming Code
#it uses the generator matrix G in systematic form

#initialisation of generator matrix G in systematic form


while (1) {
@g0=qw(1 1 0 1 0 0 0);
@g1=qw(0 1 1 0 1 0 0);
@g2=qw(1 1 1 0 0 1 0);
@g3=qw(1 0 1 0 0 0 1);

#input message word from user (any binary 4-tuple)


#assign it to @message
print "Enter a message word using spaces between the bits:\n";
chomp($mword=<STDIN>);


if ($mword eq ""){
print "Enter a message word first:\n";
next;
}
@message=split(/\s+/,$mword);

#multiply message word by matrix G

#for each message word coordinate=0, make the corresponding G row=0


#else leave it as it is
if ($message[0]==0){
for($j=0;$j<7;$j++){
$g0[$j]=0;
}
}
if ($message[1]==0){
for($k=0;$k<7;$k++){
$g1[$k]=0;
}
}
if ($message[2]==0){
for($l=0;$l<7;$l++){
$g2[$l]=0;
}
}
if ($message[3]==0){
for($h=0;$h<7;$h++){
$g3[$h]=0;
}
}

#codeword coordinates produced by adding the rows of G coordinatewise

for($n=0;$n<7;$n++){
$codeword[$n]=($g0[$n]+$g1[$n]+$g2[$n]+$g3[$n])%2;
}

#output corresponding codeword (to be transmitted)


print "The corresponding codeword is u = ";
for ($i=0;$i<7;$i++){
print "$codeword[$i] ";
}

#continue encoding?
print "\n\nDo you want to encode another message word? (Y/N) ";
chomp($yesorno=<STDIN>);
while (($yesorno ne 'Y')and($yesorno ne 'y')and($yesorno ne 'N')
and($yesorno ne 'n')){
print "Enter Y or N\n";
chomp ($yesorno=<STDIN>);
}
if (($yesorno eq 'N') or ($yesorno eq 'n')){
last;
}
}

5.1.1.2 Perl Program for Syndrome Decoding


The following program, written in Perl programming language, performs syndrome decoding
for the [7,4] Hamming code, based on the principles described in section 4.5 and specifically
on the four-step decoding process for Hamming codes presented in section 5.1.

Given the generator matrix G, used for systematic encoding in the previous section, the
corresponding parity-check matrix H in systematic form for the [7,4] Hamming code is of the
form

1 0 0 1 0 1 1 
H = 0 1 0 1 1 1 0 
 
0 0 1 0 1 1 1 


The table of Appendix E, and in particular its second column, can be used for testing the
program. A brief description of the program follows.

The program takes a received word as input from the user. This word of length 7 is
multiplied by the transpose of the parity-check matrix H to obtain its syndrome for error
detection / correction. All single errors are detected and corrected. Double errors yield a
non-zero syndrome but cannot be corrected and result in incorrect decoding, since the [7,4]
Hamming code can only correct single errors. When no errors occur, the output of the
program is the actual transmitted codeword.

#!/usr/bin/perl -w
#HamSyndDecoder.pl
#the program performs syndrome decoding for [7,4] Hamming code
#detects and corrects ALL single errors
#detects but cannot correct double errors (results in incorrect
#decoding)

#initialisation of parity-check matrix H for [7,4] Hamming code


while (1) {
@h0=qw(1 0 0 1 0 1 1);
@h1=qw(0 1 0 1 1 1 0);
@h2=qw(0 0 1 0 1 1 1);

#input received word from user and assign it to @rwbits


print"\nEnter the received word using spaces between the bits: ";
chomp($rword=<STDIN>);
@rwbits=split(/\s+/,$rword);

#compute the syndrome s of the received word in 3 steps


#multiply the received word by transpose of matrix H

#first syndrome coordinate


#multiply received word by first column of transpose of H
for ($j=0;$j<7;$j++){
$mul1[$j]=$rwbits[$j]*$h0[$j];
}
#add the subsums to obtain the first syndrome coordinate
$c1=0;
for ($a=0;$a<7;$a++){
$c1=$c1+$mul1[$a];
}
#perform modulo 2 addition
$syndrome[0]=$c1%2;

#second syndrome coordinate


#multiply received word by second column of transpose of H
for ($k=0;$k<7;$k++){
$mul2[$k]=$rwbits[$k]*$h1[$k];
}
$c2=0;
for ($b=0;$b<7;$b++){
$c2=$c2+$mul2[$b];
}
#perform modulo 2 addition
$syndrome[1]=$c2%2;

#third syndrome coordinate


#multiply received word by third column of transpose of H
for ($l=0;$l<7;$l++){
$mul3[$l]=$rwbits[$l]*$h2[$l];
}
$c3=0;
for ($c=0;$c<7;$c++){
$c3=$c3+$mul3[$c];
}
#perform modulo 2 addition
$syndrome[2]=$c3%2;


#detect and locate error position


#check if syndrome is equal to a column of H to obtain error position
#if syndrome is zero, no errors occurred during transmission
for ($e=0;$e<7;$e++){
if(($syndrome[0]==$h0[$e])&&($syndrome[1]==$h1[$e])&&
($syndrome[2]==$h2[$e])){
push(@errorposition,$e);
}
}
print "\n\nThe syndrome of the received word is: @syndrome";

#error correction

if (@errorposition){
$numerrors=@errorposition;
#two errors detected, retransmission required
if ($numerrors==2){
print "\nErrors occurred. Make request for retransmission\n";
}
else {
#one error occurred, complement the error position bit in @rwbits
if ($rwbits[$errorposition[$#errorposition]]==0){
$rwbits[$errorposition[$#errorposition]]=1;
}
else {
$rwbits[$errorposition[$#errorposition]]=0;
}

#output correct codeword

$output=$errorposition[$#errorposition]+1;
print "\n\nError occurred in coordinate $output of the received word";
print "\nThe received word was corrected to: @rwbits\n";
}
#empty @errorposition for next iteration
@errorposition=();
}
#no error occurred, simply confirm correct transmission
else {
print "\n\nNo errors occurred during transmission!!";
}

#continue decoding?
print "\n\nDo you want to decode another received word? (Y/N)";
chomp ($yesorno=<STDIN>);
while (($yesorno ne 'Y')and($yesorno ne 'y')and($yesorno ne 'N')
and($yesorno ne 'n')){
print "Enter Y or N\n";
chomp ($yesorno=<STDIN>);
}
if (($yesorno eq 'N')or($yesorno eq 'n')){
last;
}
}

5.2 Golay Codes


Golay's observation, made upon calculating and factorising the number

1 + C(11,1)·2 + C(11,2)·2^2

(where C(n,i) denotes the binomial coefficient), was that a perfect linear code could have as
parameters q = 3, n = 11, k = 6 and t = 2, since the above number is 243, which is equal to 3^5.
Recall that a linear code is perfect if it attains equality in Theorem 4-5. To construct such a
code, Golay considered the following parity-check matrix, Jones and Jones (2000, p. 137),

 
 1 1 1 2 2 0 1 0 0 0 0 
 1 1 2 1 0 2 0 1 0 0 0 
 
H =  1 2 1 0 1 2 0 0 1 0 0 
 1 2 0 1 2 1 0 0 0 1 0 

 1 0 2 2 1 1 0 0 0 0 1 
 

with n = 11 columns and n – k = 5 independent rows. Though a tedious task, it can be proved
that there are no sets of four linearly dependent columns, while there is a set of five.
Consequently, the minimum distance is d = 5, hence t = 2 according to Theorem 4-3. The
code C defined by H, attains equality in Theorem 4-5 since


∑_{i=0}^{t} C(n,i)·(q – 1)^i = 1 + C(11,1)·2 + C(11,2)·2^2 = 243 = 3^5

and since 3^5 = q^(n–k), the code is perfect and is called the ternary Golay code G11 of
length 11.
Inspired by the following equation

 23   23   23 
1 +   +   +   = 2048 = 211
1   2   3 

and motivated by Theorem 4-5, Golay took q = 2, n = 23, k = 12 and t = 3 to construct the
binary Golay code G23 of length 23. Similarly, a parity-check matrix H with 23 columns and
11 independent rows was used, in which the minimum number of linearly dependent columns
is seven. Thus, this code has minimum distance d = 7 and, from Theorem 4-3, it follows that
t = 3, making G23 a 3-error-correcting perfect code. Recall that it is perfect since it attains
equality in Theorem 4-5.
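Both sphere-packing computations can be checked mechanically; the following short fragment (Python, for illustration) verifies the two sums quoted above:

```python
from math import comb  # comb(n, i) is the binomial coefficient C(n, i)

# Ternary Golay G11: q = 3, n = 11, t = 2; the sphere size equals 3^5
assert sum(comb(11, i) * 2**i for i in range(3)) == 243 == 3**5

# Binary Golay G23: q = 2, n = 23, t = 3; the sphere size equals 2^11
assert sum(comb(23, i) for i in range(4)) == 2048 == 2**11
```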

5.3 Reed-Muller Codes


The Reed-Muller (RM) codes are a subclass of binary block codes, introduced by D.E. Muller
and I.S. Reed (1954). Their main advantage is the relatively simple implementation of the
decoding process. The code R(5) was used by ‘Mariner 9’ to transmit black and white
photographs of Mars in 1972.

5.3.1 Definition
For each positive integer m and each integer r satisfying 0 ≤ r ≤ m, the r-th order Reed-
Muller code R(r,m) is a binary linear code with parameters

n = 2^m ,   M = 2^(1 + C(m,1) + … + C(m,r)) ,   d = 2^(m–r)

(where C(m,i) denotes the binomial coefficient)

The special case R(0,m) is the repetition code. More attention is given to the first order
Reed-Muller codes (r = 1) which are binary linear codes. Among several ways of defining
the Reed-Muller codes, R(m), Roman (1997, p. 234) takes the following approach.

Definition 5-1: The Reed-Muller codes R(m) are binary codes defined for all integers m ≥ 1,
as follows:

1. R(1) = Z2^2 = {00, 01, 10, 11}


2. For m ≥ 1, R(m + 1) = {uu | u ∈ R(m)} ∪ {uu^c | u ∈ R(m)}, where u^c denotes the complement of u

In words, the codewords in R(m + 1) are formed by juxtaposing each codeword in R(m) with
itself and with its complement.
For example, given that R(1) = {00, 01, 10, 11}, according to the previous definition, the R(2)
code will consist of the codewords

R(2) = {0000, 0011, 0101, 0110, 1010, 1001, 1111, 1100}
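The recursive step of Definition 5-1 is short enough to execute directly. The sketch below (Python, for illustration) regenerates R(2) from R(1):

```python
def rm_next(code):
    """One recursion step of Definition 5-1: R(m) -> R(m + 1).
    Codewords are bit strings; each word is juxtaposed with itself
    and with its complement."""
    comp = lambda u: u.translate(str.maketrans("01", "10"))
    return {u + u for u in code} | {u + comp(u) for u in code}

R1 = {"00", "01", "10", "11"}
R2 = rm_next(R1)
print(sorted(R2))  # the eight codewords of R(2) listed above
```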

5.3.2 Properties of RM Codes


For m ≥ 1 the Reed-Muller code R(m) is a linear (2^m, 2^(m+1), 2^(m–1)) code for which every
codeword except 0 and 1 has weight 2^(m–1). Note that R(2) is a linear (2^2, 2^3, 2^1) code in which
every codeword except 0 and 1 has weight 2^(2–1) = 2. Likewise, R(1) is a linear (2^1, 2^2, 2^0) code
with weight 2^(1–1) = 1 for all codewords except 0 and 1.

Furthermore, the above definition leads to the construction of a generator matrix for R(m)
based on the generator matrix for R(1), as stated in the following theorem, Roman (1997,
p. 236).
0 1
Theorem 5-1: A generator matrix for R(1) is R1 =  
1 0
If Rm is a generator matrix for R(m), then a generator matrix for R(m + 1) is

 0…0 1…1 
Rm+1

= 


 

The first row of a generator matrix Rm for R(m) consists of 2^(m–1) 0s followed by 2^(m–1) 1s. The
i-th row of Rm is formed by alternating blocks of 0s and 1s of length 2^(m–i). As for the m-th row
of Rm, it consists of alternating 0s and 1s, since the blocks are of length 2^(m–m) = 2^0 = 1. The last
row consists of all 1s.

For example, the generator matrix for R(3) is

 0 0 0 0 1 1 1 1 
 
0 0 1 1 0 0 1 1 
R3 = 
 0 1 0 1 0 1 0 1 
 
 1 1 1 1 1 1 1 1 


5.4 Hadamard Codes


A class of codes, namely Hadamard codes, can be constructed by applying matrix theory. In
particular, a class of matrices called Hadamard matrices is the basis for such codes.
In an attempt to find the maximum determinant of a real n×n matrix H = (hij), denoted by
det H, for a given value of n, Hadamard bounded the entries of H such that |hij| ≤ 1 and
proved that |det H| ≤ n^(n/2).

Furthermore, Hadamard proved that |det H| = n^(n/2) if and only if:

i. each hij = ±1
ii. distinct rows ri, rj of H are orthogonal; that is, ri·rj = 0 for all i ≠ j

An n×n matrix H that satisfies both of the above conditions is called a Hadamard matrix of order n.

For example,

H = [ 1  1 ]
    [ 1 –1 ]

is a Hadamard matrix of order 2.

As proven in the next theorem, Hadamard matrices generate binary codes, Jones and Jones
(2000, p. 116).

Theorem 5-2: Each Hadamard matrix of order n gives rise to a binary code of length n, with
M = 2n codewords and minimum distance d = n / 2.

Indeed, if 2n vectors ±r1, ±r2, ..., ±rn are formed from the rows ri of H, the orthogonality in
condition (ii) implies that these vectors are distinct. If each entry of –1 is replaced by 0, the
2n vectors can be regarded as vectors over GF(2), generating a binary code. Its codewords are
u1, ū1, …, un, ūn, where ū = 1 – u denotes the complement of u, and d(ui, ūi) = n, since they
differ in all coordinates.

Condition (ii) of Hadamard matrices sets the minimum distance for such a binary code to be
d = n / 2. Any code derived by Theorem 5-2 is called a Hadamard code of length n. The
Mariner 1969 space-probe used a Hadamard code of length 32 to transmit pictures.
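A small experiment confirms Theorem 5-2. The sketch below (Python, for illustration) builds a Hadamard matrix by the Sylvester doubling construction H → [[H, H], [H, –H]] — an assumption made here for the sake of the example; the theorem itself applies to any Hadamard matrix — and derives the binary code from its rows:

```python
def sylvester(m):
    """Hadamard matrix of order 2^m by Sylvester doubling."""
    H = [[1]]
    for _ in range(m):
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def hadamard_code(H):
    """The rows +-r_i of H with every -1 replaced by 0:
    a binary code of length n with 2n codewords."""
    rows = [tuple(r) for r in H] + [tuple(-x for x in r) for r in H]
    return {tuple(1 if x == 1 else 0 for x in r) for r in rows}

C = hadamard_code(sylvester(3))  # length 8, 16 codewords, minimum distance 4
```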


6. MATHEMATICAL BACKGROUND II

The error detecting / correcting codes discussed so far were constructed using algebraic
structures introduced in Chapter 3. In this Chapter, a larger portion of the available algebraic
tools is presented, since the application of more advanced algebraic techniques leads to a
significant increase in the power of the resulting codes. One such class is that of the so-called
cyclic codes, presented in the following Chapter. These codes can correct multiple
errors and can cope with error bursts, that is, errors that do not occur entirely independently
of each other but affect several neighbouring bits. For instance, a 2.5mm scratch on a
compact disc accounts for approximately 4,000 erroneous bits. The class of cyclic codes
includes certain families of codes, such as BCH and Reed-Solomon codes, which are widely
adopted in practical applications. In order to describe these codes, it is necessary to consider
finite fields.

6.1 Structure of Finite Fields


6.1.1 Basic Properties of Finite Fields
In the study of error detecting / correcting codes the interest is in polynomials with
coefficients taken from a finite field. By Definition 3-3, a field is a set F whose elements
form an additive commutative group while its non-zero elements form a multiplicative
commutative group and the distributive law holds.

Consider the set of integers {0, 1, 2, …, p – 1}. Under addition modulo p these elements form
an additive commutative group, by Theorem 3-1. Under multiplication modulo p the subset
of elements {1, 2, …, p – 1} forms a multiplicative commutative group, by Theorem 3-2. If
the two operations are allowed to distribute, which is the case in integer arithmetic, then the
above set is a field. Hence, the integers {0, 1, 2, …, p – 1} where p is a prime, form a finite
field of order p under modulo p addition and multiplication, denoted by GF(p).

Finite fields of order p m , p a prime and m a positive integer, can be constructed as vector
spaces over the prime-order field GF(p), as it shall be demonstrated in section 6.1.3.

The order of a finite or Galois field element is of significant importance since it determines
some of its basic properties.

Consider the finite field GF(q) and the following sequence of elements

1, ß, ß^2, ß^3, …


where ß is an element of GF(q) and 1 is the multiplicative identity. Since ß ∈ GF(q), the
property of closure under multiplication implies that all successive powers of ß must also be
in GF(q). However, GF(q) has only a finite number of elements. Thus, for some power of ß
the sequence begins to repeat values found earlier in the sequence. The first element to repeat
must be 1 and this can be proven by contradiction. Assume that ß^x ≠ 1 is the first repeating
element and that it repeats ß^y, where 0 < y < x, so that ß^x = ß^y. It follows that ß^(x–y) = 1,
where 0 < x – y < x, so 1 repeats earlier in the sequence. Since ß^x ≠ 1 was assumed to be the
first repeating element, there is a contradiction. Thus, the first element to repeat is 1.

The above concept is related to the order of a Galois field element, which is defined as
follows.

Definition 6-1: Let ß be an element in GF(q). The order of ß, denoted by ord(ß), is the
smallest positive integer t such that ß^t = 1.

Note that for a Galois field element ß, ord(ß) is defined using the multiplicative operation
whereas the order of a group element is defined using the additive operation. Furthermore,
unlike a group, the order of a finite field completely specifies the field. A finite field of order
q, GF(q), is unique up to isomorphisms 8 . Thus, two finite fields of the same order are always
identical up to labelling of their elements regardless of how the fields were constructed. The
order of a non-zero element ß in GF(q) must satisfy certain requirements, as dictated by
Theorems 6-1 and 6-2, Wicker (1995, p. 34).

Theorem 6-1: If ord(ß) = t for some ß ∈ GF(q) then t | (q – 1)


(where t | (q – 1) means that t divides (q – 1)).

Theorem 6-2: Let a and ß be elements in GF(q) such that ß = a^i. If ord(a) = t then
ord(ß) = t / GCD(i,t)
(where GCD(i,t) denotes the greatest common divisor of i and t).

Theorem 6-1 states that the order of an element ß in GF(q) must be a divisor of (q – 1). For
example in GF(16) the elements can only have orders {1, 3, 5, 15}. It is possible to
determine the number of elements in the field of any given order. This information is

8
Two fields F and F´ are called isomorphic if there exists a map f from F to F´ such that:
i. f is bijective
ii. for any a,b ∈ F, f (a·b) = f (a)·f (b) and f (a + b) = f (a) + f (b)
The map f is called an isomorphism.


obtained by the Euler φ function, defined as the number of integers in the set {1, …, t – 1}
that are relatively prime 9 to t. Hence,

φ(t) = | {1 ≤ i < t | GCD(i,t) = 1} |

Theorem 6-1 and the Euler φ function are combined in Theorem 6-3, which describes the
multiplicative structure of finite fields, Wicker (1995, p. 36).

Theorem 6-3: (The Multiplicative Structure of Galois Fields)


Consider the Galois field GF(q).
1. If t does not divide (q – 1), then there are no elements of order t in GF(q)
2. If t | (q – 1), then there are φ(t) elements of order t in GF(q)

Important results related to the elements of order (q – 1) in GF(q) rest on Theorem 6-3.

Definition 6-2: An element with order (q – 1) in GF(q) is called a primitive element in GF(q).

As an immediate consequence of Theorem 6-3, it follows that in every finite field GF(q) there
are exactly φ(q – 1) primitive elements. The notion of primitive elements is central to the
analysis of certain powerful classes of codes, as will be demonstrated in Chapter 8.

The fact that φ(t) is always greater than zero for positive t, combined with the above
corollary, dictates that every finite field GF(q) contains at least one primitive element.
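Theorem 6-3 can be observed directly in a small field. The Python sketch below (an illustration only) counts, for each divisor t of q – 1 in GF(7), the elements of order t and compares the count with φ(t):

```python
from math import gcd

q = 7  # the prime field GF(7)

def order(b):
    # smallest positive t with b^t = 1 (mod q)
    t, x = 1, b
    while x != 1:
        x = (x * b) % q
        t += 1
    return t

def phi(t):
    # Euler phi function
    return sum(1 for i in range(1, t + 1) if gcd(i, t) == 1)

for t in (1, 2, 3, 6):  # the divisors of q - 1 = 6
    count = sum(1 for b in range(1, q) if order(b) == t)
    print(t, count, phi(t))  # count equals phi(t) in every case
```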

Consider the following sequence that comprises successive powers of a primitive element a in
GF(q)

1, a, a^2, …, a^(q–2), a^(q–1), a^q, …

Since a is a primitive element, ord(a) = q – 1 and thus a^(q–1) is the first power of a that repeats
the value 1. It can be shown that 1 is the first element to repeat, in a way analogous to the
argument developed for the order of a field element. Consequently, the first (q – 1) elements in the
above sequence are distinct and repetitions start from the (q – 1)-th and higher powers of a. Since
the powers of a are non-zero, the first (q – 1) elements in the sequence must comprise the
9
Integers that have no common divisors with t except 1.


(q – 1) non-zero elements in GF(q). Thus, all non-zero elements in GF(q) can be represented
as (q – 1) consecutive powers of a primitive element a ∈ GF(q).

Consider the following sequence of sums of the multiplicative identity element 1 in a finite
field GF(q)

∑_{i=1}^{1} 1 = 1,   ∑_{i=1}^{2} 1 = 1 + 1,   ∑_{i=1}^{3} 1 = 1 + 1 + 1,   …,   ∑_{i=1}^{k} 1 = 1 + 1 + … + 1 (k times),   …

Closure of the field GF(q) under addition implies that these sums are elements in the field.
Since GF(q) has a finite number of elements, these sums cannot all be distinct and thus, the
sequence must begin to repeat at some point. That is, there must exist two positive integers j
and k (j < k) such that

∑_{i=1}^{j} 1 = ∑_{i=1}^{k} 1

Thus, ∑_{i=1}^{k–j} 1 = 0. Consequently, there must exist a smallest positive integer λ such that
∑_{i=1}^{λ} 1 = 0. This integer λ is called the characteristic of GF(q). Further, the following
theorem, Lin and Costello (1983, p. 22), determines λ to be a prime integer.

Theorem 6-4: The characteristic λ of a finite field is prime.

As mentioned earlier, a Galois field GF(p) can be constructed by reducing the set of integers
modulo p, where p is a prime number. The following theorem, Berlekamp (1968, p. 102), in
combination with Theorem 6-4, extends the available range of finite fields by establishing that
the order of a Galois field may also be a power of a prime.

Theorem 6-5: The order of a finite field is a power of its characteristic.

6.1.2 Primitive Polynomials


Groups were obtained by applying a single binary operation and certain restrictions to a set of
elements. The introduction of a second binary operation results in the structures of rings and
fields.

Definition 6-3: A ring is a set of elements R with two binary operations ‘+’ and ‘·’ such that
the following requirements are satisfied:
1. R forms an additive commutative group
The additive identity element is ‘0’


2. The operation ‘·’ is associative:


a·(b·c) = (a·b)·c for all a,b,c ∈ R
3. The operation ‘·’ distributes over ‘+’:
a·(b + c) = a·b + a·c
(b + c)·a = b·a + c·a for all a,b,c ∈ R

A ring is said to be a commutative ring if:


4. The operation ‘·’ commutes:
a·b = b·a for all a,b ∈ R

A ring is said to be a ring with identity if:


5. The operation ‘·’ has an identity element ‘1’, such that:
a·1 = 1·a = a for all a ∈ R

If a ring satisfies all five of the above requirements, it is said to be a commutative ring with identity.

Based on the definition of a ring, a field F can be considered as a ring R for which R –{0} is a
commutative group.

The collection of all polynomials a0 + a1x + a2x^2 + … + anx^n of arbitrary degree with
coefficients in the finite field GF(p), denoted by GF(p)[x], forms a commutative ring with
identity. The additive and multiplicative operations are performed using standard polynomial
addition and multiplication as follows:

• additive operation:
(a0 + a1x + a2x^2 + … + anx^n) + (b0 + b1x + b2x^2 + … + bnx^n) =
= (a0 + b0) + (a1 + b1)x + … + (an + bn)x^n

• multiplicative operation:
(a0 + a1x + a2x^2 + … + anx^n) · (b0 + b1x + b2x^2 + … + bmx^m) =
= (a0b0) + [(a0b1) + (a1b0)]x + … + anbmx^(n+m)

Note that the coefficient operations are performed using the operations for the field from
which they were taken. For example, in GF(2) addition modulo 2 dictates that 1 + 1 = 0.
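These two operations are straightforward to implement. The sketch below (Python, for illustration, with polynomials stored as coefficient lists, lowest degree first) performs addition and multiplication in GF(2)[x]; note that (1 + x)·(1 + x) = 1 + x^2, precisely because 1 + 1 = 0:

```python
def poly_add(a, b):
    # coordinatewise addition of coefficients, reduced modulo 2
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return [(x + y) % 2 for x, y in zip(a, b)]

def poly_mul(a, b):
    # convolution of coefficient lists, each term reduced modulo 2
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % 2
    return out

print(poly_mul([1, 1], [1, 1]))  # (1 + x)^2 = 1 + x^2, i.e. [1, 0, 1]
```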

A polynomial f(x) is irreducible in GF(p)[x] if f(x) cannot be factored into a product of lower
degree polynomials. The notion of an irreducible polynomial is, in a way, analogous to that of
a prime number p for numbers. Yet, a polynomial f(x) may be irreducible in one ring and
reducible in another. For this reason, the notion is used with respect to a specific ring of
polynomials.

An irreducible polynomial that satisfies the property indicated by the following definition is
central for constructing a non-prime order Galois field and is called primitive polynomial.

Definition 6-4: An irreducible polynomial p(x) ∈ GF(p)[x] of degree m is called primitive if
the smallest positive integer n for which p(x) divides x^n – 1 is n = p^m – 1.

For example, in GF(2)[x] the polynomial x^3 + x + 1 is primitive (m = 3, p = 2) since the
smallest degree polynomial of the form x^n – 1 for which x^3 + x + 1 is a divisor is x^7 – 1,
and 7 = 2^3 – 1. In contrast, the polynomial x^4 + x^3 + x^2 + x + 1 is not primitive in GF(2)[x]
because the smallest degree polynomial of the form x^n – 1 for which x^4 + x^3 + x^2 + x + 1
is a divisor is x^5 – 1, and 5 ≠ 2^4 – 1. The polynomial x^4 + x^3 + x^2 + x + 1 is irreducible
in GF(2)[x] but not primitive.
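The divisibility test of Definition 6-4 can be automated. The sketch below (Python, for illustration) reduces x^n – 1 modulo p(x) over GF(2) — where x^n – 1 = x^n + 1 — and searches for the smallest n giving remainder zero, reproducing the two examples above:

```python
def poly_mod(a, m):
    """Remainder of a(x) modulo m(x) over GF(2); polynomials are
    coefficient lists, lowest degree first. [] is the zero polynomial."""
    a = a[:]
    while len(a) >= len(m) and any(a):
        if a[-1]:
            # subtract (= add, mod 2) m(x) shifted up to the leading term
            shift = len(a) - len(m)
            a = [(c + (m[i - shift] if 0 <= i - shift < len(m) else 0)) % 2
                 for i, c in enumerate(a)]
        while a and a[-1] == 0:   # drop cancelled leading terms
            a.pop()
    return a

def smallest_n(p):
    """Smallest n with p(x) | x^n - 1 over GF(2), where x^n - 1 = x^n + 1."""
    n = 1
    while True:
        if not poly_mod([1] + [0] * (n - 1) + [1], p):   # 1 + x^n
            return n
        n += 1

print(smallest_n([1, 1, 0, 1]))      # x^3 + x + 1: gives 7 = 2^3 - 1, primitive
print(smallest_n([1, 1, 1, 1, 1]))   # x^4 + x^3 + x^2 + x + 1: gives 5, not primitive
```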

As demonstrated in the above example, an irreducible polynomial in GF(p)[x] is not always
primitive. However, a primitive polynomial in GF(p)[x] is always irreducible in GF(p)[x],
according to Definition 6-4.

With regard to the roots of primitive polynomials in GF(p)[x], which can be found in the
finite field of order p^m, Theorem 6-6 states that they are of order (p^m – 1), Wicker (1995,
p. 41).

Theorem 6-6: The roots {ai} of an m-th degree primitive polynomial in GF(p)[x] have
order (p^m – 1).

It can also be shown that all of the roots of an irreducible polynomial have the same order.

6.1.3 Finite Fields of Order p^m

The following analysis aims to demonstrate the construction of a finite field with p^m elements,
GF(p^m), from the field GF(p). Such a field contains the roots of primitive polynomials over
GF(p) – which cannot be found in GF(p) – as illustrated in the following discussion.

If p(x) = x^m + a_(m–1)x^(m–1) + … + a1x + a0 is a primitive, and in general irreducible, polynomial of
degree m over a finite field GF(p) and p(x) is not linear, then it can have no root in GF(p).


However, we can extend GF(p) to a larger field in which p(x) has a root, by a method similar
to that which constructs the complex numbers from the reals.

We adjoin a new symbol a to GF(p) and form all formal sums a0 + a1a + … + a_(m–1)a^(m–1) of
degree less than m = deg(p(x)).

These may be added and multiplied like polynomials, except that the multiplication is
performed modulo p(a). This requirement ensures that the set of formal sums is closed under
the above operations. It may be shown that the resulting set with these operations is a finite
field containing GF(p). Furthermore, if a is a root of p(x) then

p(a) = a0 + a1a + … + a_(m–1)a^(m–1) + a^m = 0

hence, a^m = –a0 – a1a – … – a_(m–1)a^(m–1)

Since a is of order (p^m – 1), by Theorem 6-6, the (p^m – 1) distinct powers of a must have
(p^m – 1) corresponding non-zero formal sums of the form

b0 + b1a + … + b_(m–1)a^(m–1)

The coefficients {bi} are taken from GF(p), so there are exactly (p^m – 1) distinct non-zero
formal sums or polynomial representations for the (p^m – 1) powers of a.

It can be shown that these (p^m – 1) polynomials, together with zero, form an additive group
under polynomial addition. It can also be shown that they form a multiplicative group under
polynomial multiplication performed modulo p(a) and, for the operations thus defined,
multiplication distributes over addition. As a result, the (p^m – 1) polynomial representations
together with zero form a finite field of order p^m, GF(p^m).

The non-zero elements of this finite field can be represented as (p^m – 1) consecutive powers of
a, or as polynomials in a of degree less than m with coefficients in GF(p). We return to the more
general topic of rings of polynomials in sections 6.2.4 and 7.2, where their significance in the
construction of cyclic codes is explained.

An application of the above concepts, leading to the construction of GF(8), is demonstrated in
the following example.


Consider the commutative ring with identity GF(2)[x]. The polynomial p(x) = x^3 + x + 1 is a
primitive polynomial in GF(2)[x]. Let a be a root of p(x). Then,

a^3 + a + 1 = 0 ⇒ a^3 = a + 1

Also, m = 3 since p(x) is of degree 3. The mapping between the distinct powers of a and the
polynomials in a of degree at most m – 1 = 2 is as follows.

0     0
a^0   1
a^1   a
a^2   a^2
a^3   a + 1           since a^3 = a + 1
a^4   a^2 + a         since a^4 = a^3·a = (a + 1)a = a^2 + a
a^5   a^2 + a + 1     since a^5 = a^4·a = (a^2 + a)a = a^3 + a^2 = (a + 1) + a^2
a^6   a^2 + 1         since a^6 = a^5·a = (a^2 + a + 1)a = a^3 + a^2 + a = (a + 1) + a^2 + a = a^2 + 1

As depicted in the above mapping between the exponential (first column) and polynomial
(second column) representations, the 7 distinct powers of a have 7 distinct representations as
polynomials in a of degree less than 3.
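The table above can be generated mechanically: store a formal sum b_0 + b_1a + b_2a^2 as the bit pattern (b_0, b_1, b_2) and repeatedly multiply by a, substituting a^3 = a + 1 whenever an a^3 term appears. A minimal sketch in Python (the field and reduction rule are from the example above; the helper names are our own):

```python
# Build the power <-> polynomial table for GF(8) using p(x) = x^3 + x + 1.
# An element b0 + b1*a + b2*a^2 is stored as the integer b0 + 2*b1 + 4*b2.

def next_power(elem):
    """Multiply an element of GF(8) by a, reducing with a^3 = a + 1."""
    elem <<= 1              # multiply every term by a
    if elem & 0b1000:       # an a^3 term appeared ...
        elem ^= 0b1000      # ... remove it
        elem ^= 0b011       # ... and substitute a + 1 (coefficients in GF(2))
    return elem

table = {}
elem = 1                    # a^0 = 1
for i in range(7):
    table[i] = elem         # table[i] is the polynomial form of a^i
    elem = next_power(elem)

print(table[3])   # 3 = 0b011, i.e. a^3 = 1 + a
print(table[6])   # 5 = 0b101, i.e. a^6 = 1 + a^2
```

Running the loop reproduces all seven rows of the table, and multiplying a^6 by a once more returns 1, confirming that a has order 7.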

Another useful representation for the field elements in GF(p^m) is the vector representation. If
b_0 + b_1a + … + b_(m–1)a^(m–1) is the polynomial representation of an element ß ∈ GF(p^m), then ß can
be represented by the vector whose m coordinates are the m coefficients of the polynomial
representation of ß. Hence,

b_0 + b_1a + … + b_(m–1)a^(m–1) ↔ (b_0, b_1, …, b_(m–1))

The zero element is represented by the all-zero vector (0, 0, …, 0) of dimension m. In this
manner, each distinct power of a, or equivalently each field element in GF(p^m), is associated


with a vector of dimension m with coordinates in GF(p). This representation allows the
operation of addition in GF(p^m) to be reduced to vector addition over GF(p). Clearly, GF(p^m)
forms a vector space over GF(p).

With regard to the previous example, a vector space representation for GF(8) can be obtained
by using the set {1, a, a^2} as a basis. Since all the non-zero elements of GF(8) can be
represented as polynomials in a of degree m – 1 = 2 or less, by taking their vector
representation the operations in GF(8) are reduced to vector operations. Each distinct power
of a is associated with a vector of dimension 3 as:

0                 ↔ (0, 0, 0)
a                 ↔ (0, 1, 0)
a^2               ↔ (0, 0, 1)
a^3 = a + 1       ↔ (1, 1, 0)
a^4 = a^2 + a     ↔ (0, 1, 1)
a^5 = a^2 + a + 1 ↔ (1, 1, 1)
a^6 = a^2 + 1     ↔ (1, 0, 1)
a^7 = 1           ↔ (1, 0, 0)

The above vector representation allows the operation of addition in the field GF(2^3) to be
reduced to vector addition over the field GF(2).
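Concretely, addition in this representation is componentwise addition modulo 2. For instance, a^3 + a^4 = (1, 1, 0) + (0, 1, 1) = (1, 0, 1) = a^6, a fact easily checked in a few lines (the vectors below are taken from the table above):

```python
# Addition in GF(8) reduced to vector addition over GF(2).
vec = {                     # power of a -> coordinate vector (b0, b1, b2)
    3: (1, 1, 0),           # a^3 = 1 + a
    4: (0, 1, 1),           # a^4 = a + a^2
    6: (1, 0, 1),           # a^6 = 1 + a^2
}

def add(u, v):
    """Vector addition over GF(2): add coordinates modulo 2."""
    return tuple((x + y) % 2 for x, y in zip(u, v))

print(add(vec[3], vec[4]))            # (1, 0, 1)
print(add(vec[3], vec[4]) == vec[6])  # True
```

Note also that every element is its own additive inverse: adding any vector to itself gives the all-zero vector, as expected in characteristic 2.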

In general, a finite field GF(p^m), constructed using an m-th degree polynomial in GF(p)[x],
contains GF(p) and can be viewed as a construction over GF(p). Consequently, fields of
prime-power order, GF(p^m), are called extensions of the prime order field GF(p), which is
referred to as the ground field of GF(p^m).

Just as a group may contain more than one subgroup, so may a finite field contain subfields
other than GF(p), p a prime. In fact, GF(p^m) contains all Galois fields of order p^b where b
divides m. For example, the field GF(64) contains GF(2^6), GF(2^3), GF(2^2) and GF(2), all
being proper subfields except for GF(2^6).


6.2 Polynomials over Galois Fields

6.2.1 Euclid's Algorithm
As mentioned earlier, the set of all polynomials with coefficients taken from the field GF(q),
denoted by GF(q)[x], forms a commutative ring with identity. Through the definition of a
certain function g and the introduction of the cancellation property, rings of polynomials form
Euclidean domains.

Definition 6-5: A Euclidean domain is a set D with two operations '+' and '·' such that:
1. D forms a commutative ring with identity
2. Cancellation: if a·b = b·c, b ≠ 0, then a = c
3. There exists a function g: D – {0} → N such that:
   i. g(a) ≤ g(a·b) for every non-zero b ∈ D
   ii. for all non-zero a, b ∈ D with g(a) ≥ g(b), there exist q and r
       such that a = q·b + r, where r = 0 or g(r) < g(b)
       (q is called the quotient and r the remainder)

Note that for the additive identity element, the value g(0) is taken by convention to be −∞.
The ring of polynomials f(x) over a finite field GF(q), with the function g defined as
g(f(x)) = deg(f(x)), forms a Euclidean domain. Put more simply, with each polynomial in
GF(q)[x] we associate an integer which is the degree of the polynomial. For example,
f(x) = x^3 + x + 1 ∈ GF(2)[x] is associated with the integer 3, which equals deg(f(x)).

The introduction of the function g as defined above allows the operation of division in a
Euclidean domain.

For a and b in a Euclidean domain D, a is said to be a divisor of b, denoted by a | b, if there
exists c ∈ D such that a·c = b. An element a, which may be a polynomial, is said to be a
common divisor of a collection of elements {b_1, b_2, …, b_n} if a | b_i for i = 1, …, n.

A general procedure for finding the greatest common divisor of any two polynomials, also
valid for integers, is Euclid's algorithm. Euclid's algorithm for polynomials in a
Euclidean domain D is as follows.

Given a(x) and b(x) ≠ 0, there exist polynomials s(x) and d(x) such that

a(x)·s(x) + b(x)·d(x) = GCD(a(x), b(x))


The process consists of dividing appropriate polynomials until a remainder of 0 results. The
following algorithm determines these appropriate polynomials.

Euclid's Algorithm

Step 1: (initialisation)
        Let r_(–1)(x) = a(x)
        and r_0(x) = b(x)

Step 2: (recursion formula)
        If r_(i–1)(x) ≠ 0 then
        r_i(x) = r_(i–2)(x) – r_(i–1)(x)·s_i(x), where deg(r_i(x)) < deg(r_(i–1)(x))

Step 3: If r_i(x) = 0 then
        r_(i–1)(x) = GCD(a(x), b(x))
        else, repeat Step 2 (for i = i + 1)

Note that with each iteration of the recursion formula the degree of r_i(x) gets smaller, since it
is strictly less than the degree of the divisor r_(i–1)(x). Thus, the algorithm
terminates after a finite number of steps.
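For polynomials over GF(2) the algorithm above is short to implement: represent a polynomial by an integer whose bits are its coefficients and repeat the division step until the remainder vanishes. A sketch under that representation (helper names are our own, not from any particular library):

```python
# Euclid's algorithm for polynomials over GF(2).
# A polynomial is an integer: bit i is the coefficient of x^i,
# e.g. 0b1011 = x^3 + x + 1.

def deg(p):
    return p.bit_length() - 1

def poly_mod(a, b):
    """Remainder of a(x) divided by b(x) over GF(2)."""
    while a and deg(a) >= deg(b):
        a ^= b << (deg(a) - deg(b))   # subtract (= XOR) a shifted copy of b
    return a

def poly_gcd(a, b):
    """GCD(a(x), b(x)): iterate the recursion until the remainder is 0."""
    while b:
        a, b = b, poly_mod(a, b)
    return a

# x^3 + x + 1 divides x^7 + 1, so it is the GCD of the two:
print(bin(poly_gcd(0b10000001, 0b1011)))   # 0b1011
# two distinct irreducible polynomials are coprime:
print(poly_gcd(0b1011, 0b1101))            # 1
```

Subtraction over GF(2) is the same as addition, so the recursion-formula subtraction becomes a single XOR of a shifted divisor.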

The polynomials s(x) and d(x) can be obtained by applying the process described in Euclid's
algorithm in reverse. Supposing that GCD(a(x), b(x)) = r_2(x) was found after two iterations, it
follows from the recursion formula that

r_2(x) = r_0(x) – r_1(x)·s_2(x)
       = r_0(x) – (r_(–1)(x) – r_0(x)·s_1(x))·s_2(x)
       = r_0(x)·(1 + s_1(x)·s_2(x)) + r_(–1)(x)·(–s_2(x))

Hence, s(x) = –s_2(x)
and d(x) = 1 + s_1(x)·s_2(x)
(recall that from Step 1, r_(–1)(x) = a(x) and r_0(x) = b(x))

Consequently, the linear combination a(x)·s(x) + b(x)·d(x) = r_2(x) can be determined, which
gives GCD(a(x), b(x)).


6.2.2 Minimal Polynomials

Polynomials over a finite field GF(q) that have a set of roots selected from GF(q^m) are central
to the understanding of important classes of error control codes. In addition, these
polynomials need to have coefficients taken from the subfield GF(q). The main issue to be
addressed is the following: if a polynomial p(x) with coefficients in GF(q) is required to have
roots from the field GF(q^m), what are the other roots this polynomial must have?

Definition 6-6: Let a be an element in the field GF(q^m). The minimal polynomial of a with
respect to GF(q) is the smallest-degree monic7 (and thus non-zero) polynomial
p(x) in GF(q)[x] such that p(a) = 0.

In other words, the minimal polynomial p(x) ∈ GF(q)[x] for a is the smallest-degree non-zero
polynomial in GF(q)[x] that contains the specified root a.

Properties of the minimal polynomial are stated in the following theorem, Wicker (1995,
p. 54).

Theorem 6-7: For each element a in GF(q^m) there exists a unique monic (and thus non-zero)
polynomial p(x) of minimal degree in GF(q)[x] such that the following are
true:
1. p(a) = 0
2. The degree of p(x) is less than or equal to m
3. f(a) = 0 implies that f(x) is a multiple of p(x)
4. p(x) is irreducible in GF(q)[x]

Consequently, there exist polynomials with coefficients taken from the field GF(q) which
have a specified set of roots – the elements of GF(q^m). These polynomials are the minimal
polynomials for each element a in GF(q^m). The next theorem, Wicker (1995, p. 56),
determines the other roots of the minimal polynomial of a as the conjugates of a with respect
to GF(q). These are defined as follows.

Let a be an element in the Galois field GF(q^m). The conjugates of a with respect to GF(q) are
the elements

7 A polynomial f(x) is said to be monic if the coefficient of the highest power of x in f(x) is equal to 1.


a, a^q, a^(q^2), a^(q^3), …

The conjugacy class of a with respect to GF(q) is the set of conjugates of a with respect to
GF(q). It can be shown that the conjugacy class of a ∈ GF(q^m) with respect to GF(q) contains
d elements, where d is the smallest integer such that a^(q^d) = a.

Theorem 6-8: Let a be an element in GF(q^m). Let p(x) be the minimal polynomial of a with
respect to GF(q). The roots of p(x) are exactly the conjugates of a with respect
to GF(q).

Furthermore, the conjugates of an element a ∈ GF(q^m) can be used to obtain a form of the
minimal polynomial according to the following theorem, Lin and Costello (1983, p. 37).

Theorem 6-9: Let p(x) be the minimal polynomial of an element a in GF(q^m). Let d be the
smallest integer such that a^(q^d) = a. Then,

p(x) = ∏_(i=0)^(d–1) (x + a^(q^i))
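As a concrete check of Theorem 6-9, take a to be a root of x^3 + x + 1 in GF(8), the field built in section 6.1.3. The conjugacy class of a with respect to GF(2) is {a, a^2, a^4}, so d = 3 and p(x) = (x + a)(x + a^2)(x + a^4), which multiplies out to x^3 + x + 1. A sketch of this computation (exponent-table arithmetic; the helper names are our own):

```python
# Verify Theorem 6-9 in GF(8): the minimal polynomial of a (a root of
# x^3 + x + 1) is the product over its conjugacy class {a, a^2, a^4}.

# antilog[i] = polynomial form of a^i (bit j = coefficient of a^j)
antilog = [1]
for _ in range(6):
    e = antilog[-1] << 1
    if e & 0b1000:
        e ^= 0b1011          # reduce with a^3 + a + 1 = 0
    antilog.append(e)
log = {e: i for i, e in enumerate(antilog)}

def mul(u, v):               # multiply two field elements
    if u == 0 or v == 0:
        return 0
    return antilog[(log[u] + log[v]) % 7]

# p(x) = (x + a)(x + a^2)(x + a^4); coefficients are field elements,
# poly[i] is the coefficient of x^i.
poly = [1]                   # the constant polynomial 1
for conj in (antilog[1], antilog[2], antilog[4]):
    # multiply poly by (x + conj)
    new = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        new[i + 1] ^= c              # the x * poly part (addition is XOR)
        new[i] ^= mul(c, conj)       # the conj * poly part
    poly = new

print(poly)   # [1, 1, 0, 1]  ->  1 + x + x^3, i.e. x^3 + x + 1
```

All intermediate coefficients live in GF(8), yet the final ones land in GF(2), exactly as the theorem requires of a minimal polynomial over the ground field.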

The minimal polynomials are of significant importance for the complete factorisation of
polynomials of the form x^n – 1, which is essential in describing certain classes of error detecting /
correcting codes and is discussed in the following section.

6.2.3 Factorisation of x^n – 1
The order of any element a in the field GF(q^m) divides (q^m – 1), according to Theorem 6-1.
By the definition of order (Definition 6-1), it follows that the non-zero elements in GF(q^m) are
roots of the equation x^(q^m – 1) – 1 = 0 or, equivalently, they are (q^m – 1)-st roots of unity. Since
the polynomial x^(q^m – 1) – 1 is of degree (q^m – 1), it may be shown that it has exactly (q^m – 1)
distinct roots. Consequently, the (q^m – 1) non-zero elements of GF(q^m) form the complete set
of roots of x^(q^m – 1) – 1 = 0. Since every non-zero element in GF(q^m) is a root of x^(q^m – 1) – 1 = 0,
the minimal polynomials of all the non-zero elements in GF(q^m) provide the complete
factorisation of x^(q^m – 1) – 1. Further, those factors are irreducible polynomials in the ring
GF(q)[x].

The reasoning developed above for factoring the expression x^(q^m – 1) – 1 can be extended to the
more general form of polynomials x^n – 1. Assume that there exists an element ß with order n in
some field GF(q^m). It follows that ß and all powers of ß are roots of x^n – 1 = 0. In addition,


the elements 1, ß, ß^2, …, ß^(n–1) are distinct (repetitions begin from ß^n, and the first element to
repeat is 1). Thus, the n roots of x^n – 1 are generated by computing n consecutive
powers of ß. For this reason, elements of order n like ß are called primitive n-th roots of
unity.

If n is a divisor of (q^m – 1), then there are exactly φ(n) elements with order n in GF(q^m),
where φ denotes Euler's totient function. Therefore, the existence of a positive integer m such
that n | (q^m – 1) implies the existence of a primitive n-th root of unity ß in an extension field
GF(q^m) of GF(q). Moreover, if m is chosen to be the smallest positive integer such that
n | (q^m – 1), ß can be found in the smallest extension field GF(q^m) of GF(q). Once the desired
primitive n-th root of unity has been found, forming the conjugacy class of that root and
computing the associated minimal polynomials, of the root and its conjugates, completes the
factorisation of x^n – 1. A discussion about the existence of such an element ß and where it can
be found is included in Appendix B.

In general, factoring x^n – 1 is quite a tedious task. Due to its application to the construction of
cyclic codes, which will be described in Chapter 7, computer algebra systems like MAPLE
have been developed that are capable of factoring polynomials of reasonable degree over some
prime order fields.

6.2.4 Ideals
Rings of polynomials are of great significance in the study of algebraic codes since they are
used for defining powerful classes of error control codes. It has already been seen that the
collection of all polynomials of arbitrary degree with coefficients in the finite field GF(q)
forms a commutative ring with identity. If the ring of polynomials GF(q)[x] is reduced
modulo (x^n – 1), the resulting structure is again a ring, this time containing all polynomials of
degree less than n with coefficients in GF(q), denoted by GF(q)[x]/(x^n – 1). This ring and, in
particular, the notion of ideals in GF(q)[x]/(x^n – 1), is rather useful in defining linear cyclic
codes, as will be described in section 7.2.

Definition 6-7: Let R be a ring. A non-empty subset I ⊂ R is said to be an ideal if it satisfies
the following:
1. I forms a subgroup of the additive group of R
2. If a ∈ I, then a·r ∈ I, for all r ∈ R

It can be seen that {0} and R are the trivial ideals in any ring R.


Definition 6-8: An ideal I contained in a ring R is said to be principal if there exists g ∈ I
such that every element c ∈ I can be expressed as the product m·g, for some
m ∈ R.

As dictated by the above definition, every element of a principal ideal can be represented as
a product involving one specific element of the ideal. This element g, used to represent all elements of
the principal ideal, is called the generator element. The ideal thus generated is denoted by <g>.
Basic properties of ideals in the ring GF(q)[x]/(x^n – 1) are presented in the following theorem,
Wicker (1995, p. 64).

Theorem 6-10: Let I be an ideal in GF(q)[x]/(x^n – 1). The following is true:
1. There exists a unique monic polynomial g(x) ∈ I of minimal degree
2. I is principal with generator g(x)
3. g(x) divides x^n – 1 in GF(q)[x]

For example, the ring GF(2)[x]/(x^7 – 1) contains the ideals <x + 1>, <x^3 + x + 1>, <x^3 + x^2 + 1>,
<(x + 1)(x^3 + x + 1)>, <(x + 1)(x^3 + x^2 + 1)>, <(x^3 + x + 1)(x^3 + x^2 + 1)> and the two trivial ideals
{0} and R.

The above theorem, in combination with the reasoning developed in factorising x^n – 1, leads to
the characterisation of all the ideals in GF(q)[x]/(x^n – 1), which is central to the construction of
the class of cyclic codes, as described in the following Chapter.

53
Cyclic Codes

7. CYCLIC CODES

Cyclic codes are an important class of linear block codes. A substantial body of theory has
been developed for cyclic codes, which enhances their practical applications. Their
considerable algebraic structure enables multiple error correction and can provide protection
against error bursts. Many important codes, such as the Golay, Hamming and BCH codes, can be
represented as cyclic codes. A cyclic version of the [7,4] Hamming code is included in
Appendix C. The underlying mathematical structure of cyclic codes allows for the design of
various encoding / decoding methods which are implemented by means of shift registers.

7.1 Polynomial Representation

The right (left) cyclic shift of an n-tuple v = (v_0 v_1 … v_(n–1)) is the n-tuple v′ = (v_(n–1) v_0 v_1 … v_(n–2))
obtained by shifting each component one position to the right (left), wrapping the last
(first) component around to the first (last) position.

Definition 7-1: An (n,k) linear code C is called a cyclic code if every cyclic shift of a
codeword in C is also a codeword in C.

When considering cyclic codes, it is useful to associate each codeword c = (c_0, c_1, …, c_(n–1)) of
length n with a polynomial c(x) of degree at most (n – 1), which has the coordinates of the
code vector as its coefficients. Thus,

c(x) = c_0 + c_1x + c_2x^2 + … + c_(n–1)x^(n–1) ↔ c = (c_0, c_1, c_2, …, c_(n–1))

The polynomial c(x) is often referred to as the code polynomial.

The above polynomial representation of a codeword, and consequently of a code C, is central
to the analysis of the underlying structure of cyclic codes, since codewords can now be treated
as polynomials. This allows us to exploit algebraic properties of polynomials over Galois
fields, described in Chapter 6, in defining families of error detecting / correcting codes. If C
is a q-ary (n,k) code, then the collection of codewords in C forms a vector subspace of
dimension k within the space of all n-tuples over GF(q). It follows that the code polynomials
associated with the codewords of C also form a vector subspace, this time within
GF(q)[x]/(x^n – 1). This structure is the ring of polynomials with coefficients in GF(q) where
the operations are performed modulo (x^n – 1), as mentioned in section 6.2.4.


7.2 Cyclic Codes as Ideals

The notion of cyclic shift is now applied to polynomials in GF(q)[x]/(x^n – 1). The k cyclic
shifts of a codeword are equivalent to the multiplication modulo (x^n – 1) of the corresponding
code polynomial by x^k. This is justified as follows. If c′ is the cyclic shift of c ∈ C, then its
corresponding polynomial is c′(x) = x·c(x) mod (x^n – 1) ∈ C. Indeed,

x·c(x) mod (x^n – 1) = (c_0x + c_1x^2 + … + c_(n–1)x^n) mod (x^n – 1)
                     = c_(n–1) + c_0x + … + c_(n–2)x^(n–1)
                     = c′(x)
Hence,
2 right cyclic shifts ⇒ x^2·c(x) mod (x^n – 1)
k right cyclic shifts ⇒ x^k·c(x) mod (x^n – 1)
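The equivalence of a cyclic shift and multiplication by x modulo (x^n – 1) is easy to confirm numerically. Below, a length-7 binary word is shifted once, and the same word, viewed as a polynomial (bit i = coefficient c_i), is multiplied by x and reduced modulo x^7 – 1 (over GF(2), x^7 – 1 = x^7 + 1); the helper name is our own:

```python
# A right cyclic shift of a word equals x * c(x) mod (x^n - 1) over GF(2).
# Words are tuples (c0, ..., c6); polynomials are integers with bit i = c_i.

def shift_poly(c, n=7):
    """Right cyclic shift computed as x * c(x) mod (x^n - 1)."""
    poly = sum(bit << i for i, bit in enumerate(c))
    poly <<= 1                      # multiply by x
    if poly & (1 << n):             # an x^n term appeared: substitute x^n = 1
        poly ^= (1 << n) | 1
    return tuple((poly >> i) & 1 for i in range(n))

c = (1, 1, 0, 1, 0, 0, 0)
print(shift_poly(c) == (c[-1],) + c[:-1])   # True
```

The reduction branch is exercised exactly when c_(n–1) = 1, i.e. when the shift has to wrap a coefficient around to position 0.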

The product a(x)·c(x), where c(x) is a code polynomial and a(x) an arbitrary polynomial in
GF(q)[x]/(x^n – 1), is a linear combination of cyclic shifts of c(x). By Definition 7-1, and bearing the
polynomial representation in mind, a(x)·c(x) must also be a code polynomial. Thus, code C,
which forms a vector space within the ring of polynomials GF(q)[x]/(x^n – 1), is an ideal (recall
Definition 6-7); this is formally put in the following theorem, Poli and Huguet (1992,
p. 188).

Theorem 7-1: An (n,k) linear code C is cyclic if and only if C is an ideal of
R_n[x] = GF(q)[x]/(x^n – 1).

According to Theorem 6-10, every cyclic code is a principal ideal of the ring GF(q)[x]/(x^n – 1).
This implies that a cyclic code C consists of the multiples of a polynomial g(x) ∈ C, which is
unique, monic and of lowest degree among all code polynomials. This polynomial is called
the generator polynomial of the q-ary (n,k) cyclic code. Further, g(x) is a divisor of x^n – 1, for
otherwise the greatest common divisor (GCD) of x^n – 1 and g(x) would be a polynomial in C
of lower degree than g(x).

The above results combined with Theorem 6-10 are summarised in the following theorem,
Wicker (1995, p. 101), which presents the basic properties of cyclic codes.

Theorem 7-2: Let C be a q-ary (n,k) linear cyclic code.
1. Within the set of code polynomials in C there is a unique monic polynomial
g(x) with minimal degree r < n. g(x) is called the generator polynomial of C


2. Every code polynomial c(x) in C can be expressed uniquely as
c(x) = m(x)·g(x), where g(x) is the generator polynomial of C and m(x) is a
polynomial of degree less than (n – r) in GF(q)[x]/(x^n – 1)
3. The generator polynomial g(x) of C is a factor of x^n – 1 in GF(q)[x]

The requirement that the generator polynomial must be a divisor of x^n – 1 limits the selection
of g(x). Based on the factorisation of x^n – 1 into irreducible polynomials in GF(q)[x], it is
possible to list all q-ary cyclic codes of length n. Let x^n – 1 be factorised into irreducible
factors, as x^n – 1 = f_1(x)·f_2(x)·…·f_t(x). By choosing, in all possible ways, one of the 2^t factors of
x^n – 1 as the generator polynomial g(x) and defining the corresponding code to be the set of
multiples of g(x) modulo (x^n – 1), all cyclic codes of length n can be determined.
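For n = 7 and q = 2 the enumeration just described is small enough to carry out in a few lines: x^7 – 1 = (x + 1)(x^3 + x + 1)(x^3 + x^2 + 1) over GF(2), so there are 2^3 = 8 generator polynomials and hence 8 binary cyclic codes of length 7, including the two trivial ones. A sketch:

```python
from itertools import combinations

# Irreducible factors of x^7 - 1 over GF(2), as bit-vector integers
# (bit i = coefficient of x^i): x+1, x^3+x+1, x^3+x^2+1.
factors = [0b11, 0b1011, 0b1101]

def poly_mul(a, b):
    """Carry-less (GF(2)) polynomial multiplication."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

generators = []
for size in range(len(factors) + 1):
    for subset in combinations(factors, size):
        g = 1
        for f in subset:
            g = poly_mul(g, f)
        generators.append(g)

print(len(generators))   # 8
```

Here g = 1 generates the whole space GF(2)^7, while g = x^7 – 1 (the product of all three factors) reduces to 0 modulo x^7 – 1 and generates the zero code; the remaining six generators correspond to the six non-trivial ideals listed in section 6.2.4.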

7.3 Parity-Check Polynomial – Generator Matrix for Cyclic Codes

Let g(x) be the generator polynomial of a cyclic code C of length n. If g(x) is of degree
(n – k), g(x) = g_0 + g_1x + … + g_(n–k)x^(n–k), C has dimension k. Since all code polynomials are
multiples of the generator polynomial, an information sequence i = (a_0, a_1, …, a_(k–1)) with
corresponding information polynomial i(x) = a_0 + a_1x + … + a_(k–1)x^(k–1) is encoded by
multiplying i(x) by g(x). Hence, the encoded word associated with the code polynomial

c(x) = c_0 + c_1x + … + c_(n–1)x^(n–1)

will be produced as

c(x) = i(x)·g(x)
     = (a_0 + a_1x + … + a_(k–1)x^(k–1))·g(x)
     = a_0g(x) + a_1xg(x) + … + a_(k–1)x^(k–1)g(x)

and in matrix multiplication

                             | g(x)        |
                             | xg(x)       |
c(x) = [a_0 a_1 … a_(k–1)] · |   .         |
                             |   .         |
                             | x^(k–1)g(x) |

This provides a general form for the generator matrix G for cyclic codes


    | g_0  g_1  g_2  .  .  .  g_(n–k)  0    0    .  .  .  0       |
    | 0    g_0  g_1  g_2  .  .  .  g_(n–k)  0    .  .  .  0       |
G = | .                     .                              .      |
    | .                     .                              .      |
    | 0    .  .  .  0    g_0  g_1  g_2  .  .  .  g_(n–k)          |

Note that the k rows of the generator matrix G are the codewords g(x), xg(x), …, x^(k–1)g(x),
which form a basis for C.
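For the cyclic [7,4] Hamming code with g(x) = 1 + x + x^3 (the cyclic version mentioned in Appendix C), the k = 4 rows of G are the coefficient vectors of g(x), xg(x), x^2g(x), x^3g(x). A quick construction, assuming this particular g(x):

```python
# Generator matrix of the (7,4) cyclic code with g(x) = 1 + x + x^3.
n, k = 7, 4
g = [1, 1, 0, 1]                  # coefficients g0, g1, g2, g3 of g(x)

# Row i holds the coefficients of x^i * g(x), padded to length n.
G = [[0] * i + g + [0] * (n - len(g) - i) for i in range(k)]

for row in G:
    print(row)
# [1, 1, 0, 1, 0, 0, 0]
# [0, 1, 1, 0, 1, 0, 0]
# [0, 0, 1, 1, 0, 1, 0]
# [0, 0, 0, 1, 1, 0, 1]
```

Each row is the previous one shifted right by one position, exactly the banded structure displayed above.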

Since the generator polynomial is a divisor of x^n – 1, by Theorem 7-2, there is a polynomial of
degree k, h(x) = h_0 + h_1x + … + h_kx^k, such that g(x)·h(x) = x^n – 1. This implies that in the ring
GF(q)[x]/(x^n – 1), g(x)·h(x) = 0 mod (x^n – 1). It follows that c(x) is a code polynomial if and
only if c(x)·h(x) = 0 modulo (x^n – 1), which is an equivalent statement of c(x) being a code
polynomial if and only if c(x) is a multiple of g(x), as mentioned earlier.

Let c(x) be a code polynomial in C. Then, c(x) = m(x)·g(x). If c(x) is multiplied by h(x),

c(x)·h(x) = m(x)·g(x)·h(x)
          = m(x)·(x^n – 1)
          = x^n·m(x) – m(x)                                    (7.3.1)

The degree of m(x) is at most (k – 1) and thus the powers x^k, x^(k+1), …, x^(n–1) do not appear in
x^n·m(x) – m(x). It follows that on the left-hand side of equation (7.3.1), the coefficients of the
powers x^k, x^(k+1), …, x^(n–1) must be equal to zero, providing the (n – k) parity-check equations

∑_(i=0)^(k) h_i·c_(n–i–j) = 0    for 1 ≤ j ≤ n – k            (7.3.2)

By taking the reciprocal8 of h(x),

x^k·h(x^(–1)) = h_k + h_(k–1)x + … + h_0x^k

8 Let f(x) = a_0 + a_1x + … + a_nx^n be an n-th degree polynomial. The reciprocal f*(x) is the polynomial
f*(x) = x^n·f(x^(–1)) = a_n + a_(n–1)x + … + a_0x^n.


which can be shown to also be a factor of x^n – 1, an (n, n – k) cyclic code is generated with the
following (n – k) × n matrix as a generator matrix

    | h_k  h_(k–1)  .  .  .  h_0  0    0   .  .  .  0   |
    | 0    h_k      .  .  .  h_1  h_0  0   .  .  .  0   |
H = | .                      .                      .   |
    | 0    0   .  .  .  h_k  h_(k–1)   .  .  .  h_0     |

The construction of H, based on the (n – k) parity-check equations (7.3.2), implies that any
codeword v in C is orthogonal to the (n – k) rows of H. It follows that the rows of H are
vectors in the dual space C⊥ of C. Since h(x) is monic, the (n – k) rows of H are linearly
independent and, according to Theorem 3-4 and the reasoning developed in section 4.3.2, they
span C⊥. Thus, H is a parity-check matrix for C. The polynomial h(x), used to obtain H, is
called the parity-check polynomial of the cyclic code C, Pless (1998, p. 74).

Theorem 7-3: If g(x)·h(x) = x^n – 1 in GF(q)[x] and g(x) is the generator polynomial of
a code C, then the reciprocal polynomial of h(x) is the generator polynomial of
C⊥. Furthermore, if h(x) = h_0 + h_1x + … + h_kx^k, then the following is a
parity-check matrix H of C

    | h_k  h_(k–1)  .  .  .  h_0  0    0   .  .  .  0   |
    | 0    h_k      .  .  .  h_1  h_0  0   .  .  .  0   |
H = | .                      .                      .   |
    | 0    0   .  .  .  h_k  h_(k–1)   .  .  .  h_0     |

Hence, C⊥ is also cyclic.

Indeed, the parity-check matrix and the generator matrix for a cyclic code C share the same
structure. The reciprocal polynomial of h(x) constructs a generator matrix for C⊥.
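Continuing the [7,4] example with g(x) = 1 + x + x^3, the parity-check polynomial is h(x) = (x^7 – 1)/g(x) = 1 + x + x^2 + x^4, and the rows of H built from the reversed coefficients of h(x) are orthogonal to every row of G. A sketch verifying this over GF(2):

```python
# Parity-check matrix of the (7,4) cyclic code from its parity-check
# polynomial h(x) = (x^7 - 1)/g(x) = 1 + x + x^2 + x^4.
n, k = 7, 4
g = [1, 1, 0, 1]                       # g(x) = 1 + x + x^3
h = [1, 1, 1, 0, 1]                    # h(x) = 1 + x + x^2 + x^4

# Rows of H are shifts of the reversed coefficient vector (h_k ... h_0).
rev = h[::-1]
H = [[0] * i + rev + [0] * (n - len(rev) - i) for i in range(n - k)]
G = [[0] * i + g + [0] * (n - len(g) - i) for i in range(k)]

# Every row of G is orthogonal to every row of H over GF(2).
ok = all(sum(a * b for a, b in zip(grow, hrow)) % 2 == 0
         for grow in G for hrow in H)
print(ok)   # True
```

One can check by hand that g(x)·h(x) = x^7 + 1 over GF(2), which is what guarantees the orthogonality above.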


7.4 Systematic Encoding for Cyclic Codes

Consider an (n,k) cyclic code C with generator polynomial g(x). With regard to the encoding
process, substantial design benefit can be obtained by having the information bits
occupy the last k positions of the encoded word.

Given an information sequence i = (a_0, a_1, …, a_(k–1)), by multiplying its corresponding
information polynomial i(x) = a_0 + a_1x + … + a_(k–1)x^(k–1) by x^(n–k), the resulting polynomial is

x^(n–k)·i(x) = a_0x^(n–k) + a_1x^(n–k+1) + … + a_(k–1)x^(n–1)

which is associated with the sequence

(0, 0, …, 0, a_0, a_1, …, a_(k–1))

whose first (n – k) positions are zero.

Now, if x^(n–k)·i(x) is divided by g(x) (recall Euclid's algorithm),

x^(n–k)·i(x) = q(x)·g(x) + d(x), where deg(d(x)) < deg(g(x))

Hence,
q(x)·g(x) = x^(n–k)·i(x) – d(x)                                (7.4.1)

Since the product q(x)·g(x) = c(x) is a multiple of g(x), it is a valid code polynomial and so is
x^(n–k)·i(x) – d(x). The remainder d(x) is of degree less than (n – k) and thus can be associated
with the sequence

(–d_0, –d_1, …, –d_(n–k–1), 0, 0, …, 0)

whose last k positions are zero. Thus, using expression (7.4.1), the codeword c(x) = q(x)·g(x)
can be written as

c(x) = q(x)·g(x) = x^(n–k)·i(x) – d(x) ↔ c = (–d_0, –d_1, …, –d_(n–k–1), a_0, a_1, …, a_(k–1))

In this way, the information polynomial i(x) has been systematically encoded. That is, the
information sequence associated with i(x) has been mapped to a codeword in which the
information bits occupy the last k positions.


The process developed above can be described in three steps.

Systematic encoding algorithm

Step 1: Multiply the information polynomial i(x) by x^(n–k)

Step 2: Divide the result by g(x)
        Let d(x) be the remainder

Step 3: Set c(x) = x^(n–k)·i(x) – d(x) and output c(x)
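Over GF(2) the three steps amount to the familiar CRC-style computation: shift the message up by n – k positions, take the remainder modulo g(x), and attach it as the parity part (subtraction is the same as addition in GF(2)). A sketch for the [7,4] code with g(x) = 1 + x + x^3:

```python
# Systematic encoding for the binary (7,4) cyclic code, g(x) = 1 + x + x^3.
# Polynomials are integers: bit i = coefficient of x^i.

def systematic_encode(info, g=0b1011, n=7, k=4):
    """Encode k info bits so they occupy the top k positions of the codeword."""
    shifted = info << (n - k)          # Step 1: multiply i(x) by x^(n-k)
    rem = shifted                      # Step 2: remainder of division by g(x)
    for i in range(n - 1, n - k - 1, -1):
        if rem & (1 << i):
            rem ^= g << (i - 3)        # deg(g) = 3 for this example
    return shifted ^ rem               # Step 3: c(x) = x^(n-k) i(x) - d(x)

c = systematic_encode(0b1001)          # info polynomial 1 + x^3
print(bin(c))        # 0b1001110
print(c >> 3)        # 9 = 0b1001: the info bits occupy the last k positions
```

Because the parity bits are just the division remainder, this is exactly the operation a shift-register encoder performs, one coefficient per clock tick.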

60
BCH Codes – Reed-Solomon Codes

8. BCH CODES – REED-SOLOMON CODES

In constructing cyclic codes there is no guarantee on the minimum distance of the resulting
code. Given an arbitrary generator polynomial g(x), a computer search of all non-zero
codewords needs to be conducted to determine the minimum weight and thus the minimum
distance. Placing a constraint on the generator polynomial in order to ensure the minimum
distance of the resulting code is the main concept behind another powerful class of codes, the
so-called BCH codes, named in honour of their discoverers Bose, Chaudhuri and Hocquenghem. In
addition to these codes, the Reed-Solomon codes, which address the issue of minimising the
added redundancy, are presented in the following discussion.

8.1 BCH Codes

Consider a q-ary (n,k) cyclic code C with generator polynomial g(x) ∈ GF(q)[x], which shall be
specified in terms of its roots from the Galois field GF(q^m). Let ß be a primitive n-th root of
unity, which can be found in the smallest extension field GF(q^m) of GF(q), according to the
reasoning developed in section 6.2.3. The generator polynomial is selected to be the minimal
degree polynomial in GF(q)[x] that has (d – 1) consecutive powers of ß as roots, where the
selection of the positive integer d determines the minimum distance of the resulting code, as
will be described in Theorem 8-1. It follows from Theorem 6-8 that the conjugates of those
powers of ß are also roots of g(x). Since all (d – 1) consecutive powers of ß are roots of the
generator polynomial, g(x) can be taken as the least common multiple (l.c.m.) of the
minimal polynomials of these powers of ß. A generator polynomial thus selected constructs
a BCH code, which can be defined as follows, van Lint (1999, p. 91).

Definition 8-1: A cyclic code of length n over GF(q) is called a BCH code of designated
distance d if its generator polynomial g(x) is the least common multiple of the
minimal polynomials of ß^l, ß^(l+1), …, ß^(l+d–2) for some (non-negative integer) l,
where ß is a primitive n-th root of unity.

The codes defined for l = 1 are called narrow-sense BCH codes, and for n = q^m – 1 the
resulting codes are called primitive BCH codes, since the n-th root of unity ß is then a primitive
element of GF(q^m).

A BCH code of designated distance d has minimum distance that equals or exceeds d, as
stated in the following theorem, Pless (1998, p. 112), which reveals the importance of this
class of codes.


Theorem 8-1: (BCH Bound) The minimum weight of a BCH code of designated distance d is
at least d.

Now, a t-error-correcting BCH code of length n can be constructed for any positive integers m
and t that satisfy the requirement imposed by the following theorem, Poli and Huguet (1992,
p. 200).

Theorem 8-2: For every integer of the form q^m – 1, m ≥ 3 (q a power of a prime), there exists
a BCH (n,k) code C which is t-error-correcting, such that k ≥ n – 2tm if q > 2
(k ≥ n – tm if q = 2), whose generator polynomial is

g(x) = l.c.m.[m_1(x), m_2(x), …, m_(2t)(x)]

where m_i(x) is the minimal polynomial of a^i, a being a primitive element of
GF(q^m).

The construction process of BCH codes can be described in three steps as follows.

Step 1: Find a primitive n-th root of unity ß in the smallest extension field GF(q^m) of GF(q)

Step 2: Select (d – 1) consecutive powers of ß

Step 3: Let g(x) be the l.c.m. of the minimal polynomials of these (d – 1) consecutive powers
of ß

It can be seen that we follow the general construction process for cyclic codes, but by placing
the constraint dictated by Steps 2 and 3 on the generator polynomial, we ensure that the
resulting code has minimum distance at least equal to d.
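As an illustration of the three steps, the sketch below builds the binary 2-error-correcting BCH code of length n = 15. GF(16) is constructed from the primitive polynomial x^4 + x + 1 (an assumption of this particular example), the conjugacy classes of ß, ß^2, ß^3, ß^4 (d = 5) are formed, and g(x) comes out as the product of the two distinct minimal polynomials, x^8 + x^7 + x^6 + x^4 + 1, so k = 15 – 8 = 7:

```python
# Generator polynomial of the binary 2-error-correcting BCH code of
# length 15 (d = 5): g(x) = l.c.m. of the minimal polynomials of
# b, b^2, b^3, b^4, where b is a root of x^4 + x + 1 in GF(16).

antilog = [1]                        # antilog[i] = polynomial form of b^i
for _ in range(14):
    e = antilog[-1] << 1
    if e & 0b10000:
        e ^= 0b10011                 # reduce with b^4 = b + 1
    antilog.append(e)
log = {e: i for i, e in enumerate(antilog)}

def mul(u, v):
    return 0 if 0 in (u, v) else antilog[(log[u] + log[v]) % 15]

def minimal_poly(j):
    """Minimal polynomial of b^j: product of (x + conjugate) over the
    conjugacy class {b^j, b^(2j), b^(4j), ...}."""
    cls, e = [], j % 15
    while e not in cls:
        cls.append(e)
        e = (2 * e) % 15             # conjugation: raise to the power q = 2
    poly = [1]                       # coefficients in GF(16), poly[i] ~ x^i
    for e in cls:
        new = [0] * (len(poly) + 1)
        for i, c in enumerate(poly):
            new[i + 1] ^= c
            new[i] ^= mul(c, antilog[e])
        poly = new
    return poly                      # coefficients land in GF(2): 0 or 1

m1, m3 = minimal_poly(1), minimal_poly(3)
assert minimal_poly(2) == m1 and minimal_poly(4) == m1   # conjugates of b

# g(x) = m1(x) * m3(x) over GF(2), since the two minimal polys are distinct
g = [0] * (len(m1) + len(m3) - 1)
for i, x in enumerate(m1):
    for j, y in enumerate(m3):
        g[i + j] ^= x & y

print(g)   # [1, 0, 0, 0, 1, 0, 1, 1, 1]  ->  1 + x^4 + x^6 + x^7 + x^8
```

Because ß^2 and ß^4 are conjugates of ß, only two distinct minimal polynomials survive in the l.c.m., which is why deg(g) = 8 rather than the worst-case 2tm = 16 allowed by Theorem 8-2.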

The importance of the class of BCH codes stems from the fact that a BCH code with desired
minimum distance d can be constructed, for a specific value of d that we select in the
construction process. Yet, the choice of d is not totally arbitrary since there are
implementation considerations.

8.1.1 Parity-Check Matrix for BCH Codes

Let v(x) = v_0 + v_1x + … + v_(n–1)x^(n–1) be a polynomial in GF(q)[x] with coefficients in GF(q) that
has ß^l, ß^(l+1), …, ß^(l+d–2) as roots. By property 3 of Theorem 6-7, v(x) is divisible by the minimal

polynomials of ß^l, ß^(l+1), …, ß^(l+d–2). It follows that v(x) is also divisible by their least common
multiple, which is the generator polynomial g(x) of the t-error-correcting BCH code C.
Hence, v(x) is a code polynomial.

Since the (d – 1) consecutive powers of ß starting from ß^l are roots of v(x), the following
equations must be satisfied. These equations perform error detection, by checking whether a
received word v has the required zeros

v(ß^l) = 0     ⇒ v_0 + v_1ß^l + v_2ß^(2l) + … + v_(n–1)ß^((n–1)l) = 0
v(ß^(l+1)) = 0 ⇒ v_0 + v_1ß^(l+1) + v_2ß^(2(l+1)) + … + v_(n–1)ß^((n–1)(l+1)) = 0
.
.                                                              (8.1.2.1)
.
v(ß^(l+d–2)) = 0 ⇒ v_0 + v_1ß^(l+d–2) + v_2ß^(2(l+d–2)) + … + v_(n–1)ß^((n–1)(l+d–2)) = 0

These equations can be expressed in terms of matrix multiplication. With H the (d – 1) × n matrix

    | 1  ß^l        ß^(2l)        .  .  .  ß^((n–1)l)       |
    | 1  ß^(l+1)    ß^(2(l+1))    .  .  .  ß^((n–1)(l+1))   |
H = | .  .          .                      .                |
    | 1  ß^(l+d–2)  ß^(2(l+d–2))  .  .  .  ß^((n–1)(l+d–2)) |

the equations read

[v_0 v_1 … v_(n–1)]·H^T = 0                                   (8.1.2.2)

hence, v·H^T = 0

It follows from equation (8.1.2.2) that if v = (v_0, v_1, …, v_(n–1)) is a codeword in the t-error-
correcting BCH code C, then v·H^T = 0. Conversely, if v satisfies v·H^T = 0, then it follows
from equations (8.1.2.1) that the (d – 1) consecutive powers of ß, starting from ß^l, are roots of
its corresponding polynomial v(x). This implies that v is a codeword in C. Hence, H is the
parity-check matrix of the BCH code C. Note that it has entries from the extension field
GF(q^m) of GF(q), m minimal.

8.2 Reed-Solomon Codes

In defining BCH codes, the generator polynomial is required to be the least common multiple
of the minimal polynomials of each of the (d – 1) consecutive powers of a primitive n-th root
of unity ß. By Theorem 6-8, the conjugates of these powers are also roots. As a result, BCH codes

achieve a given error correction at the expense of adding more redundancy than actually
needed. Two finite fields are used in the construction of BCH codes: one is GF(q), over
which the code is defined, and the other is GF(q^m), where a primitive n-th root of unity can be
found.

Reed-Solomon codes are an important class of codes for which the two fields coincide. These
codes are also cyclic and can be defined as follows, Poli and Huguet (1992, p. 205).

Definition 8-2: A Reed-Solomon code (RS code) over GF(q^m) is a BCH code of length
n = q^m – 1, dimension k = n – d + 1 and minimum distance d. It is therefore
an ideal in GF(q^m)[x]/(x^(q^m – 1) – 1).

It can be seen that the length of the code is one less than the number of possible code symbols,
and k = n – d + 1 implies that the designated distance d is d = n – k + 1; hence the minimum
distance of an RS code is one greater than the number of parity-check digits.

If a is a primitive element in GF(q^m), then a t-error-correcting Reed-Solomon code can be
constructed for d = 2t + 1, with generator polynomial

    g(x) = (x + a^l)(x + a^(l+1)) … (x + a^(l+d-2))

where l is a positive integer which is the power of the first of the (d – 1) consecutive powers
of a. Different generator polynomials are formed for different values of l.

Clearly, g(x) has the (d – 1) consecutive powers of a as its roots and has coefficients from the
field GF(q^m). The RS code thus generated is an (n, n – 2t) cyclic code which consists of those
polynomials of degree at most (n – 1) with coefficients in GF(q^m) that are multiples of g(x).

The original approach to the construction of RS codes by their discoverers, I.S. Reed and
G. Solomon, is slightly different from the generator polynomial construction, which was
initially developed to describe cyclic codes, as discussed in Chapter 7. A brief illustration of the
original approach to RS codes, based on Reed and Chen (1999, p. 243-4), is given next.

Consider a message block m = (m0 , m1 , …, mk-1 ) whose k symbols are taken from GF(q^m), with
corresponding message (information) polynomial i(x) = m0 + m1x + … + mk-1x^(k-1). This
message block m is encoded by evaluating i(x) at each of the q^m elements of the finite field


GF(q^m). Recall that the non-zero elements of GF(q^m) can be represented as the (q^m – 1)
powers of some primitive element a. Thus, c is obtained as

    c = (c0 , c1 , …, cn-1 ) = [i(0), i(a), i(a^2), …, i(a^(q^m – 1))]

In this way, a complete set of codewords can be determined by allowing the k information
symbols to take all possible values. This set of RS codewords forms a k-dimensional vector
space over GF(q^m). Moreover, the code length is equal to q^m, since each codeword has q^m
coordinates.
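The evaluation construction can be sketched in a few lines. For simplicity the example lives in the prime field GF(7) (so the symbol field and the root field coincide, with m = 1); the primitive element a = 3 and the message are illustrative choices, not the text's own example.

```python
# Sketch of the original Reed-Solomon construction: encode a message block
# by evaluating its information polynomial i(x) at every field element.
p = 7
a = 3                         # a primitive element of GF(7)

def i_eval(msg, x, p):
    # i(x) = m0 + m1*x + ... + m_{k-1}*x^{k-1}
    return sum(m * pow(x, j, p) for j, m in enumerate(msg)) % p

msg = [1, 4, 2]               # k = 3 information symbols from GF(7)
# evaluation points: 0 followed by the powers a, a^2, ..., a^{q-1}
points = [0] + [pow(a, e, p) for e in range(1, p)]
codeword = [i_eval(msg, x, p) for x in points]
print(codeword)               # q = 7 coordinates, one per field element
```

Letting msg range over all 7³ possibilities generates the complete k-dimensional code described above.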

In contrast, the generator polynomial construction yields RS codes of code length q^m – 1.
This approach is nevertheless currently more popular than the original one, mainly because it
is in accordance with the construction method adopted for cyclic and BCH codes.

One of the most significant properties of Reed-Solomon codes is that an (n,k) RS code always
has minimum distance d equal to the designated distance d = n – k + 1, as dictated by
Definition 8-2. The fact that d is one greater than the number of parity-check digits makes
Reed-Solomon codes maximum-distance separable (MDS).

Definition 8-3: An (n,k) code with minimum distance d is said to be maximum-distance
separable (MDS) if d = n – k + 1.

The term was coined to describe codes whose minimum distance is the largest possible for
fixed length n and dimension k. This property of RS codes largely explains why they are
preferred in various practical applications for both random and burst error correction. The
latter is briefly illustrated in section 8.4.

Encoding for Reed-Solomon codes is similar to encoding for cyclic codes, described earlier
(section 7.4). To obtain the 2t parity-check digits, the information polynomial is multiplied
by x^(2t) and then divided by the generator polynomial g(x). The coefficients of the remainder
give the parity-check digits. This process performs systematic encoding for Reed-Solomon
codes, since the parity-check digits occupy the first 2t positions of the corresponding
codeword.
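A minimal sketch of this systematic encoding step follows, again over the prime field GF(7), with t = 1 and generator g(x) = (x − 3)(x − 3²) = 6 + 2x + x²; field, generator and message are illustrative assumptions. Over a non-binary field the parity digits are the negatives of the remainder coefficients, so that the result is divisible by g(x) (over GF(2) the two coincide).

```python
# Sketch: systematic RS encoding -- shift the information polynomial by
# x^{2t}, reduce mod g(x), and prepend the parity digits.
p, t = 7, 1
g = [6, 2, 1]                 # (x - 3)(x - 3^2) over GF(7), lowest degree first

def poly_mod(num, den, p):
    """Remainder of polynomial division mod p (coefficient lists, lowest first)."""
    num = num[:]
    inv_lead = pow(den[-1], p - 2, p)             # inverse of leading coefficient
    for i in range(len(num) - 1, len(den) - 2, -1):
        factor = (num[i] * inv_lead) % p
        for j, d in enumerate(den):
            num[i - len(den) + 1 + j] = (num[i - len(den) + 1 + j] - factor * d) % p
    return num[:len(den) - 1]

msg = [2, 0, 5, 1]                                # k = n - 2t = 4 information symbols
shifted = [0] * (2 * t) + msg                     # x^{2t} * i(x)
rem = poly_mod(shifted, g, p)
parity = [(-c) % p for c in rem]                  # c(x) = x^{2t}*i(x) - rem(x)
codeword = parity + msg                           # parity digits in the first 2t positions
assert poly_mod(codeword, g, p) == [0] * (2 * t)  # divisible by g(x): a codeword
print(codeword)
```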


8.3 Decoding Non-binary BCH and Reed-Solomon Codes


The following demonstration of the decoding of non-binary BCH and Reed-Solomon codes is
based on Gorenstein and Zierler's generalised Peterson decoding algorithm, as presented in
Wicker (1995, p. 214-7) and Lin and Costello (1983, p. 151-5).

In decoding non-binary BCH and Reed-Solomon codes, not only the locations of the errors
but also the corresponding error values must be determined. In the binary case the error
values are equal to 1 and are thus known.

Suppose that the codeword v(x) = v0 + v1x + … + vn-1x^(n-1) is transmitted. Errors occurring
during transmission result in the received polynomial r(x) = r0 + r1x + … + rn-1x^(n-1). Let
e(x) = e0 + e1x + … + en-1x^(n-1) be the error pattern. Then,

r(x) = v(x) + e(x) (8.3.1)

Selecting d – 1 = 2t ensures that the code C can correct t errors, by Theorem 8-1 and Theorem
4-3. The first step is to compute the syndrome from the received polynomial r(x).

For a t-error-correcting BCH or Reed-Solomon code, the syndrome is a 2t-tuple

    S = (S_1, S_2, …, S_2t) = r·H^T    (8.3.2)

where H is the parity-check matrix of (8.1.2.2) (with l = 1, d – 1 = 2t).

The j-th component of the syndrome, by (8.3.2) with H taken from equation (8.1.2.2), is

    S_j = r(a^j) = r0 + r1·a^j + r2·a^(2j) + … + rn-1·a^((n-1)j),   j = 1..2t    (8.3.3)

The syndrome components lie in the field GF(q^m) and can be computed from r(x) as follows.
Dividing r(x) by the minimal polynomial p_j(x) of a^j gives, by the division algorithm,
r(x) = a_j(x)·p_j(x) + b_j(x), where deg(b_j(x)) < deg(p_j(x)). Since p_j(a^j) = 0 we get

    S_j = r(a^j) = b_j(a^j),   j = 1..2t    (8.3.4)

Thus, the syndrome component S_j is obtained by evaluating b_j(x) at a^j.
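When the roots a^j lie in the symbol field itself (as for an RS code), each S_j can also be computed by direct evaluation of r(x). The GF(7) code with a = 3 and t = 1 below is an illustrative assumption, not the text's example.

```python
# Sketch: syndrome components S_j = r(a^j), j = 1..2t, by direct evaluation.
p, a, t = 7, 3, 1

def poly_eval(coeffs, x, p):
    return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

r = [6, 0, 2, 0, 5, 1]     # a received word (here error-free: a multiple of g(x))
S = [poly_eval(r, pow(a, j, p), p) for j in range(1, 2 * t + 1)]
print(S)                   # the all-zero syndrome signals "no error detected"
```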

Since a, a^2, …, a^(2t) are roots of every code polynomial, it follows that v(a^j) = 0, j = 1..2t.


Consequently, by equations (8.3.1) and (8.3.3),

    S_j = e(a^j) = Σ_(k=0..n-1) e_k·(a^j)^k    (8.3.5)

Assume that the received vector r has ν errors, in positions i_1, i_2, …, i_ν with
0 ≤ i_1 < i_2 < … < i_ν < n, and define the error locators X_l = a^(i_l), l = 1..ν. The
syndrome components can then be reexpressed in terms of the error locators:

    S_j = e(a^j) = Σ_(k=0..n-1) e_k·(a^j)^k = Σ_(l=1..ν) e_(i_l)·X_l^j    (8.3.6)

Expression (8.3.6) defines a series of 2t algebraic equations in 2ν unknowns.

    S_1  = e_(i_1)·X_1 + e_(i_2)·X_2 + … + e_(i_ν)·X_ν
    S_2  = e_(i_1)·X_1^2 + e_(i_2)·X_2^2 + … + e_(i_ν)·X_ν^2
      ⋮                                                        (8.3.7)
    S_2t = e_(i_1)·X_1^(2t) + e_(i_2)·X_2^(2t) + … + e_(i_ν)·X_ν^(2t)

Equations (8.3.7) are not linear, but can be reduced to a set of linear equations in the unknown
quantities. For this purpose, an error locator polynomial Λ(x) is defined, such that its roots
are the inverses of the error locators {X_l}:

    Λ(x) = Π_(l=1..ν) (1 – X_l·x) = Λ_ν·x^ν + Λ_(ν-1)·x^(ν-1) + … + Λ_1·x + Λ_0    (8.3.8)

It follows that for any error locator X_l,

    Λ(X_l^(-1)) = Λ_ν·X_l^(-ν) + Λ_(ν-1)·X_l^(-ν+1) + … + Λ_1·X_l^(-1) + Λ_0 = 0    (8.3.9)

Equation (8.3.9) can be multiplied through by e_(i_l)·X_l^j, which is a constant:

    e_(i_l)·X_l^j·(Λ_ν·X_l^(-ν) + Λ_(ν-1)·X_l^(-ν+1) + … + Λ_1·X_l^(-1) + Λ_0) = 0

hence,

    e_(i_l)·(Λ_ν·X_l^(j-ν) + Λ_(ν-1)·X_l^(j-ν+1) + … + Λ_1·X_l^(j-1) + Λ_0·X_l^j) = 0    (8.3.10)


If equation (8.3.10) is summed over all indices l,

    Σ_(l=1..ν) e_(i_l)·(Λ_ν·X_l^(j-ν) + Λ_(ν-1)·X_l^(j-ν+1) + … + Λ_1·X_l^(j-1) + Λ_0·X_l^j) = 0

hence,

    Λ_ν·Σ_l e_(i_l)·X_l^(j-ν) + Λ_(ν-1)·Σ_l e_(i_l)·X_l^(j-ν+1) + … + Λ_1·Σ_l e_(i_l)·X_l^(j-1) + Λ_0·Σ_l e_(i_l)·X_l^j = 0

hence, by (8.3.6),

    Λ_ν·S_(j-ν) + Λ_(ν-1)·S_(j-ν+1) + … + Λ_1·S_(j-1) + Λ_0·S_j = 0    (8.3.11)

From equation (8.3.8) defining Λ(x), it follows that Λ_0 is always 1 (set x = 0). Thus,
equation (8.3.11) can be reexpressed as

    Λ_ν·S_(j-ν) + Λ_(ν-1)·S_(j-ν+1) + … + Λ_1·S_(j-1) = –S_j    (8.3.12)

These relations are used for j = ν + 1, …, 2t, so that every subscript indexes one of the
known syndrome components S_1, …, S_2t.

Assuming that ν = t, where t is the error correcting capability of the code, and letting j run
from t + 1 to 2t, the following matrix expression of (8.3.12) can be obtained

    A´·Λ =
    | S_1      S_2      …  S_(t-1)   S_t      |   | Λ_t     |   | –S_(t+1)  |
    | S_2      S_3      …  S_t       S_(t+1)  |   | Λ_(t-1) |   | –S_(t+2)  |
    |  .        .           .         .       | · |   .     | = |    .      |
    | S_(t-1)  S_t      …  S_(2t-3)  S_(2t-2) |   | Λ_2     |   | –S_(2t-1) |
    | S_t      S_(t+1)  …  S_(2t-2)  S_(2t-1) |   | Λ_1     |   | –S_(2t)   |

It can be shown that the matrix A´ is non-singular if ν = t, whereas if fewer than t errors
occurred (ν < t) then A´ is singular. If A´ is singular, the rightmost column and the last row
are removed and the determinant of the resulting (t – 1) × (t – 1) matrix is computed. The
process is repeated until the resulting matrix becomes non-singular. Once this is the case, the
coefficients {Λ_l}, l = 1..ν, of the error locator polynomial are determined using standard
linear algebra techniques, with computations performed in GF(q^m), from which the


entries are taken. Once the coefficients {Λ_l} are known, the ν error locations are obtained
from the roots of Λ(x), whose inverses are the error locators X_l. The equations in (8.3.7)
then form a system of 2t linear equations in the ν unknown error values {e_(i_l)}, l = 1..ν.
Solving this system for the {e_(i_l)} completes the decoding process.
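The whole procedure collapses to a few lines when ν = t = 1: equations (8.3.7) read S_1 = e·X and S_2 = e·X², so X = S_2/S_1 and e = S_1²/S_2. The sketch below decodes a single error in an illustrative RS code over the prime field GF(7) with a = 3; the code, the codeword and the injected error are assumptions of this example, not the text's own.

```python
# Sketch: Peterson-style decoding of one error in an RS code over GF(7).
p, a = 7, 3

def poly_eval(coeffs, x, p):
    return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

v = [6, 0, 2, 0, 5, 1]              # transmitted codeword: v(a) = v(a^2) = 0
r = v[:]
r[3] = (r[3] + 4) % p               # inject error value 4 at position i1 = 3

S1 = poly_eval(r, a, p)             # S_j = r(a^j) = e(a^j)
S2 = poly_eval(r, pow(a, 2, p), p)
X = (S2 * pow(S1, p - 2, p)) % p    # error locator X = a^{i1}
e = (S1 * S1 * pow(S2, p - 2, p)) % p

i1 = next(i for i in range(p - 1) if pow(a, i, p) == X)   # tiny discrete log
r[i1] = (r[i1] - e) % p             # subtract the error value
assert r == v
print("corrected: position", i1, "value", e)
```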

8.4 Burst Error Correction and Reed-Solomon Codes


One of the reasons for the widespread use of Reed-Solomon codes is that they are effective
for correcting multiple error bursts. Disturbances causing transmission errors to cluster in
bursts occur quite often in telecommunication channels and data storage systems such as
magnetic tapes and compact disks.

A set of binary errors clustered in a word is called a burst. The length l of the burst is defined
as the number of bit positions between the first and the last error, inclusive. An (n,k) code
that is capable of correcting all error bursts of length l or less, but not all error bursts of length
(l + 1), is called an l-burst-error-correcting code. It may be shown that an (n,k) code C is
l-burst-error-correcting if no burst of length 2l or less is a codeword of C. The following
theorem, Lin and Costello (1983, p. 258), places an upper bound, known as the Reiger bound,
on the burst error correction capability of an (n,k) code.

Theorem 8-3: The number of parity-check digits of an l-burst-error-correcting code must be
at least 2l, that is, n – k ≥ 2l.

An (n,k) code that achieves equality in the inequality stated in the above theorem is said to be
optimal.

According to the discussion in section 6.1.3, any element ß in GF(2^m) can be represented by a
vector (b0 , b1 , …, bm-1 ) of dimension m, where the {b_i}, i = 0..m – 1, lie in GF(2). This
representation of ß is referred to as an m-bit byte. Consider a t-error-correcting Reed-
Solomon code with code symbols from GF(2^m). If each element of GF(2^m) is represented by
its corresponding m-bit byte, the resulting code is a binary linear code with the following
parameters

    code length           n = m(2^m – 1)
    parity-check digits   n – k = 2mt
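As a quick numerical check of these formulae (the parameters m = 8, t = 2 are an arbitrary illustration):

```python
# Sketch: parameters of the binary image of a t-error-correcting RS code
# over GF(2^m), with each symbol expanded to an m-bit byte.
m, t = 8, 2
n_sym = 2**m - 1              # RS symbol length
n_bits = m * n_sym            # binary code length n = m(2^m - 1)
parity_bits = 2 * m * t       # n - k = 2mt
print(n_bits, parity_bits)    # 2040 32
```

For m = 8 this yields the 2040-bit codewords mentioned for the compact disk system in Chapter 10.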

During the decoding process, the binary received vector is divided into (2^m – 1) m-bit bytes
and each m-bit byte is transformed back to a symbol in GF(2^m). If an error affects t or fewer


of these m-bit bytes, it affects t or fewer symbols in GF(2^m) and can thus be corrected using
the decoding method for RS codes demonstrated in section 8.3.

For example, RS codes of length 255 are used in many applications, since each of the
256 = 2^8 field elements of GF(256) can be represented as a binary 8-tuple (byte), allowing the
code to be implemented using binary electronic devices.

The code is capable of correcting any error burst of length (t – 1)m + 1 or less, since such a
burst cannot affect more than t m-bit bytes. Note that this binary Reed-Solomon code can still
correct any combination of t or fewer random errors: in addition to being effective for burst
error correction, the code remains t-error-correcting.

With regard to Theorem 8-3, a binary RS code achieves (t – 1)m + 1 as the maximum length
of a correctable burst, which is quite close to the optimum tm.
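That a burst of length (t − 1)m + 1 can never touch more than t of the m-bit bytes can be checked exhaustively for small parameters; m = 8, t = 2 below are chosen purely for illustration.

```python
# Sketch: a binary burst of length (t-1)*m + 1 spans at most t m-bit bytes.
m, t = 8, 2
burst_len = (t - 1) * m + 1          # = 9 bits

def bytes_hit(start, length, m):
    """Indices of the m-bit bytes touched by a burst of given start and length."""
    return {b // m for b in range(start, start + length)}

assert all(len(bytes_hit(s, burst_len, m)) <= t for s in range(16 * m))
# one bit longer, and some starting positions already spread over t + 1 bytes
assert any(len(bytes_hit(s, burst_len + 1, m)) > t for s in range(16 * m))
print("every burst of length", burst_len, "touches at most", t, "bytes")
```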


9. PERFORMANCE OF ERROR DETECTING / CORRECTING CODES

The performance of error detecting / correcting codes involves several performance
parameters and, in general, depends heavily on the specific application in which the code is to
be implemented. In this Chapter, performance measures such as the probability of undetected
error, the probability of decoder error, the information rate and the error detecting /
correcting capacity of a code are discussed.

9.1 Error Detection Performance


As shown in section 4.4.2, an (n,k) linear code detects all error patterns of (d – 1) or fewer
errors. In fact, it is also capable of detecting a large fraction of the error patterns with d or
more errors, as illustrated below.

The number of possible non-zero received words is 2^n – 1, and thus there are 2^n – 1
corresponding error patterns; exactly 2^k – 1 of them are identical to the non-zero codewords
of the (n,k) linear code. Since in any linear code the sum of two codewords is a codeword,
whenever one of these 2^k – 1 error patterns occurs the transmitted codeword is altered to
another codeword, resulting in incorrect decoding. Thus, there are exactly 2^k – 1
undetectable error patterns.

Consequently, there are exactly 2^n – 2^k detectable error patterns, namely those not identical
to codewords of the (n,k) code. For large n – k, the number 2^n is much bigger than 2^k,
leaving a relatively small fraction of undetectable error patterns.

An upper bound on the probability of undetected word error Pu(E) can be obtained using the
minimum distance, according to Wicker (1995, p. 240-2). The communication channel is
assumed to be the Binary Symmetric Channel, illustrated in Figure 9–1, which carries two
symbols, 0 and 1, and in which the probability of error, p, is the same for both symbols.
Figure 9–1: The Binary Symmetric Channel. Each input bit is received correctly with
probability 1 – p and inverted with probability p.


Since an (n,k) linear code with minimum distance d is capable of detecting all error patterns
of weight (d – 1) or less, the probability of undetected error is bounded above by the
following expression

    Pu(E) ≤ Σ_(j=d..n) C(n,j)·p^j·(1 – p)^(n-j)

where the binomial coefficient C(n,j) is the number of error patterns of weight j and
p^j·(1 – p)^(n-j) is the probability of occurrence of a particular error pattern of weight j.
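The bound is easy to evaluate numerically; the parameters below (a code of length n = 7 with minimum distance d = 3 on a BSC with p = 0.01) are an illustrative assumption:

```python
# Sketch: numerical evaluation of Pu(E) <= sum_{j=d}^{n} C(n,j) p^j (1-p)^{n-j}.
from math import comb

n, d, p = 7, 3, 0.01
bound = sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(d, n + 1))
print(f"Pu(E) <= {bound:.3e}")      # about 3.4e-05 for these parameters
```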

For the non-binary case, the q^m-ary Uniform Discrete Symmetric Channel (UDSC) is
considered, which carries q^m symbols. The probability that a symbol is correctly received is
s, so a symbol is received in error with probability (1 – s), each of the q^m – 1 incorrect
symbols being equally likely. It can be shown that the probability of undetected error is
bounded by the expression

    Pu(E) ≤ 1 – Σ_(j=0..d-1) C(n,j)·(1 – s)^j·s^(n-j)

9.2 Error Correction Performance


Suppose that an (n,k) block code has minimum distance d and can correct t errors. A received
word v is matched to a codeword u such that the Hamming distance between v and u is at
most t.

A decoder error occurs when a received word is matched to a codeword other than the one
actually transmitted. The probability of decoder error P(E) is bounded above by the
probability of occurrence of error patterns of weight greater than t, since up to t errors are
corrected by the (n,k) code. Hence,

    P(E) ≤ Σ_(j=t+1..n) C(n,j)·p^j·(1 – p)^(n-j)

In addition, if the number A_j of codewords of weight j is known, an exact expression for
P(E) can be obtained as

    P(E) = Σ_(j=d..n) A_j · Σ_(k=0..t) P_k^j

where P_k^j is the probability that a received word is at Hamming distance exactly k from a
weight-j codeword, and is equal to

    P_k^j = Σ_(r=0..k) C(j, k-r)·C(n-j, r)·p^(j-k+2r)·(1 – p)^(n-j+k-2r)

For the non-binary case and given the UDSC channel, it can be shown – through an analysis
based on probability theory, not pursued in this study – that the probability of decoder error
is given by the expression

    P(E) = Σ_(j=d..n) A_j · Σ_(k=0..t) P_k^j

where P_k^j now takes the form

    P_k^j = Σ_(r=0..k) C(j, k-r)·C(n-j, r)·p^(j-k+r)·(1 – p)^(k-r)·s^(n-j-r)·(1 – s)^r

9.3 Information Rate and Error Control Capacity


In analysing the performance of error detecting / correcting codes we are mainly interested in
the error control capacity of the code. This parameter is considered in relation to the length n
of the code or, better, in relation to the information rate R = k/n of the code, where k is the
number of information symbols and n is the code length.

In the discussion of basic binary codes in section 2.1 it was shown that the parity-check code
has a very high information rate, R = k/(k + 1), since it uses a single parity-check digit, but
can only detect an odd number of errors and has no error correction capability (t = 0). The
Triple Repetition code, which can detect a double error and correct a single error (t = 1), has
a low information rate, R = 1/3. The desirable, yet unattainable, performance for an error
control code would be R approaching 1 while t approaches n. This implies that the
performance of a code is associated with achieving a “moderate” pair (R, t), since each
parameter is considered in relation to the other.

For instance, the formulae for the parameters of Hamming codes, presented in section 5.1,
imply that the information rate R approaches 1 quite fast as m grows large. However,
Hamming codes can only ever correct a single error, and the importance of this error control
capacity fades as the code length n increases: compare correcting one error in a code length
of 7 (m = 3) with one error in a length of 63 (m = 6). It is therefore the shorter Hamming
codes that are used in practical applications.

As described in section 8.2, Reed-Solomon codes can be defined over Galois fields of order
q^m, though in the coding literature they are usually discussed over GF(2^m) due to the
simplicity of their implementation using binary electronic devices. Table 9–1, based on Reed
and Chen (1999, p. 246), lists all Reed-Solomon codes defined over the Galois fields GF(2^m)
for m ≤ 4. A column for the minimum distance d of each code has been added. The length of
each code is listed under n, the number of information symbols under k, the number of
correctable errors under t and the information rate under R.

m n k d t R (%)

2 3 1 3 1 33.3%
3 7 5 3 1 71.4%
3 7 3 5 2 42.9%
3 7 1 7 3 14.3%
4 15 13 3 1 86.7%
4 15 11 5 2 73.3%
4 15 9 7 3 60.0%
4 15 7 9 4 46.7%
4 15 5 11 5 33.3%
4 15 3 13 6 20.0%
4 15 1 15 7 6.7%

Table 9–1
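Table 9–1 can be regenerated directly from the parameter formulae n = 2^m − 1, k = n − 2t, d = 2t + 1, R = k/n, which also serves as a check of the entries:

```python
# Sketch: recompute the RS code parameters of Table 9-1 for m = 2, 3, 4.
rows = []
for m in range(2, 5):
    n = 2**m - 1
    for t in range(1, (n - 1) // 2 + 1):
        k, d = n - 2 * t, 2 * t + 1
        rows.append((m, n, k, d, t, round(100 * k / n, 1)))
for row in rows:
    print(row)
# 11 codes in total; e.g. the (15,11) code appears as (4, 15, 11, 5, 2, 73.3)
```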

It can be seen that the number of possible Reed-Solomon codes increases quite fast as m
increases. In order to choose an RS code, the error correcting capacity t must be considered
in relation to the information rate R. For instance, a reasonable choice would be the (15,9) RS
code with t = 3 and R = 0.6, since it corrects 3 errors in a length of 15 and has a moderate
information rate. However, considering the frequency of occurrence of errors in the


transmission channel used by a specific application, the (15,11) RS code could be chosen
instead, though it corrects only 2 errors in a length of 15, as it is more economical, with
information rate R = 0.733. If the channel induces a relatively large number of errors, then
error correction may be given higher priority and the (15,7) RS code could be a good choice,
since it corrects 4 errors (t = 4) in a length of 15. Yet, this code is uneconomical, since its
information rate is only R = 0.467. In any case, Table 9–1 reveals the importance of the class
of Reed-Solomon codes, which offers a wide range of codes likely to be used in a number of
practical applications.

Another parameter to be considered in the performance analysis of error control codes is the
implementation complexity: the amount of hardware and software required for realisation of
the encoding and decoding processes. This parameter also depends on the specific
application, since the hardware and software for applications with a low data rate, such as the
1.41 Mbit/s audio bit-stream of the compact disk, differ substantially from the requirements
of high data rate applications such as High-Definition Television at 19.3 Mbit/s.
Consequently, powerful encoding and decoding algorithms for error detecting / correcting
codes need a feasible implementation in order to be regarded as efficient.


10. ERROR CONTROL STRATEGIES AND APPLICATIONS

The uses of error detecting / correcting codes are continuously expanding, roughly in
proportion to the technological developments in applications concerned with the transmission
of information. The type of error detection and correction coding deployed by a real
communication system depends mainly on the application. Parameters such as the channel
(or storage medium) properties, the amount of transmitted power and digital equipment
limitations primarily determine the error control strategy adopted. As exhibited throughout
the coding theory literature, three preponderant error control schemes are in play: forward
error correction, automatic repeat request and hybrid error control.

10.1 Error Control Strategies


Communication systems often employ two-way channels; that is, information can be sent in
both directions so that the transmitter acts as a receiver and vice versa. Error control for such
systems can be accomplished by use of error detection and retransmission, called automatic
repeat request (ARQ). In an ARQ system, when an error is detected at the receiver, a request
is sent to the transmitter to repeat the message. Examples of ARQ systems include telephone
channels and fax modems.

In other communication systems transmission is strictly in one direction, from transmitter to
receiver. The codes used in such circumstances are primarily designed for error correction.
Error control for such systems is called forward error correction (FEC). For example, data
recorded on a magnetic tape storage system may be read weeks after it is recorded, when
retransmission is impossible.

The major advantage of ARQ over FEC is that it is adaptive, since error processing
(retransmission) is performed only when errors occur. However, error control codes tend to
use long codewords for efficient error detection. As a result, if retransmission is requested
frequently, the receiver may experience delays in receiving the original message. A mixed
error control strategy, called hybrid error control (HEC), addresses the problem by using
error correction for the most frequent error patterns, in combination with error detection and
retransmission for the less frequent ones.

Furthermore, another approach to specifying the error control strategy suggests considering
what types of data need to be strongly protected during transmission. For example, in a


computer system there is greater sensitivity to errors in the machine instructions than in the
user’s data.

10.2 Error Control Applications


Error detecting / correcting codes are implemented in almost any application that includes the
keywords “transmission” and “information”, taken in their broadest sense. For instance,
transmission may refer to the storage of data on a computer hard disk and the retrieval of that
data.

The uses of error control codes are ever expanding and include:

• radio links
• long-distance telephony
• television (High-Definition Television)
• data storage systems
• compact disk (Digital Versatile Disk)
• international data networks
• wireless communications
• deep-space communications (satellites, telescopes, space probes)

Error detecting / correcting codes are widely used for improving the reliability of computer
storage systems. The requirement for such systems, which at first used core memories, was
for a single error correcting / double error detecting (SECDED) code. The first error control
scheme to be implemented on computer memories was the Hamming codes, which have this
error control capacity, as discussed in section 5.1. Especially after core memories were
replaced by semi-conductor memories, which are faster but whose high density per chip
induces more errors, error control codes became an essential design feature in computer
storage systems.

A commonly used protection method is to apply two levels of coding, called concatenation.
The information sequence is encoded with a code C2 (the external code) and transformed
from a sequence of length k to an encoded sequence of length n. This new sequence is
regarded as the information for a second code C1 (the internal code) and is encoded again
accordingly. This process results in a concatenated code which is effective against a mixture
of error bursts and random errors. In general, the external code corrects residual errors
generated by incorrect decoding of the internal code.
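The two-level scheme can be caricatured with a toy example; the triple-repetition outer code and single-parity inner code below are illustrative stand-ins, not the convolutional/RS pairing used in practice.

```python
# Toy sketch of concatenation: the output of the external code C2 becomes
# the input of the internal code C1.
def outer_encode(bits):            # C2: repeat each bit three times
    return [b for bit in bits for b in (bit, bit, bit)]

def inner_encode(bits):            # C1: append one overall parity bit
    return bits + [sum(bits) % 2]

msg = [1, 0, 1]
codeword = inner_encode(outer_encode(msg))
print(codeword)                    # 3*3 + 1 = 10 bits
```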


The most commonly adopted combination is that of a convolutional internal code with a
Reed-Solomon external code. Concatenated convolutional and RS codes find widespread use
in applications concerned with the reliable transmission of data representing photographs or
video such as satellite telecommunications, deep-space telescopes and, recently,
High-Definition Television.

Currently, the most popular class of codes for communication and storage systems seems to
be the class of Reed-Solomon codes. The prevalence of these codes over most of the codes
presented in this study can be chiefly justified by their powerful algebraic decoding algorithm
in combination with their byte symbol structure.

For example, the compact disk error control system employs Reed-Solomon codes, in fact
two RS codes defined over the Galois field GF(2^8), according to Reed and Chen (1999,
p. 300-7). Each symbol from GF(2^8) can be represented by a binary 8-tuple (byte), following
the development of section 8.4. The compact disk digital audio system exploits the byte
symbol structure of RS codes, since these 8-bit symbols from GF(2^8) prove to be suitable,
from an electrical engineering perspective, for the 16-bit samples obtained from the original
analog music source.

The length of an RS code over GF(2^8) is 255, leading to 2040-bit codewords, which would
result in increased complexity and high implementation cost. In order to minimise this cost,
as the compact disk application was aimed at retail sales, a concatenation of two shortened
Reed-Solomon codes is implemented in its error correction system. The external code is a
(28,24) RS code and the internal code a (32,28) RS code, both still using 8-bit symbols from
GF(2^8). Their concatenation, combined with their byte symbol structure, provides protection
against error bursts caused by material imperfections or by the fingerprints and scratches that
may occur when handling the CD.


AFTERWORD

As observed throughout this study of error detecting / correcting codes, the encoding process
is likely to be less extensive than decoding. In fact, the encoder performs a single task:
transforming each message word to a codeword by adding redundancy. In contrast, the
decoder conducts three tasks: detection of errors, error processing in case errors occur, and
extraction of the original message word from the codeword. These processes reveal a
substantial decoding complexity and imply extensive digital equipment requirements. Given
that decoding is primarily based on the encoding rule, the design of error detecting /
correcting codes should focus on optimising the encoding process in such a manner that it
simplifies decoding.

Error detecting / correcting codes cannot eliminate the probability of erroneous transmission.
Yet, they contribute to a significant reduction in the effects of noisy transmission channels.
Practical applications demand codes with a specified error control capability. Such codes are
bound to emerge once they are proved mathematically to satisfy the required error control
capacity.

Towards this direction, advanced algebraic techniques are exploited in the construction of
error control codes. The resulting codes gain in error control capability, but this benefit is
counterbalanced by increased implementation complexity. Clearly, the portion of available
algebraic tools applied should overlap with what is feasible to implement.

Put more simply, rigorous research on advanced applied mathematics should be performed in
parallel with research on engineering developments that will remove severe implementation
obstacles.

This combination can facilitate innovative approaches to error control systems for the
efficient transmission of information.


APPENDIX A

1. Groups
A set of elements G that constitutes a group under a binary operation ‘·’ has only one identity
element.

Proposition 1: The identity element in a group G is unique.

Proof: Assume that there exist two identity elements e and e´ in G.
Then, e´ = e´·e = e implies that e and e´ are identical. Thus, there is only one identity
element in G.

Proposition 2: The inverse of a group element in G is unique.

Proof: Assume that there exist two inverse elements a´ and a´´ for an element a in G.
Then, a´= a´· e = a´·(a · a´´) = (a´· a)· a´´= e · a´´= a´´.
Hence, a´= a´´ implying that the inverse a´ of a group element a is unique.

When the operation imposed on a set of elements for the formation of a group is
multiplication ‘·’, the inverse element of a is often written as a -1 and the unit element is
written as 1. We also write a i for a · … · a (i times).

When the operation is taken to be addition ‘+’ the inverse element of a ∈ G is often written
as –a and the unit element is written as 0.

The set of all permutations of {1, 2, …, n} is a group under the composition of functions. It is
called the symmetric group and its cardinality is equal to n!.
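For a small n this group structure can be verified by brute force (n = 4 below, chosen so that the 576 closure checks stay cheap):

```python
# Sketch: the permutations of {1,...,n} number n! and are closed under
# composition, as the symmetric group requires.
from itertools import permutations
from math import factorial

n = 4
S = set(permutations(range(1, n + 1)))
assert len(S) == factorial(n)          # |S_n| = n!

def compose(f, g):
    # (f o g)(i) = f(g(i)); a permutation is stored as its tuple of values
    return tuple(f[g[i - 1] - 1] for i in range(1, n + 1))

assert all(compose(f, g) in S for f in S for g in S)   # closure
print("S_4:", len(S), "elements, closed under composition")
```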

Let G be a group and H be a non-empty subset of G. The following statements are
equivalent:
1. H is a subgroup of G
2. H, under the operation defined in G, is a group
3. x·y ∈ H and x^(-1) ∈ H, for all x,y ∈ H
4. x·y^(-1) ∈ H, for all x,y ∈ H

A subgroup H of G is said to be a proper subgroup of G if H ⊆ G but H ≠ G.


If G is a commutative (or abelian) group and H a subgroup, then the sets aH = {a·h | h ∈ H}
are called cosets of H. The cosets again form a group, if multiplication of cosets is defined by
(aH)·(bH) = (a·b)H. This group is called the factor group, denoted by G/H.

Lagrange’s Theorem: The order of a subgroup H of G divides the order of G.

Proof: Let a,b ∈ H (a ≠ b) and x ∈ G – H.
Then, a·x ≠ b·x, for otherwise a·x·x^(-1) = b·x·x^(-1) implies that a = b, which contradicts
the initial assumption.
If an element x is in a coset A of H in G, then all elements of A are in a one-to-one
correspondence with the elements a ∈ H, defined by y = a·x.
Thus, each coset of H has the same number of elements as H.
Since the cosets are also disjoint, the result follows.

2. Fields
The following properties can be derived from the definition of a field GF(q):
1. a·0 = 0·a = 0, for any a ∈ GF(q)
2. If a·b = 0, a ≠ 0, then b = 0, a,b ∈ GF(q)
3. For all a,b ∈ GF(q) with a ≠ 0 ≠ b, a·b ≠ 0
4. – (a·b) = (–a)·b = a·(– b) for all a,b ∈ GF(q)
5. If a·b = a·c, a ≠ 0 then b = c

We can regard a field GF(q) as having four operations: addition, subtraction, multiplication
and division by a non-zero element – where subtraction and division are performed using the
inverse elements in GF(q) – with the understanding that a – b = a + (–b) and a / b = a·b^(-1),
(b ≠ 0), for all a,b ∈ GF(q).

The following theorem is useful for performing field arithmetic in fields of characteristic λ.

Theorem A-1: In a field GF(q) of characteristic λ, (x ± y)^λ = x^λ ± y^λ for any x,y variables or
elements in GF(q).

Proof: By expanding (x + y)^λ by the binomial theorem we get

(x + y)^λ = x^λ + (λ choose 1)·x^(λ–1)·y + (λ choose 2)·x^(λ–2)·y² + … + y^λ


Every term except the first and the last is multiplied by a binomial coefficient (λ choose i),
for 1 ≤ i ≤ λ – 1.
Each of these binomial coefficients has λ as a factor when multiplied out.
Hence, in a field of characteristic λ they are all 0 and the result follows.
For (x – y)^λ, replace y by –y in the above expression.
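Theorem A-1 can be checked numerically. A quick sketch (my own illustration, not part of the original proof) exhausts all pairs in the prime field GF(5), whose characteristic is λ = 5:

```python
# Sketch: verify (x + y)^5 = x^5 + y^5 for every pair of elements of GF(5).
p = 5  # characteristic of GF(5)
for x in range(p):
    for y in range(p):
        # both sides are computed modulo p, i.e. inside GF(5)
        assert pow(x + y, p, p) == (pow(x, p, p) + pow(y, p, p)) % p
print("(x + y)^5 = x^5 + y^5 holds for all x, y in GF(5)")
```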

In the discussion about fields in section 6.1.1, it is mentioned that a finite field of order q is
unique up to isomorphism.

Two fields GF(q) and GF(q′) are called isomorphic if there exists a map f from GF(q) to
GF(q′) such that:
i. f is bijective; that is, for each element β ∈ GF(q′) there is exactly one element
α ∈ GF(q) with f(α) = β
ii. for any α,β ∈ GF(q), f(α·β) = f(α)·f(β), and f(α + β) = f(α) + f(β)

The map f is called an isomorphism. If GF(q) = GF(q′), then f is called an automorphism of
GF(q).

An isomorphism amounts to a relabelling of the elements of the field that preserves ‘+’ and ‘·’,
implying that f(0) = 0 and f(1) = 1.

Theorem A-2: If a field of q = pᵐ elements exists, then it is unique up to isomorphism. We
call this field GF(pᵐ).

3. Vector spaces
A measure of the degree of freedom available to the elements of a vector space is an invariant
called the dimension or rank of the vector space. It is described through the concept of linear
independence.

A finite subset S = {v₁, v₂, …, vₖ} of a vector space is said to be linearly independent if a
linear combination of these vectors is 0 only when all the coefficients are 0. The dimension
or rank dim(V) of a vector space V is the largest possible number of elements in a linearly
independent subset of V.

A linearly independent subset B of a vector space V is called a basis of V if for any vector
v ∈ V, v is a linear combination of elements of B.


As for the dual space S⊥ of a vector subspace S of a vector space V over GF(q), the
Dimension Theorem dictates a property satisfied by their dimensions.

The Dimension Theorem: Let S be a finite-dimensional vector subspace of V and let S⊥ be the
corresponding dual space. Then, the dimension of S and the dimension of S⊥ sum to the
dimension of V:

dim(S) + dim(S⊥) = dim(V)
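As a brute-force sketch over GF(2) with n = 4 (the basis vectors are my own example, not from the source), the theorem can be verified by spanning S, collecting every vector orthogonal to it, and comparing dimensions:

```python
# Sketch: Dimension Theorem over GF(2) for V = GF(2)^4.
from itertools import product

n = 4
u, v = (1, 0, 1, 0), (0, 1, 1, 1)    # linearly independent over GF(2)

def dot(a, b):
    # inner product in GF(2)
    return sum(x * y for x, y in zip(a, b)) % 2

# S = span{u, v}: all GF(2) combinations c1*u + c2*v
S = {tuple((c1 * a + c2 * b) % 2 for a, b in zip(u, v))
     for c1, c2 in product((0, 1), repeat=2)}

# S-perp: every vector of V orthogonal to all of S
S_perp = {w for w in product((0, 1), repeat=n)
          if all(dot(w, s) == 0 for s in S)}

dim = lambda space: len(space).bit_length() - 1   # |space| = 2^dim over GF(2)
print(dim(S), dim(S_perp), dim(S) + dim(S_perp))  # 2 2 4
```

Here dim(S) = dim(S⊥) = 2 and the two sum to dim(V) = 4, as the theorem requires.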

4. Matrices
A matrix is an m×n array of entries aᵢⱼ which, for the purposes of coding theory, are assumed
to be elements of a finite field GF(q).

The matrix with entries aᵢⱼ is denoted by A = (aᵢⱼ). A row vector is a 1×n matrix and a
column vector is an m×1 matrix.

The set of vectors of length n, denoted by Fⁿ, comes equipped with standard operations for
adding vectors and multiplying them by elements of GF(q), called scalars:

(x₁, …, xₙ) + (y₁, …, yₙ) = (x₁ + y₁, …, xₙ + yₙ)

and a(x₁, …, xₙ) = (ax₁, …, axₙ)

Coding theory is concerned with choosing subsets of Fⁿ which are closed under the
operations of vector addition and scalar multiplication defined above.

If A = (aᵢⱼ) is an m×n matrix and B = (bⱼₖ) is an n×s matrix, the product A·B is an m×s
matrix C with entries

cᵢₖ = ∑ⱼ₌₁ⁿ aᵢⱼ·bⱼₖ

Note that matrix multiplication between A and B is defined only when the number of columns
of A is equal to the number of rows of B.
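The definition translates directly into code. A small sketch (my own illustration) multiplies two matrices with entries in GF(2), reducing each sum modulo 2:

```python
# Sketch: matrix multiplication over GF(q), here q = 2,
# following c_ik = sum over j of a_ij * b_jk (mod q).
def mat_mul(A, B, q=2):
    n = len(B)                       # columns of A must equal rows of B
    assert all(len(row) == n for row in A), "inner dimensions must agree"
    s = len(B[0])
    return [[sum(A[i][j] * B[j][k] for j in range(n)) % q
             for k in range(s)]
            for i in range(len(A))]

A = [[1, 0, 1],
     [0, 1, 1]]                      # 2x3 over GF(2)
B = [[1, 1],
     [0, 1],
     [1, 0]]                         # 3x2 over GF(2)
print(mat_mul(A, B))                 # 2x2 product: [[0, 1], [1, 1]]
```

Note how the entry c₁₁ = 1·1 + 0·0 + 1·1 = 2 becomes 0 in GF(2), which is where the field arithmetic differs from ordinary matrix multiplication.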

The following properties for matrix multiplication can be derived from its definition.


Let A be an m×n matrix, B and C n×s matrices, and D an s×r matrix, with entries from a finite
field GF(q). Then,
1. A·(B + C) = A·B + A·C
2. A·(a·B) = a·(A·B), a ∈ GF(q)
3. A·(B·D) = (A·B)·D

Note that, in general A·B ≠ B·A.

An important application of matrix multiplication, exploited in encoding / decoding schemes


for error detecting / correcting codes, is in representing linear equations.

A set of m linear equations in n unknowns,

a₁₁x₁ + a₁₂x₂ + … + a₁ₙxₙ = b₁
a₂₁x₁ + a₂₂x₂ + … + a₂ₙxₙ = b₂
⋮
aₘ₁x₁ + aₘ₂x₂ + … + aₘₙxₙ = bₘ

can be expressed as A·x = b, where A = (aᵢⱼ) is the m×n matrix of coefficients, x is the n×1
column vector of unknowns and b is the m×1 column vector of constants:

| a₁₁ a₁₂ … a₁ₙ |   | x₁ |   | b₁ |
| a₂₁ a₂₂ … a₂ₙ |   | x₂ |   | b₂ |
|  ⋮    ⋮     ⋮  | · |  ⋮ | = |  ⋮ |
| aₘ₁ aₘ₂ … aₘₙ |   | xₙ |   | bₘ |


APPENDIX B

The following discussion investigates the existence of an element β of order n in some field
GF(qᵐ), since such an element is central to the process of factoring xⁿ – 1.

If n is a divisor of (qᵐ – 1), then there are exactly φ(n) elements of order n in GF(qᵐ), where
φ(n) is the Euler φ function evaluated at n, according to property 2 of Theorem 6-3. For
positive n, φ(n) is always greater than zero and thus the field GF(qᵐ) contains at least one
element of order n. Consequently, if a positive integer m can be found such that n | (qᵐ – 1),
the existence of a primitive n-th root of unity is guaranteed in an extension field GF(qᵐ) of
GF(q).

The next step is to determine in precisely which extension field of GF(q) the element β of
order n can be found. If m is the order of q modulo n – that is, m is the smallest positive
integer such that n divides (qᵐ – 1) – then GF(qᵐ) is the smallest extension field of GF(q) in
which a primitive n-th root of unity can be found.
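The order of q modulo n is easy to compute directly. A sketch (my own illustration, assuming gcd(q, n) = 1):

```python
# Sketch: the order m of q modulo n, i.e. the smallest positive m
# with n | (q^m - 1). GF(q^m) is then the smallest extension field of
# GF(q) containing a primitive n-th root of unity.
def order_of(q, n):
    m, power = 1, q % n
    while power != 1:
        power = (power * q) % n
        m += 1
    return m

print(order_of(2, 7))    # 3: primitive 7th roots of unity live in GF(2^3)
print(order_of(2, 15))   # 4: they live in GF(2^4)
print(order_of(2, 17))   # 8: they live in GF(2^8)
```

These values match the factorisations listed below: x⁷ – 1 splits into factors of degree at most 3, x¹⁵ – 1 into factors of degree at most 4, and x¹⁷ – 1 has two irreducible factors of degree 8.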

Factorisation of xⁿ – 1 over GF(2)

A short list of factorisations of xⁿ – 1 over the field GF(2):

x³ – 1 = (x + 1)(x² + x + 1)

x⁵ – 1 = (x + 1)(x⁴ + x³ + x² + x + 1)

x⁷ – 1 = (x + 1)(x³ + x + 1)(x³ + x² + 1)

x⁹ – 1 = (x + 1)(x² + x + 1)(x⁶ + x³ + 1)

x¹¹ – 1 = (x + 1)(x¹⁰ + x⁹ + x⁸ + x⁷ + x⁶ + x⁵ + x⁴ + x³ + x² + x + 1)

x¹³ – 1 = (x + 1)(x¹² + x¹¹ + x¹⁰ + x⁹ + x⁸ + x⁷ + x⁶ + x⁵ + x⁴ + x³ + x² + x + 1)

x¹⁵ – 1 = (x + 1)(x² + x + 1)(x⁴ + x + 1)(x⁴ + x³ + 1)(x⁴ + x³ + x² + x + 1)

x¹⁷ – 1 = (x + 1)(x⁸ + x⁵ + x⁴ + x³ + 1)(x⁸ + x⁷ + x⁶ + x⁴ + x² + x + 1)

Note that further factorisations can be obtained considering that, over GF(2),

x^(2ᵏ·n) – 1 = (xⁿ – 1)^(2ᵏ)


APPENDIX C

The cyclic property and the property of linearity allow the processes of shifting and addition,
respectively, which can be applied to generate cyclic codes as depicted in the following
example, presented by Sweeney (1991, p. 47).

Consider the field GF(2³). The factors of x⁷ – 1 are x⁷ – 1 = (x + 1)(x³ + x + 1)(x³ + x² + 1). Of
the two polynomials of degree 3, both primitive, x³ + x + 1 can be arbitrarily chosen to
construct a cyclic code of length 7. Using x³ + x + 1 as the generator polynomial, which
corresponds to the generator sequence 0001011, all the codewords of length 7 can be
constructed as

 1                       0000000
 2  generator sequence   0001011
 3  1st shift            0010110
 4  2nd shift            0101100
 5  3rd shift            1011000
 6  4th shift            0110001
 7  5th shift            1100010
 8  6th shift            1000101
 9  sequences 2 + 3      0011101
10  1st shift            0111010
11  2nd shift            1110100
12  3rd shift            1101001
13  4th shift            1010011
14  5th shift            0100111
15  6th shift            1001110
16  sequences 2 + 11     1111111

The first codeword was taken to be the all-zero sequence, since the all-zero vector is always a
codeword of a linear code. Then, starting from the generator sequence and shifting it
cyclically until all seven positions have been registered, the next seven codewords were
constructed. Next, two of those sequences are found which, when added, give a new
sequence. That sequence is cyclically shifted until all seven positions have again been
registered, to construct the next seven codewords. Finally, two sequences are found which,
when added, form the all-one codeword, which remains the same under any shift. Further
shifting and addition cannot produce any new codewords.
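The shift-and-add process can be automated. A sketch (my own illustration, not Sweeney's program) starts from the generator sequence 0001011 and closes the set of sequences under cyclic shifting and bitwise addition (XOR):

```python
# Sketch: generate the length-7 cyclic code by closure under
# cyclic shifts and GF(2) addition of sequences.
def shift(word):
    return word[1:] + word[:1]       # one cyclic shift

def add(a, b):
    # bitwise addition over GF(2), i.e. XOR of the two sequences
    return ''.join('1' if x != y else '0' for x, y in zip(a, b))

def generate(generator):
    code = {'0' * len(generator), generator}
    while True:
        new = {shift(w) for w in code}
        new |= {add(a, b) for a in code for b in code}
        if new <= code:              # closed: no new codewords appear
            return code
        code |= new

code = generate('0001011')
print(len(code))                                    # 16 codewords
print(min(w.count('1') for w in code if '1' in w))  # minimum weight 3
print('1111111' in code)                            # True: the all-one codeword
```

The closure stabilises at 16 codewords of minimum weight 3, in agreement with the table above.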


The code constructed in this way has 16 = 2⁴ codewords and consequently 4 information bits;
thus it is a (7,4) code. Its minimum distance is 3, as the minimum weight of a non-zero
codeword is 3. It can be seen that the code thus produced is a cyclic version of the [7,4]
Hamming code.


APPENDIX D

Error detecting / correcting codes mostly employ code symbols from the binary finite field GF(2) or
its extension GF(2ᵐ), since information in digital-data transmission and storage systems is universally
coded in binary form. In fact, error control codes are implemented on bi-stable electronic devices and,
therefore, data in binary form is suitable for the realisation of encoding and decoding schemes.

Binary addition and multiplication are basic to binary field arithmetic. According to Reed and Chen
(1999, p. 51) these elementary operations can be implemented using standard logic gates as depicted
in Figure D–1.

XOR Gate for ‘+’ AND Gate for ‘·’

Figure D–1: Logic gates for binary field arithmetic
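As a minimal sketch (my own illustration, not Reed and Chen's circuit), the two gate operations realise GF(2) arithmetic on single bits:

```python
# Sketch: GF(2) addition and multiplication via XOR and AND on bits.
def gf2_add(a, b):
    return a ^ b        # XOR: 1 + 1 = 0 in GF(2)

def gf2_mul(a, b):
    return a & b        # AND: product is 1 only when both operands are 1

# Full truth tables for the two operations
for a in (0, 1):
    for b in (0, 1):
        print(a, b, gf2_add(a, b), gf2_mul(a, b))
```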

As for systematic encoding of linear block codes, discussed in section 4.3.3, a binary linear systematic
encoder is illustrated in Figure D–2, as proposed by Reed and Chen (1999, p. 82).

[Diagram: an address generator selects, from a ROM, the k×(n – k) parity columns of the
generator matrix; the message word drives accumulators that compute the n – k parity bits,
which are appended to the message word to form the codeword.]

Figure D–2: Binary linear systematic encoder


APPENDIX E

The following table lists all 2⁴ = 16 message words and the corresponding codewords of the [7,4]
Hamming code, as found in Lin and Costello (1983, p. 67).

Message words    Codewords

0000             0000000
1000             1101000
0100             0110100
1100             1011100
0010             1110010
1010             0011010
0110             1000110
1110             0101110
0001             1010001
1001             0111001
0101             1100101
1101             0001101
0011             0100011
1011             1001011
0111             0010111
1111             1111111

Actually, this table can be constructed by following the process described in section 5.1.1 for all 2⁴
message words to obtain their corresponding codewords.
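As a sketch (my own illustration, not the program of section 5.1.1.1), the table can be reproduced by systematic encoding with codeword = (3 parity bits, 4 message bits); the parity sub-matrix P is read off the table's rows for the unit message words:

```python
# Sketch: systematic encoder for the [7,4] Hamming code of the table above.
from itertools import product

P = [(1, 1, 0),   # 1000 -> 110 1000
     (0, 1, 1),   # 0100 -> 011 0100
     (1, 1, 1),   # 0010 -> 111 0010
     (1, 0, 1)]   # 0001 -> 101 0001

def encode(message):
    # parity_j = sum over i of m_i * P[i][j], computed in GF(2)
    parity = tuple(sum(m * row[j] for m, row in zip(message, P)) % 2
                   for j in range(3))
    return parity + message          # parity bits first, then the message

for m in product((0, 1), repeat=4):
    print(''.join(map(str, m)), ''.join(map(str, encode(m))))
```

By linearity, encoding any message word is just the GF(2) sum of the rows of P selected by its 1-bits, so the sixteen printed pairs match the table entries.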

The program presented in section 5.1.1.1 takes as input an element (a message word) of the first
column and produces as output the corresponding codeword of the second column.


The program presented in section 5.1.1.2 takes as input a received word. This received word can
either be an element (a codeword) of the second column or a codeword with one complemented bit
(single error) or a codeword with two complemented bits (double error). The program outputs the
index of the erroneous coordinate of the received word and the actual transmitted codeword.


BIBLIOGRAPHY

Berlekamp, E. R. (1968) Algebraic Coding Theory, McGraw-Hill Series in Systems Science,
McGraw-Hill, New York

Berlekamp, E. R. (1984) Algebraic Coding Theory, Revised 1984 Edition, Aegean Park Press, ISBN
0-89412-063-8

Dholakia, A. (1994) Introduction to Convolutional Codes with Applications, Kluwer Academic
Publishers, ISBN 0-7923-9467-4

Jones, G. A., Jones, J. M. (2000) Information and Coding Theory, Springer Mathematics Series,
Springer, ISBN 1-85233-622-6

Lin, S., Costello, D. J., Jr. (1983) Error Control Coding: Fundamentals and Applications,
Prentice-Hall, ISBN 0-13-283796-X

van Lint, J. H. (1992) Introduction to Coding Theory, Graduate Texts in Mathematics, Second
Edition, Springer, ISBN 3-540-54894-7

van Lint, J. H. (1999) Introduction to Coding Theory, Graduate Texts in Mathematics, Third Edition,
Springer, ISBN 3-540-64133-5

Peterson, W. (1961) Error-Correcting Codes, The M.I.T. Press, Massachusetts Institute of
Technology, Cambridge, Massachusetts

Pless, V. (1989) Introduction to the Theory of Error-Correcting Codes, Second Edition, Wiley-
Interscience Series in Discrete Mathematics and Optimisation, John Wiley & Sons Inc, ISBN 0-471-
61884-5

Pless, V. (1998) Introduction to the Theory of Error-Correcting Codes, Third Edition, Wiley-
Interscience Series in Discrete Mathematics and Optimisation, John Wiley & Sons Inc, ISBN 0-471-
19047-0

Poli, A., Huguet, L. (1992) Error Correcting Codes: Theory and Applications, Prentice Hall -
Masson, ISBN 0-13-284894-5


Pretzel, O. (1996) Error-Correcting Codes and Finite Fields, Oxford Applied Mathematics &
Computing Science Series, Student Edition, Clarendon Press, Oxford ISBN 0-19-269067-1

Reed, I. S., Chen, X. (1999) Error-Control Coding for Data Networks, Kluwer Academic Publishers,
ISBN 0-7923-8528-4

Roman, S. (1997) Introduction to Coding and Information Theory, Springer, ISBN 0-387-94704-3

Sweeney, P. (1991) Error Control Coding, An Introduction, Prentice Hall, ISBN 0-13-284126-6

Wicker, S. B. (1995) Error Control Systems for Digital Communication and Storage, Prentice Hall,
ISBN 0-13-200809-2


