
Coding Theory and Applications

Linear Codes

Enes Pasalic
University of Primorska
Koper, 2013
Contents

1 Preface 5

2 Shannon theory and coding 7

3 Coding theory 31

4 Decoding of linear codes and MacWilliams identity 53

5 Coding theory - Constructing New Codes 77

6 Coding theory - Bounds on Codes 107

7 Reed-Muller codes 123

8 Fast decoding of RM codes and higher order RM codes 141

Chapter 1

Preface

This book has been written as lecture notes for students who need a grasp
of the basic principles of linear codes.
The scope and level of the lecture notes are considered suitable for undergraduate students of Mathematical Sciences at the Faculty of Mathematics, Natural Sciences and Information Technologies at the University of Primorska.
It is not possible to cover every aspect of linear codes here in detail, but I hope to provide the reader with an insight into the essence of linear codes.

Enes Pasalic
[email protected]

Chapter 2

Shannon theory and coding

Contents of the chapter:


• Mariners
• Course description

• Decoding problem
• Hamming distance
• Error correction

• Shannon


Coding theory - introduction

Coding theory is fun (to a certain extent :)

Can we live without error correcting codes?

– Probably not!

What would you miss: you would not be able to listen to CDs, retrieve correct data from your hard disk, or have quality communication over the telephone, etc.

Communication errors, storage errors, the authenticity of ISBN numbers, and much more are protected by means of error-correcting codes.


Students’ favorite application

• One of the most popular applications is in CD players.

• CD records become scratchy (the quality gets worse).

• Each tiny scratch would cause a noise when listening to the music (worse than vinyl).

• Problem: Invent a good code that can correct burst errors (consecutive errors).

• Solution: Use an encoder and decoder based on the Reed-Solomon codes!


Coding theory - repetition code

• Most storage media are prone to errors (CDs, DVDs, magnetic tapes).
• In certain applications, errors in the retrieved data are not acceptable.
• We need some redundancy of information, i.e. instead of saving 1 and 0 we can save 000 and 111.
• This is an example of a simple repetition code.
• How do we retrieve the information? Simply, if there is no error:
000 → 0 and 111 → 1.
• If there is only one error, then majority rules:

000, 001, 010, 100 → 0

111, 101, 110, 011 → 1
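
As a toy illustration, the encoder and majority-vote decoder fit in a few lines; a sketch assuming Python (the function names are mine, not from the notes):

    def encode(bit):
        # repeat the information bit three times: 0 -> 000, 1 -> 111
        return [bit] * 3

    def decode(word):
        # majority vote over the received 3-tuple
        return 1 if sum(word) >= 2 else 0

    assert decode([0, 1, 0]) == 0  # one error is corrected
    assert decode([1, 1, 0]) == 1  # two errors defeat the code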


Coding theory - repetition code II

• What about correcting 2 errors? There is nothing we can do with this code, e.g. 000 → 110 and we decode 0 as 1!
• Why not use a repetition code of length 5? Then we can correct up to 2 errors.
• Indeed, 00000 → 00011 is still decoded as 0!
• The problem is that this approach is not very efficient - 5 times more data.
• One of the main goals of coding theory is to increase efficiency.
• The main idea is to encode a block of bits, not a single bit!


Coding efficiency

• For instance, the Hamming code takes a block of k = 4 bits and encodes it into a block of n = 7 bits; it can still correct 1 error!
Comparison:
• Repetition code: 1 bit encoded as 3 bits
• Hamming code: 4 bits encoded as 7 bits

• We may talk about coding efficiency (code rate) - clearly the Hamming code is better, using less redundancy for the same error correction capability.
• We may wish to correct more than a few errors in a codeword - other codes such as the Reed-Muller codes exist.


Mariner story
• Back in 1969 the Mariners (later Voyagers etc.) were supposed to send pictures from Mars to Earth.
• The problem was thermal noise when sending pixels with a grey scale of 64 levels.

• Redundancy was introduced - 6 bits (64 grey levels) encoded as a 32-bit tuple.

Mariner story - encoding

• Such an encoding could correct up to 7 errors in transmission.

• Correcting errors is not for free - we have to send bits 32/6 times faster.

• This means that the total energy per bit is reduced - this causes an increased probability of (bit) error!

• Have we overpaid for the capability of correcting errors?

• The answer lies in computing the coding gain - if it is positive, then we save energy (reduce the probability of error).


Error probability in noisy channels

• Assume that a transmitter has a total energy per bit $E_b$ available. E.g. to send "1" a signal with amplitude $s = \sqrt{E_b}$ is sent, and $s = -\sqrt{E_b}$ for "0".
• In the presence of AWGN (Additive White Gaussian Noise) the received signal is
$$r = s + n,$$
where $n$ has zero mean and variance $\sigma^2$.
• Hard decision decoding: if $r > 0$, decide "1" was sent; "0" otherwise. Then the bit error probability is
$$p_e = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{\sqrt{E_b}}^{\infty} \exp\Big(-\frac{y^2}{2\sigma^2}\Big)\, dy = Q\Bigg(\sqrt{\frac{E_b}{\sigma^2}}\Bigg).$$


Error probability for Mariner

• Assumption: each block of 6 bits may be wrong with probability $P_E < 10^{-4}$.
• In the case of no coding we need $E_b/\sigma^2 = 17.22$, as
$$p_e = Q(\sqrt{17.22}) \approx 10^{-4}/6 \quad \text{and} \quad P_E = 1 - (1 - p_e)^6 \approx 10^{-4}.$$
• Compute $p_e$ for a given $P_E$ and get $\mathrm{SNR} = E_b/\sigma^2$.

• In Mariner, 6 bits are encoded as 32 bits, i.e. the energy per bit decreases:
$$p_e' = Q\Bigg(\sqrt{\frac{6 E_b}{32 \sigma^2}}\Bigg)$$
• For the given SNR = 17.22, $p_e' = 0.036$ – 2000 times larger than $p_e$.
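
These numbers are easy to reproduce; a sketch assuming Python's standard library, with Q(x) written via the complementary error function:

    import math

    def Q(x):
        # Gaussian tail probability: Q(x) = 0.5 * erfc(x / sqrt(2))
        return 0.5 * math.erfc(x / math.sqrt(2))

    snr = 17.22
    pe = Q(math.sqrt(snr))             # uncoded: ~1.7e-5, i.e. ~1e-4 / 6
    pe_c = Q(math.sqrt(6 * snr / 32))  # energy per bit reduced by 6/32: ~0.036
    print(pe, pe_c, pe_c / pe)         # the ratio is roughly 2000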


Coding gain for Mariner

• The benefit is in error correction. After decoding 32 bits to 6 bits,
$$P_E' = \sum_{i > 7} \binom{32}{i} (p_e')^i (1 - p_e')^{32-i} \approx 1.4 \cdot 10^{-5}.$$

• Even better results are obtained if soft decoding is used.

• The use of coding may be viewed as saving energy! The code used in Mariner was a [32, 6] Reed-Muller code.
• For the Mariner example, to get $P_E' = 10^{-4}$ an SNR of 14.83 is required (instead of 17.22).

Definition The ratio between the SNR (uncoded) and the SNR (coded) for equal error probability after decoding is called the coding gain.


ISBN

The International Standard Book Number (ISBN) is a 10-digit codeword such as 0-521-55374-1.
• The first digit indicates the language (0 or 1 for English).
• The next group of digits specifies the publisher (521 for Cambridge University Press).
• The next group of 5 digits forms the book number assigned by the publisher (the groups of digits are of variable length).
• The final digit $x_{10}$ is chosen so that the entire number $x_1 x_2 \ldots x_{10}$ satisfies the check equation
$$\sum_{i=1}^{10} i\, x_i \equiv 0 \pmod{11}.$$


ISBN - example

The redundant digit offers simple error correction.

Example The sixth digit in the ISBN 0−7923−519−X has faded out. We want to find the missing digit.

– When $x_{10} = 10$ the value is represented by the letter X.

The missing digit $x_6$ satisfies the equation, modulo 11,
$$0 = 1 \cdot 0 + 2 \cdot 7 + 3 \cdot 9 + 4 \cdot 2 + 5 \cdot 3 + 6 \cdot x_6 + 7 \cdot 5 + 8 \cdot 1 + 9 \cdot 9 + 10 \cdot 10,$$
which gives $6 x_6 \equiv 9 \pmod{11}$, i.e. $x_6 = 7$.
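
The recovery of a faded digit is a one-line modular computation; a sketch assuming Python 3.8+ (where pow(a, -1, 11) gives the inverse mod 11):

    def missing_digit(digits, pos):
        # digits: the 10 ISBN digits, with None at the faded position pos (1-based);
        # solve sum(i * x_i) = 0 (mod 11) for the unknown digit
        s = sum(i * d for i, d in enumerate(digits, start=1) if d is not None)
        return (-s * pow(pos, -1, 11)) % 11  # 11 is prime, so pos is invertible

    isbn = [0, 7, 9, 2, 3, None, 5, 1, 9, 10]  # X stands for 10
    print(missing_digit(isbn, 6))              # -> 7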


Course topics

• The following topics will be covered in the course:

1. Linear codes with emphasis on Hadamard codes

2. Golay and Reed-Muller codes

3. Cyclic codes and BCH codes

4. Reed-Solomon codes and perfect codes

5. Constructing new codes from known ones

6. Asymptotically good codes and algebraic geometry codes

7. Bounds on codes and convolutional codes . . .


Block code

Definition A block code of length n containing M codewords over the alphabet A is a set of M n-tuples where each n-tuple takes its components from A. It is denoted an [n, M] code over A.
Example Let A = {0, 1} and consider a [5, 4] code defined by its codewords:

c0 = (00000) c1 = (10110)
c2 = (01011) c3 = (11101)

• What are the properties of such a code? Linearity, rate, error-correcting capability etc.
• Linearity is (almost) obvious: c1 + c2 = c3 using bitwise modulo two addition!


Redundancy of the code


• How many information bits can be carried over?
• A total of 4 codewords means that 2 bits are transmitted.
• Redundancy measures the amount of extra bits:

r = n − k

In our case n − k = 5 − 2 = 3. Three extra bits for the purpose of correcting/detecting errors!
• We need to specify the mapping from information to codewords.
• E.g. we may have,

(00) ↦ (00000) (01) ↦ (01011)
(10) ↦ (10110) (11) ↦ (11101)


Rate of the code

Definition The rate of an [n, M] code which encodes information k-tuples is
$$R = \frac{k}{n} = \frac{\log_{|A|} M}{n}.$$

• In our example the rate is R = 2/5; good or bad?

– Hard to answer - several issues have to be considered:

• It depends on the application: how many errors we need to correct and what the error probability of the channel is.
• What we do know: there exist codes of long length (n → ∞) such that the probability of error after decoding → 0!


Coding Alphabet

• In the previous example we assumed the alphabet A = {0, 1}.

• The easiest case is binary. We consider in general:

• A is a q-ary alphabet
• q = 2, q = p > 2, q = p^m, or sometimes
• A = {a, b, c, d}

In general, increasing the coding alphabet may improve the performance of the code, but decoding complexity becomes a problem.


Transmission scheme

[Block diagram: DATA → Encryption → Coding (00 → 00000, 01 → 01011, 10 → 10110, 11 → 11101) → noisy transmission channel → Decoding (e.g. received 00001 corrected to 00000) → Decryption]


Decoding problem

– Given an [n, M] code C and a received vector r, there are several choices:
• no errors have occurred - accept r as a sent codeword
• errors have occurred - correct r to a codeword c
• errors have occurred - no correction possible

Three main strategies (depending on the application):

1. error correction
2. error detection (retransmission request)
3. hybrid approach - both correction and detection


Hamming distance - definition

Definition The Hamming distance d(x, y) between two codewords x and y is the number of coordinate positions in which they differ.
• E.g. the Hamming distance between x = 01011 and y = 10110 is 4.
The Hamming distance of an [n, M] code is the minimum distance between any two distinct codewords,
$$d = \min_{x, y \in C,\ x \neq y} d(x, y).$$

• Computing the minimum distance of the code requires calculating $\binom{M}{2} \approx \frac{M^2}{2}$ Hamming distances.
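
For a small code this brute-force computation is trivial; a sketch assuming Python:

    from itertools import combinations

    def hamming(x, y):
        # number of coordinate positions in which x and y differ
        return sum(a != b for a, b in zip(x, y))

    C = ["00000", "10110", "01011", "11101"]
    print(min(hamming(x, y) for x, y in combinations(C, 2)))  # -> 3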


Hamming distance - properties

Three simple properties:

1. d(x, y) ≥ 0
2. d(x, y) = d(y, x)
3. d(x, y) + d(y, z) ≥ d(x, z) - triangle inequality (exercise)

• Nearest neighbor decoding (minimum distance decoding) uses the Hamming distance in decoding.

IDEA: Given a received n-tuple r, find the closest codeword c to r (if it exists) and correct r to c.

• What if several codewords are equally close?

• Either request retransmission, or pick a codeword at random.


Maximum likelihood decoding


• Nearest neighbor decoding is justified through maximum likelihood decoding.
• IDEA: Maximize the probability
$$\max_{c \in C} Pb(r, c),$$
where Pb(r, c) is the probability that r is received, given that c is sent.
• Assumptions:
• a code with an alphabet of q symbols
• error probability p for each symbol
• If d(r, c) = d then
$$Pb(r, c) = (1 - p)^{n-d} \Big(\frac{p}{q-1}\Big)^{d}.$$


Maximum likelihood decoding II

• Suppose c1 and c2 are two codewords, and r is received. Furthermore assume d(r, c1) = d1 ≤ d(r, c2) = d2.
• When does Pb(r, c1) ≥ Pb(r, c2) hold?
• If this holds then
$$(1 - p)^{n-d_1} \Big(\frac{p}{q-1}\Big)^{d_1} > (1 - p)^{n-d_2} \Big(\frac{p}{q-1}\Big)^{d_2}$$
so that
$$(1 - p)^{d_2-d_1} > \Big(\frac{p}{q-1}\Big)^{d_2-d_1} \;\Leftrightarrow\; \Big(\frac{p}{(1-p)(q-1)}\Big)^{d_2-d_1} < 1.$$

• Thus, for $p < \frac{q-1}{q}$, $d_2 \ge d_1$ implies $Pb(r, c_1) \ge Pb(r, c_2)$, so maximum likelihood decoding is sound.


Decoding using the maximum likelihood - example

• Again let C = {00000, 10110, 01011, 11101} and p = 0.1. If r = (11111) is received, then

Pb(r, 00000) = (0.1)^5 = 0.00001
Pb(r, 10110) = (0.1)^2 (0.9)^3 = 0.00729
Pb(r, 01011) = (0.1)^2 (0.9)^3 = 0.00729
Pb(r, 11101) = (0.1)^1 (0.9)^4 = 0.06561

• Pb(r, 11101) is largest, thus r is decoded as 11101.

One error could be corrected, but we may be satisfied with detection of errors only. How many errors can we detect?
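
A sketch reproducing this table, assuming Python (the helper name is mine):

    def likelihood(r, c, p=0.1, q=2):
        # Pb(r, c) over a q-ary symmetric channel with symbol error probability p
        n = len(r)
        d = sum(a != b for a, b in zip(r, c))
        return (1 - p) ** (n - d) * (p / (q - 1)) ** d

    C = ["00000", "10110", "01011", "11101"]
    r = "11111"
    for c in C:
        print(c, likelihood(r, c))
    print(max(C, key=lambda c: likelihood(r, c)))  # -> 11101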


Error correcting capability

Theorem If C is an [n, M] code with d ≥ 2e + 1, then C can correct up to e errors. If used for error detection only, C can detect 2e errors.

Proof (Sketch) Let $c_i$, $1 \le i \le M$, be the codewords of C and define
$$S_{c_i} = \{x \in A^n : d(x, c_i) \le e\},$$
where A is the alphabet of C and $S_{c_i}$ is the sphere of radius e around $c_i$. Then $S_{c_i} \cap S_{c_j} = \emptyset$.

This is one of the most important concepts in coding theory - visualization on the next slide.


Codeword spheres

[Figure: spheres of radius 1 drawn around the codewords 00000, 10110, 01011 and 11101; the vectors at distance 1 from a codeword (e.g. 00001, 00010, 00100, 01000, 10000 around 00000) lie in its sphere, and the spheres are pairwise disjoint; d = 2e + 1 = 3, e = 1.]


Proof of the error correcting theorem

Proof (cont.) Suppose $x \in S_{c_i} \cap S_{c_j}$; then $d(x, c_i) \le e$ and $d(x, c_j) \le e$.
Using the triangle inequality,
$$d(x, c_i) + d(x, c_j) \ge d(c_i, c_j) \;\Rightarrow\; d(c_i, c_j) \le 2e.$$
Contradiction, as $d(c_i, c_j) \ge 2e + 1$; so $S_{c_i} \cap S_{c_j} = \emptyset$.

If $t \le e$ errors are introduced and $c_i$ is transmitted, then $r \in S_{c_i}$.

• For error detection: at least 2e + 1 errors are needed to turn a codeword into another one. Therefore, up to 2e errors can always be detected.

– The case where the min. distance is even, d = 2e, is very similar (exercise 7).


Example

Example Assume d = 4 is even, and consider the codewords (of some code)
c1 = (110011) c2 = (001111)
If the received word is (101000) then the decoder cannot decide whether c1 or c2 was sent.

The received word is not in the spheres of radius 1!

Detection is clearly possible - simply, r is not a codeword.


Combining detection and correction

Theorem Let C be an [n, M] code with min. distance d. Then C can correct up to ⌊(d − 1)/2⌋ errors. If used for error detection only, C can detect d − 1 errors.

• The even case and the odd case are different when both correction and detection are performed!

Theorem Let C be an [n, M] code with min. distance d = 2e + 1. Then C can correct up to e errors but cannot simultaneously detect additional errors.

Proof The decoder can correct (and detect) up to e errors, but if e + 1 errors occur then $S_{c_i} \to S_{c_j}$ and there is no detection.


Decoding example
Consider the code C (example 6) with codewords

c1 = (00000), c2 = (10110), c3 = (01011), c4 = (11101).

We construct the spheres of radius 1 (since d = 3):

Sc1 = {(00000), (10000), (01000), (00100), (00010), (00001)}
Sc2 = {(10110), (00110), (11110), (10010), (10100), (10111)}
Sc3 = {(01011), (11011), (00011), (01111), (01001), (01010)}
Sc4 = {(11101), (01101), (10101), (11001), (11111), (11100)}

The set of vectors that are not in any sphere is

S∗ = {(11000), (01100), (10001), (00101), (01110), (00111), (10011), (11010)}.


Decoding example II

c1 = (00000), c2 = (10110), c3 = (01011), c4 = (11101).

• Let r = (00011). Then we compute

d(c1, r) = 2, d(c2, r) = 3, d(c3, r) = 1, d(c4, r) = 4,

and decode as c3.

• Let r = (11000) ∈ S∗. Then we compute

d(c1, r) = 2, d(c2, r) = 3, d(c3, r) = 3, d(c4, r) = 2.

We cannot decode, but the receiver knows there are at least 2 errors.
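
The set S∗ and the two computations above can be checked by brute force; a sketch assuming Python:

    from itertools import product

    C = ["00000", "10110", "01011", "11101"]
    dist = lambda x, y: sum(a != b for a, b in zip(x, y))

    # vectors lying in no sphere of radius 1, i.e. the set S*
    S_star = ["".join(v) for v in product("01", repeat=5)
              if min(dist("".join(v), c) for c in C) > 1]
    print(S_star)                                    # the 8 vectors listed above
    print(sorted((dist(c, "00011"), c) for c in C))  # closest codeword: 01011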


Decoding - combining correction and detection

c1 = (00000), c2 = (10110), c3 = (01011), c4 = (11101).

• Last case: suppose c1 is sent and 2 errors are present, so that r = (10100).

– The receiver decides in favour of c2 (closest) - and makes an error.

– But it cannot detect 2 errors if used at the same time for error correcting (only one error is assumed; the distance to c2 is 1).

– Without correcting, it can detect 2 errors.


Decoding complexity

• It is important to design the error correcting capability in relation to a given application.

• If M is large, say M = 2^50, it is infeasible to find the closest codeword! At 10^6 distance computations/sec, a single error correction takes some 20 years.

• Also, computing the min. distance (≈ M²/2 distance computations) is infeasible.

• Another issue is the efficiency (rate) of the code - e.g. given n and d (desired), how do we maximize k?

• Also, given n and k, how do we maximize d?


Shannon’s theorem - introduction

Assume we toss a coin and want to transmit the information by telephone wire. Further assumptions:

• We have two different symbols 0 and 1 as our alphabet symbols.

• The coin is tossed t times per minute and the channel can handle 2t tosses per minute.
• The channel is noisy, with probability of error p = Pb(1 → 0) = Pb(0 → 1).

There is no restriction on the channel, but we need an arbitrarily small probability of error after decoding.

Idea: use a repetition code of large length N.


Shannon’s theorem - preparation

• Then, if p = 0.001, the decoder makes an error with probability
$$P_e = \sum_{0 \le k < N/2} \binom{N}{k} (1 - p)^k p^{N-k} < (0.07)^N,$$
thus $P_e \to 0$ for $N \to \infty$.

Problem - we can only send 2 symbols for each toss! SOLUTION?

• YES, one of the greatest results in coding/information theory, due to C. Shannon, 1948.


Shannon’s theorem - notation

Suppose we use $C = \{x_1, x_2, \ldots, x_M\}$, $|x_i| = n$, and maximum likelihood decoding.

Let $P_i$ be the probability of making an incorrect decision given that $x_i$ is transmitted.
$$P_C := \frac{1}{M} \sum_{i=1}^{M} P_i \quad \text{(probability of incorrect decoding of a word)}$$

• Consider all possible codes with the given parameters and define:
$$P^*(M, n, p) := \min_C P_C$$


Shannon’s theorem

Theorem If the rate $R = \frac{\log_2 M}{n}$ is in the range $0 < R < 1 - H(p)$ and $M_n := 2^{\lfloor Rn \rfloor}$, then
$$P^*(M_n, n, p) \to 0 \quad \text{as } n \to \infty.$$

Comments: Crucial dependence on p through the binary entropy function
$$H(p) = -p \log_2 p - (1 - p) \log_2(1 - p).$$
– Properties of H:
$$H(0) = H(1) = 0 \quad \text{and} \quad \max_p H(p) = 1 \text{ for } p = 1/2.$$

– The number of errors in the received word is a random variable with mean value np and variance np(1 − p).
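
For intuition, a minimal sketch of H(p) and the resulting rate bound, assuming Python:

    import math

    def H(p):
        # binary entropy function (in bits); H(0) = H(1) = 0 by convention
        if p in (0, 1):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    p = 0.1
    print(H(p), 1 - H(p))  # any rate R < 1 - H(p) is achievable on this channel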


Shannon’s theorem - interpretation

– First note that the capacity of a BSC is
$$C_{BSC} = 1 - H(p).$$

Two interesting cases (though the rate is fixed):

• $p \to 0 \Rightarrow H(p) \to 0 \Rightarrow C_{BSC} \to 1$. To achieve $R \approx 1$, almost no redundancy (parity bits) is needed, as $M = 2^{\lfloor Rn \rfloor} \approx 2^n$.

• $p \to 1/2 \Rightarrow H(p) \to 1 \Rightarrow C_{BSC} \to 0$. To achieve $R > 0$, much redundancy (parity bits) is needed, as M is small (few information bits).

– Observe that the proof is nonconstructive - there is no procedure for how to design such a code.


Proof of Shannon’s theorem

OPTIONAL FOR
INTERESTED STUDENTS


Shannon’s theorem - some estimates

Let w := the number of errors in the received word, and
$$b := \big(np(1 - p)/(\varepsilon/2)\big)^{1/2}.$$
Then,
$$P(w > np + b) \le \tfrac{1}{2}\varepsilon \quad \text{(Chebyshev's inequality)}$$
– Since $p < 1/2$, $\rho := \lfloor np + b \rfloor < n/2$ for large n.

– If $B_\rho(x) = \{y : d(x, y) \le \rho\}$ is the sphere of radius $\rho$, then
$$|B_\rho(x)| = \sum_{i \le \rho} \binom{n}{i} < \frac{n}{2} \binom{n}{\rho} \le \frac{n}{2} \cdot \frac{n^n}{\rho^\rho (n - \rho)^{n-\rho}}.$$

We need some more estimates :)


Shannon’s theorem - some estimates II

$$\frac{\rho}{n} \log \frac{\rho}{n} = p \log p + O(n^{-1/2}),$$
$$\Big(1 - \frac{\rho}{n}\Big) \log \Big(1 - \frac{\rho}{n}\Big) = q \log q + O(n^{-1/2}) \quad (n \to \infty),$$
where q = 1 − p.
• Finally, we need two functions. If $u, v, y \in \{0,1\}^n$ and $x_i \in C$, then
$$f(u, v) = \begin{cases} 0, & \text{if } d(u, v) > \rho \\ 1, & \text{if } d(u, v) \le \rho \end{cases}$$
$$g_i(y) = 1 - f(y, x_i) + \sum_{j \ne i} f(y, x_j).$$

FACT: If $x_i$ is the unique codeword s.t. $d(x_i, y) \le \rho$, then $g_i(y) = 0$, and $g_i(y) \ge 1$ otherwise.


Shannon’s theorem - proof

Proof: We pick the codewords $x_1, x_2, \ldots, x_M$ at random.

Decoding: If $x_i$ is the only codeword s.t. $d(x_i, y) \le \rho$, then decode y as $x_i$; otherwise decode as, say, $x_1$ (max. likelihood decoding).

• Express $P_i$ using $g_i$:
$$P_i = \sum_{y \in \{0,1\}^n} P(y|x_i)\, g_i(y) \quad (x_i \text{ fixed})$$
$$= \underbrace{\sum_{y \in \{0,1\}^n} P(y|x_i) \{1 - f(y, x_i)\}}_{Pb(y \notin B_\rho(x_i))} + \sum_{y} \sum_{j \ne i} P(y|x_i) f(y, x_j).$$

Using $P(w > np + b) = P(w > \rho) \le \tfrac{1}{2}\varepsilon$ we get (next page)


Shannon’s theorem - proof II

$$P_C \le \tfrac{1}{2}\varepsilon + M^{-1} \sum_{y} \sum_{i=1}^{M} \sum_{j \ne i} P(y|x_i) f(y, x_j)$$

• Now we use the fact that $P^*(M, n, p) \le E(P_C)$, where $E(P_C)$ is the expected value over all possible codes C. Hence,
$$P^*(M, n, p) \le \tfrac{1}{2}\varepsilon + M^{-1} \sum_{y} \sum_{i=1}^{M} \sum_{j \ne i} E(P(y|x_i))\, E(f(y, x_j))$$
$$= \tfrac{1}{2}\varepsilon + M^{-1} \sum_{y} \sum_{i=1}^{M} \sum_{j \ne i} E(P(y|x_i)) \cdot \frac{|B_\rho|}{2^n}$$
$$= \tfrac{1}{2}\varepsilon + (M - 1)\, 2^{-n} |B_\rho|.$$


Shannon’s theorem - proof III

Finally, we take logs, apply our estimates and divide by n to get
$$n^{-1} \log\big(P^*(M, n, p) - \tfrac{1}{2}\varepsilon\big) \le \underbrace{n^{-1} \log M - (1 + p \log p + q \log q)}_{R - (1 - H(p)) < 0} + O(n^{-1/2}).$$

This leads to
$$n^{-1} \log\big(P^*(M_n, n, p) - \tfrac{1}{2}\varepsilon\big) < -\beta < 0$$
for $n \ge n_0$, i.e. $P^*(M_n, n, p) < \tfrac{1}{2}\varepsilon + 2^{-\beta n}$.

Chapter 3

Coding theory

Contents of the chapter:


• Decoding
• Shannon

• Vector spaces
• Linear codes
• Generator matrix

• Parity check


Example

Example Assume d = 4 is even, and consider the codewords

c1 = (110000) c2 = (001100)

If the received word is (101000) then the decoder cannot decide whether c1 or c2 was sent.

The received word is not in the spheres of radius 1!

Detection is clearly possible - simply, r is not a codeword.


Combining detection and correction

Theorem Let C be an [n, M] code with min. distance d. Then C can correct up to ⌊(d − 1)/2⌋ errors. If used for error detection only, C can detect d − 1 errors.
• The even case and the odd case are different when both correction and detection are performed!

Theorem Let C be an [n, M] code with min. distance d = 2e + 1. Then C can correct up to e errors but cannot simultaneously detect additional errors.

Proof The decoder can correct (and detect) up to e errors, but if e + 1 errors occur then $S_{c_i} \to S_{c_j}$ and there is no detection.

Decoding example
Consider the code C (example 6) with codewords

c1 = (00000), c2 = (10110), c3 = (01011), c4 = (11101).

We construct the spheres of radius 1 (since d = 3):

Sc1 = {(00000), (10000), (01000), (00100), (00010), (00001)}
Sc2 = {(10110), (00110), (11110), (10010), (10100), (10111)}
Sc3 = {(01011), (11011), (00011), (01111), (01001), (01010)}
Sc4 = {(11101), (01101), (10101), (11001), (11111), (11100)}

The set of vectors that are not in any sphere is

S∗ = {(11000), (01100), (10001), (00101), (01110), (00111), (10011), (11010)}.


Decoding example II
• Let r = (00011). Then we compute

d(c1, r) = 2, d(c2, r) = 3, d(c3, r) = 1, d(c4, r) = 4,

and decode as c3.

• Let r = (11000) ∈ S∗. Then we compute

d(c1, r) = 2, d(c2, r) = 3, d(c3, r) = 3, d(c4, r) = 2.

We cannot decode, but the receiver knows there are at least 2 errors.

• Suppose c1 is sent and 2 errors are present, so that r = (10100). The receiver decides in favour of c2 (closest).

– But it cannot detect 2 errors if used for error correcting. Without correcting, it can detect 2 errors.

Decoding complexity

• It is important to design the error correcting capability in relation to a given application.

• If M is large, say M = 2^50, it is infeasible to find the closest codeword!

• Also, computing the min. distance (≈ M²/2 distance computations) is infeasible.

• Another issue is the efficiency (rate) of the code - e.g. given n and d (desired), how do we maximize k?

• Also, given n and k, how do we maximize d?


Shannon’s theorem - introduction

Assume we toss a coin and want to transmit the information by telephone wire. Further assumptions:

• We have two different symbols 0 and 1 as our alphabet symbols.

• The coin is tossed t times per minute and the channel can handle 2t tosses per minute.
• The channel is noisy, with probability of error p = Pb(1 → 0) = Pb(0 → 1).

No restriction on the channel ⇒ arbitrarily small probability of error after decoding.

Idea: use a repetition code of large length N.



Shannon’s theorem - preparation

• Then, if p = 0.001, the decoder makes an error with probability
$$P_e = \sum_{0 \le k < N/2} \binom{N}{k} (1 - p)^k p^{N-k} < (0.07)^N,$$
thus $P_e \to 0$ for $N \to \infty$.

Problem - we can only send 2 symbols for each toss! SOLUTION?

• YES, one of the greatest results in coding/information theory, due to C. Shannon, 1948.


Shannon’s theorem - notation

Suppose we use $C = \{x_1, x_2, \ldots, x_M\}$, $|x_i| = n$, and maximum likelihood decoding.

Let $P_i$ be the probability of making an incorrect decision given that $x_i$ is transmitted.
$$P_C := \frac{1}{M} \sum_{i=1}^{M} P_i \quad \text{(probability of incorrect decoding of a word)}$$

• Consider all possible codes with the given parameters and define:
$$P^*(M, n, p) := \min_C P_C$$

Shannon’s theorem

Theorem If the rate $R = \frac{\log_2 M}{n}$ is in the range $0 < R < 1 - H(p)$ and $M_n := 2^{\lfloor Rn \rfloor}$, then
$$P^*(M_n, n, p) \to 0 \quad \text{as } n \to \infty.$$

Comments: Crucial dependence on p through the binary entropy function
$$H(p) = -p \log_2 p - (1 - p) \log_2(1 - p).$$
– Properties of H:
$$H(0) = H(1) = 0 \quad \text{and} \quad \max_p H(p) = 1 \text{ for } p = 1/2.$$

– The number of errors in the received word is a random variable with mean value np and variance np(1 − p).


Shannon’s theorem - interpretation

– First note that the capacity of a BSC is
$$C_{BSC} = 1 - H(p).$$

Two interesting cases (though the rate is fixed):

• $p \to 0 \Rightarrow H(p) \to 0 \Rightarrow C_{BSC} \to 1$. To achieve $R \approx 1$, almost no redundancy (parity bits) is needed, as $M = 2^{\lfloor Rn \rfloor} \approx 2^n$.

• $p \to 1/2 \Rightarrow H(p) \to 1 \Rightarrow C_{BSC} \to 0$. To achieve $R > 0$, much redundancy (parity bits) is needed, as M is small (few information bits).

– Observe that the proof is nonconstructive - there is no procedure for how to design such a code.

Motivation for linear codes

• A class of codes with nice algebraic structure.

• Not always the best ones, but they allow for efficient coding and decoding.

• Additional structural constraints give the families of cyclic and BCH codes.

• Hamming codes are typical representatives, but there are many other good codes: Reed-Muller, Hadamard codes etc.


Code as a vector space

We need to formally define the main parameters:

• Alphabet A - a finite field with q elements, e.g. A = GF(2), so |A| = 2, or A = GF(p^r), so |A| = p^r.
• Message space - the set of all k-tuples over F, denoted V_k(F). In total q^k messages.
• The message k-tuples are embedded into n-tuples, n ≥ k. The redundancy is used in error correction/detection.
• One-to-one correspondence

q^k messages ↔ q^k n-tuples in V_n(F)

Question: Can we choose the q^k n-tuples so that they form a k-dim. subspace of V_n(F)?

Vector spaces - basics
• What is a k-dim. vector subspace S ⊂ V_n(F)?
• Simply, a subspace is determined by k linearly independent vectors in V_n(F).

Example Recall our code C = {00000, 10110, 01011, 11101}. Then any two vectors in C \ {0} are linearly independent. E.g., taking as basis c1 = 10110, c2 = 01011 we get C as

C = {a1 c1 + a2 c2 : (a1, a2) ∈ F²}, F = GF(2).

Three different bases (six counting the order of the basis vectors), same code!

• In general, the number of ways of selecting k linearly independent vectors is
$$(q^n - 1)(q^n - q)(q^n - q^2) \cdots (q^n - q^{k-1}) = \prod_{i=0}^{k-1} (q^n - q^i).$$


Counting subspaces
• Each k-dimensional subspace contains
$$(q^k - 1)(q^k - q)(q^k - q^2) \cdots (q^k - q^{k-1}) = \prod_{i=0}^{k-1} (q^k - q^i)$$
ordered sets of k linearly independent vectors.

• The total number of k-dimensional subspaces of V_n(F) is therefore
$$\frac{\prod_{i=0}^{k-1} (q^n - q^i)}{\prod_{i=0}^{k-1} (q^k - q^i)}.$$

Example In our case q = 2, n = 5, k = 2:
$$\prod_{i=0}^{k-1} (q^n - q^i) = \prod_{i=0}^{1} (2^5 - 2^i) = 31 \cdot 30 = 960.$$

Counting subspaces II

$$\prod_{i=0}^{k-1} (q^k - q^i) = \prod_{i=0}^{1} (2^2 - 2^i) = 3 \cdot 2 = 6; \qquad \frac{\prod_{i=0}^{k-1} (q^n - q^i)}{\prod_{i=0}^{k-1} (q^k - q^i)} = \frac{960}{6} = 160.$$

Where does this 6 come from?

(10000), (01000) (01000), (10000)
(11000), (01000) (01000), (11000)
(11000), (10000) (10000), (11000)

All give the same subspace!


Basis of a code

• We can select any of the 160 subspaces to construct a linear [5, 2] code C.

• But we need a correspondence between the subspace and the message space.

• Let us select a basis B = {v1, v2, . . . , vk} of S (a k-dim. subspace of V_n(F)) and define
$$f : M \to S; \quad f(m) = \sum_{i=1}^{k} m_i v_i,$$
where m = (m1, m2, . . . , mk) is a message k-tuple, m ∈ M.


Constructing linear code - example

Example Let M = {(00), (10), (01), (11)}.

• Define the subspace S of V4(Z2) through the basis B = {v1, v2},

v1 = (1100) v2 = (0110).

• Then f maps M to S as follows,

(00) → (0000)
(10) → (1100)
(01) → (0110)
(11) → (1010)

Thus S = C = {(0000), (1100), (0110), (1010)}.
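
A sketch of the map f for this example, assuming Python:

    from itertools import product

    def f(m, basis):
        # f(m) = sum_i m_i * v_i, computed coordinate-wise over GF(2)
        return tuple(sum(mi * vi for mi, vi in zip(m, col)) % 2
                     for col in zip(*basis))

    B = [(1, 1, 0, 0), (0, 1, 1, 0)]
    for m in product([0, 1], repeat=2):
        print(m, "->", f(m, B))
    # (0,0)->(0,0,0,0)  (1,0)->(1,1,0,0)  (0,1)->(0,1,1,0)  (1,1)->(1,0,1,0)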


Selecting a “good” subspace

• There are many choices of the subspace (linear code) for fixed n, k. E.g.

B = {(10000), (01000)} ⇒ dC = 1,
B = {(10110), (01011)} ⇒ dC = 3,
B = {(10111), (11110)} ⇒ dC = 2.

• Choose the subspace with the largest Hamming distance.

• For fixed k we can increase n - more check digits (greater potential for error correcting), but a smaller rate; a typical trade-off.

Definition A linear (n, k)-code is a k-dimensional subspace of V_n(F).

Minimum distance of linear codes

• For a nonlinear [n, M] code, computing d requires computing $\binom{M}{2}$ Hamming distances. A linear code is easier to handle!

Definition The Hamming weight of v ∈ V_n(F) is the number of nonzero coordinates in v, i.e.
$$w(v) = \#\{i : v_i \ne 0,\ 1 \le i \le n\}.$$

Definition The Hamming weight of an (n, k) code C is
$$w(C) = \min\{w(x) : x \in C, x \ne 0\}.$$

• If C = {(0000), (1100), (0011), (1111)} then the Hamming distance of the code equals the Hamming weight of the code!


Hamming weight of a linear code

Theorem: Let d be the distance of an (n, k) code C. Then
$$d = w(C).$$

Proof: By definition, d = min{d(x, y) : x, y ∈ C, x ≠ y}. Also,
$$d(x, y) = w(x - y).$$
But x − y ∈ C (C is a subspace), so
$$d = \min\{w(z) : z \in C, z \ne 0\}.$$

• Computing the distance is equivalent to finding the nonzero codeword with the maximum number of zeroes!

Representing linear codes

• It is of interest (for decoding) to select a particular basis.

Example: Let v1, v2, v3 be a basis of a (5, 3) code. Define

    [ v1 ]   [ 1 0 0 0 0 ]
G = [ v2 ] = [ 1 1 0 1 0 ]
    [ v3 ]   [ 1 1 1 0 1 ]

• If m = (m1 m2 m3) ∈ M then

c = mG = m1 v1 + m2 v2 + m3 v3

– E.g. for m = (101), mG = (01101).

• Selecting the basis u1 = (10000), u2 = (01010), u3 = (00111) (same code) gives mG′ = (10111).
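
The encoding c = mG is a single matrix product over GF(2); a sketch assuming Python with numpy:

    import numpy as np

    G = np.array([[1, 0, 0, 0, 0],
                  [1, 1, 0, 1, 0],
                  [1, 1, 1, 0, 1]])
    m = np.array([1, 0, 1])
    print(m @ G % 2)    # c = mG over GF(2): [0 1 1 0 1]

    # the basis u1, u2, u3 spans the same code but encodes m differently
    G2 = np.array([[1, 0, 0, 0, 0],
                   [0, 1, 0, 1, 0],
                   [0, 0, 1, 1, 1]])
    print(m @ G2 % 2)   # [1 0 1 1 1]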


Generator matrix of code

Definition: A generator matrix G of an (n, k)-code C is a k × n matrix whose rows are a vector space basis for C.
• Codewords of C = linear combinations of the rows of G.
• The generator matrix G is not unique - elementary row operations give the same code.
• We would like to find a generator matrix in standard form,

G = [Ik A]

with Ik the k × k identity and A a k × (n − k) matrix.
– Can we always find G in standard form for a given C? NO, but we can find an equivalent code!

Equivalent codes

• Main idea: permuting the coordinates of the codewords does not affect the Hamming weight!

C = {(0000), (1100), (0011), (1111)}
C′ = {(0000), (0110), (1001), (1111)}

• We get an equivalent code (not necessarily identical)!

Definition Two (n, k)-codes C and C′ are said to be equivalent if there exists a permutation matrix P such that G′ = GP.
• P permutes the columns of G (the coordinates of the codewords).

Theorem If C is an (n, k)-code over F, then there exists a G for C or for an equivalent code C′ such that G = [Ik A].


Transforming the code - example


• We want to transform C into C′ (equivalent, not identical, codes):

     [ 0 0 1 1 ]        [ 1 0 0 1 ]
G̃ =  [ 0 1 1 0 ]   G′ = [ 0 1 0 1 ]
     [ 1 0 1 1 ]        [ 0 0 1 0 ]

• Step 1: G̃ → G (add row 1 to rows 2 and 3)

    [ 0 0 1 1 ]
G = [ 0 1 0 1 ]
    [ 1 0 0 0 ]

• Step 2: G′ = GP (interchange columns 1 and 3)

    [ 0 0 1 0 ]
P = [ 0 1 0 0 ]
    [ 1 0 0 0 ]
    [ 0 0 0 1 ]

Orthogonal spaces
– Define the inner product of x, y ∈ V_n(F):
$$x \cdot y = \sum_{i=1}^{n} x_i y_i$$

• Remark that x · x = 0 ⇒ x = 0 holds over the reals, but not when F is a finite field. E.g.

x = (101) ⇒ x · x = 1 + 0 + 1 = 0

Vectors are orthogonal if x · y = 0.
Definition Let C be an (n, k) code over F. The orthogonal complement of C (the dual code of C) is
$$C^\perp = \{x \in V_n(F) : x \cdot y = 0 \text{ for all } y \in C\}.$$


Dual code

Theorem 3.3. If C is an (n, k) code over F, then C⊥ is an (n, n − k) code over F.

Proof (see the textbook). First show that C⊥ is a subspace of V_n(F), then show that dim(C⊥) = n − k.

• What is a generator matrix of C⊥?

Corollary 3.4 If G = [Ik A] is a generator matrix of C, then H = [−Aᵀ I_{n−k}] is a generator matrix of C⊥!

Proof We have GHᵀ = Ik(−A) + A I_{n−k} = 0, i.e. the rows of H are orthogonal to the rows of G. By definition, span(H) = C⊥.

Dual code - example

Example Let C be a (6, 3) code defined by

     [ 1 0 1 1 0 1 ]        [ 1 0 0 1 1 1 ]
G̃ =  [ 1 1 0 1 0 0 ]  → G = [ 0 1 0 0 1 1 ] = [I3 A].
     [ 0 1 0 0 1 1 ]        [ 0 0 1 0 1 0 ]

Then

               [ 1 0 0 1 0 0 ]
H = [−Aᵀ I3] = [ 1 1 1 0 1 0 ]
               [ 1 1 0 0 0 1 ]

Check that GHᵀ = 0 and the linear independence of the rows of H!


Parity check matrix

• The condition GHᵀ = 0 essentially means

c ∈ C ⇔ Hcᵀ = 0.

This comes from mG = c after multiplying with Hᵀ.

Definition If H is a generator matrix of C⊥, then H is called a parity check matrix of C.

• Likewise, if G is the generator matrix of C, then it is a parity check matrix for C⊥.
• There is an easy transformation in standard form:

G = [Ik A] ⇔ H = [−Aᵀ I_{n−k}]


Parity check matrix II

• We can specify C given H (in standard form); there is no need to perform H → G → c = mG!
• The encoding of m = (m1 m2 . . . mk) (in standard form) is the mapping of m to

c = (m1 m2 . . . mk x1 x2 . . . x_{n−k})

• The x_i are called check symbols - they provide the redundancy to detect and correct errors!
• Given m and H, the check symbols are determined through

Hcᵀ = 0


Parity checks - example

Let C be a (6, 3) code with the parity check matrix

    [ 1 0 1 1 0 0 ]
H = [ 1 1 0 0 1 0 ]
    [ 0 1 1 0 0 1 ]

• Which codeword encodes the message m = (101)?

– It depends on the basis of C! If we prefer standard form (G = [Ik A]) then c = (101 x1 x2 x3).

• Using Hcᵀ = 0 gives

1 + 1 + x1 = 0 → x1 = 0
1 + x2 = 0 → x2 = 1   ⇒ c = (101011)
1 + x3 = 0 → x3 = 1

Parity checks - example

• It is easy to determine general equations for the x_i:

m1 + m3 = x1
m1 + m2 = x2
m2 + m3 = x3

– Another way of computing the codewords is to use H = [−Aᵀ I3] to recover

             [ 1 0 0 1 1 0 ]
G = [I3 A] = [ 0 1 0 0 1 1 ]
             [ 0 0 1 1 0 1 ]

and c = mG = (101011).
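
The same computation as a short sketch, assuming Python with numpy (H is the matrix above):

    import numpy as np

    H = np.array([[1, 0, 1, 1, 0, 0],
                  [1, 1, 0, 0, 1, 0],
                  [0, 1, 1, 0, 0, 1]])
    m = np.array([1, 0, 1])
    # with H = [-A^T I_3], row i of H states (A^T m)_i + x_i = 0,
    # so over GF(2) the check symbols are x = A^T m (the first 3 columns of H)
    x = H[:, :3] @ m % 2
    c = np.concatenate([m, x])
    print(c)          # [1 0 1 0 1 1]
    print(H @ c % 2)  # [0 0 0], i.e. H c^T = 0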


Properties of parity check matrix

Theorem Let C be an (n, k) code over F. Every set of s − 1 columns of H is linearly independent iff w(C) ≥ s.

Proof (⇒) Denote H = [h1, h2, . . . , hn], with h_i the columns of H.

– Assume any s − 1 columns of H are linearly independent. For any codeword c,
$$Hc^T = [h_1, h_2, \ldots, h_n]\, c^T = \sum_{i=1}^{n} c_i h_i = 0.$$

– If wt(c) ≤ s − 1, this is a dependency among at most s − 1 columns: contradiction. Thus wt(c) ≥ s, and since c is arbitrary, w(C) ≥ s.

Properties of parity check matrix II


Proof (cont.) (⇐) Assume w(C) ≥ s and that some set of t ≤ s − 1 columns of H is linearly dependent:
$$\exists \lambda_{i_j} \in F : \sum_{j=1}^{t} \lambda_{i_j} h_{i_j} = 0.$$

– Construct c s.t.
$$c_{i_j} = \begin{cases} \lambda_{i_j}, & 1 \le j \le t \\ 0, & \text{otherwise} \end{cases}$$

• This is a legal codeword, as Hcᵀ = 0, but w(c) ≤ t ≤ s − 1: contradiction. REMARK: We can compute the distance of the code in this way!


Shannon - proof

OPTIONAL - FOR
INTERESTED STUDENTS

Shannon’s theorem - some estimates

Let w := the number of errors in the received word, and
$$b := \big(np(1 - p)/(\varepsilon/2)\big)^{1/2}.$$
Then,
$$P(w > np + b) \le \tfrac{1}{2}\varepsilon \quad \text{(Chebyshev's inequality)}$$
– Since $p < 1/2$, $\rho := \lfloor np + b \rfloor < n/2$ for large n.

– If $B_\rho(x) = \{y : d(x, y) \le \rho\}$ is the sphere of radius $\rho$, then
$$|B_\rho(x)| = \sum_{i \le \rho} \binom{n}{i} < \frac{n}{2} \binom{n}{\rho} \le \frac{n}{2} \cdot \frac{n^n}{\rho^\rho (n - \rho)^{n-\rho}}.$$

We need some more estimates :)


Shannon’s theorem - some estimates II

$$\frac{\rho}{n} \log \frac{\rho}{n} = p \log p + O(n^{-1/2}),$$
$$\Big(1 - \frac{\rho}{n}\Big) \log \Big(1 - \frac{\rho}{n}\Big) = q \log q + O(n^{-1/2}) \quad (n \to \infty),$$
where q = 1 − p.
• Finally, we need two functions. If $u, v, y \in \{0,1\}^n$ and $x_i \in C$, then
$$f(u, v) = \begin{cases} 0, & \text{if } d(u, v) > \rho \\ 1, & \text{if } d(u, v) \le \rho \end{cases}$$
$$g_i(y) = 1 - f(y, x_i) + \sum_{j \ne i} f(y, x_j).$$

FACT: If $x_i$ is the unique codeword s.t. $d(x_i, y) \le \rho$, then $g_i(y) = 0$, and $g_i(y) \ge 1$ otherwise.

Shannon’s theorem - proof

Proof: We pick the codewords $x_1, x_2, \ldots, x_M$ at random.

Decoding: If $x_i$ is the only codeword s.t. $d(x_i, y) \le \rho$, then decode y as $x_i$; otherwise decode as, say, $x_1$.

• Express $P_i$ using $g_i$:
$$P_i = \sum_{y \in \{0,1\}^n} P(y|x_i)\, g_i(y) \quad (x_i \text{ fixed})$$
$$= \underbrace{\sum_{y \in \{0,1\}^n} P(y|x_i) \{1 - f(y, x_i)\}}_{Pb(y \notin B_\rho(x_i))} + \sum_{y} \sum_{j \ne i} P(y|x_i) f(y, x_j).$$

Using $P(w > np + b) = P(w > \rho) \le \tfrac{1}{2}\varepsilon$ we get (next page)


Shannon’s theorem - proof II

$$P_C \le \tfrac{1}{2}\varepsilon + M^{-1} \sum_{y} \sum_{i=1}^{M} \sum_{j \ne i} P(y|x_i) f(y, x_j)$$

• Now we use the fact that $P^*(M, n, p) \le E(P_C)$, where $E(P_C)$ is the expected value over all possible codes C. Hence,
$$P^*(M, n, p) \le \tfrac{1}{2}\varepsilon + M^{-1} \sum_{y} \sum_{i=1}^{M} \sum_{j \ne i} E(P(y|x_i))\, E(f(y, x_j))$$
$$= \tfrac{1}{2}\varepsilon + M^{-1} \sum_{y} \sum_{i=1}^{M} \sum_{j \ne i} E(P(y|x_i)) \cdot \frac{|B_\rho|}{2^n}$$
$$= \tfrac{1}{2}\varepsilon + (M - 1)\, 2^{-n} |B_\rho|.$$

Shannon’s theorem - proof III

Finally, we take logs, apply our estimates and divide by n to get
$$n^{-1} \log\big(P^*(M, n, p) - \tfrac{1}{2}\varepsilon\big) \le \underbrace{n^{-1} \log M - (1 + p \log p + q \log q)}_{R - (1 - H(p)) < 0} + O(n^{-1/2}).$$

This leads to
$$n^{-1} \log\big(P^*(M_n, n, p) - \tfrac{1}{2}\varepsilon\big) < -\beta < 0$$
for $n \ge n_0$, i.e. $P^*(M_n, n, p) < \tfrac{1}{2}\varepsilon + 2^{-\beta n}$.
Chapter 4

Decoding of linear codes and MacWilliams identity

Contents of the chapter:


• Reminder
• Hamming

• Group theory
• Standard array
• Weight distribution
• MacWilliams identity


Linear codes - repetition

• A linear (n, k) code is a linear subspace of V_n(A) of dimension k.

• It is specified by the generator matrix G (alternatively by the parity check matrix H), with
$$GH^T = 0.$$
• This comes easily from Hcᵀ = 0 for any codeword c ∈ C.

• G = [Ik A] in standard form is particularly suitable.


Standard form - repetition

• We could not always find G in standard form by elementary row operations!
• Example:

    [ 0 0 1 1 1 ]        [ 0 0 1 1 1 ]
G = [ 0 1 1 0 1 ]   G′ = [ 0 1 1 0 1 ]
    [ 0 1 1 1 0 ]        [ 0 0 0 1 0 ]

Solution: Find an equivalent code - permutation of columns is allowed.


Main result - reminder

Theorem Let C be an (n, k) code over F. Every set of s − 1 columns of H is linearly independent iff w(C) ≥ s.

• A special case is s = 3 - no 2 columns of H are linearly dependent.


Constructing single-error correcting code

We need a code with w(C) = 3.

• The previous result states that we need H s.t. no 2 or fewer columns of H are linearly dependent!
• SOLUTION: Use no all-zero column, and no column that is a scalar multiple of another column.

– The construction procedure is:

• Find H with no linear dependency between any two columns (easy).
• For an explicit definition of C we need a generator matrix G, i.e. H → G.

– A special case is when the code alphabet is binary!


Example of single-error correcting code


Example We want to construct a single-error correcting (7, 4) code.
• Just ensure there are no repeated (and no zero) columns in H.
• Since G is a 4 × 7 matrix, H is a 3 × 7 binary matrix.
• There is only one option (up to permutation of columns),

    [ 1 0 0 1 0 1 1 ]
H = [ 0 1 0 1 1 0 1 ]
    [ 0 0 1 0 1 1 1 ]

• Any other ordering of the columns gives an equivalent code.

– Can we construct a 2-error correcting code in this way? YES, but with a more complicated procedure (see the textbook).

What about a (7, 5, 3) code?


Hamming codes

• Single-error correcting codes; easy coding and decoding.

Definition A Hamming code of order r over GF(q) is

– an (n, k) code,
– with length n = (q^r − 1)/(q − 1),
– dimension k = n − r,
– and a parity check matrix H_r of size r × n s.t. no 2 columns are linearly dependent.

• These are codes of min. distance 3 whose codewords have maximum length, i.e. we cannot increase the number of columns of H! Due to n = (q^r − 1)/(q − 1), no more columns can be added to H.


Binary Hamming codes

• Specified by a single parameter r.

Definition A binary Hamming code of order r is

• an (n, n − r) code,
• with length n = 2^r − 1,
• dimension k = n − r,
• a parity check matrix H_r of size r × n s.t. all columns are distinct and nonzero,
• and d = 3.

Setting r = 3 we get a (7,4,3) Hamming code.


Perfect codes

Binary Hamming codes are examples of perfect codes: (7, 4, 3), (15, 11, 3) . . .

Definition A perfect code is an e-error-correcting [n, M] code over A such that every n-tuple is in the sphere of radius e of some codeword.
Example Consider the vector space V7(2) - the set of binary vectors of length 7.
– There are 2^7 = 128 vectors.
– Each sphere of radius 1 contains 7 + 1 = 8 vectors.
– 16 spheres cover the whole space: 16 × 8 = 128.
– The dimension of the code is k = 4, i.e. the Hamming (7,4,3) code.


Perfect codes II

• The spheres are not only disjoint but exhaust the whole space Aⁿ!

To see that Hamming codes are perfect, observe:

– d(C) = 3, thus e = 1; each sphere contains 1 + n(q − 1) vectors;

– the number of spheres is q^k = q^{n−r} (the number of codewords);

– so the spheres contain
$$[1 + n(q - 1)]\, q^{n-r} = [1 + (q^r - 1)]\, q^{n-r} = q^n$$
vectors in total.


Decoding single-error correcting codes

• We need the concept of an error vector e,
$$r = c + e,$$
where r is the received word.

• If H is a parity check matrix of C and r is received, then
$$Hr^T = H(c + e)^T = \underbrace{Hc^T}_{=0} + He^T = He^T.$$

• We can easily deal with the cases wt(e) ≤ 1:

– If e = 0 then Heᵀ = 0, and we accept r as the transmitted codeword.
– If wt(e) = 1, say e_i = α ≠ 0, then Heᵀ = α h_i.

Decoding procedure (single-error)

Let H be the parity check matrix and r the received vector.

1. Compute Hrᵀ.

2. If Hrᵀ = 0, accept r as the transmitted codeword.

3. If Hrᵀ = sᵀ ≠ 0, compare sᵀ with the columns of H.

4. If sᵀ = α h_i for some 1 ≤ i ≤ n, then e = (0, 0, . . . , α, 0, . . . , 0) with α in position i; correct r to c = r − e.


Decoding - example

Let again

    [ 1 0 0 1 0 1 1 ]
H = [ 0 1 0 1 1 0 1 ]
    [ 0 0 1 0 1 1 1 ]

Is c = (1111111) a valid codeword? Yes: every row of H has even weight, so Hcᵀ = 0.

Assume c = (1111111) is sent and r = (0111111) is received.

Decode by computing Hrᵀ = sᵀ = (100)ᵀ - the sum of the last 6 columns of H. This equals the first column h1, so the error is in position 1.

Correct r ← r + (1000000).
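
The whole procedure for this example is a few lines; a sketch assuming Python with numpy:

    import numpy as np

    H = np.array([[1, 0, 0, 1, 0, 1, 1],
                  [0, 1, 0, 1, 1, 0, 1],
                  [0, 0, 1, 0, 1, 1, 1]])
    r = np.array([0, 1, 1, 1, 1, 1, 1])  # (1111111) with the first bit flipped
    s = H @ r % 2                        # syndrome (1, 0, 0)
    if s.any():
        # in the binary case the syndrome equals the column of H at the error position
        i = next(j for j in range(7) if (H[:, j] == s).all())
        r[i] ^= 1                        # flip the erroneous bit
    print(r)                             # [1 1 1 1 1 1 1]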


Reminder on group theory

• Group is a set G together with an operation “◦” satisfying:

1. ∀a, b ∈ G : a ◦ b ∈ G Algebraic closure

2. ∀a, b, c ∈ G : a ◦ (b ◦ c) = (a ◦ b) ◦ c Associativity

3. ∃!e ∈ G : ∀a ∈ G : a ◦ e = e ◦ a = a e is identity element

4. ∀a ∈ G , ∃a−1 ∈ G : a ◦ a−1 = a−1 ◦ a = e Inverse element

• (G , ◦) is called Abelian if for all a, b ∈ G , a ◦ b = b ◦ a


Example of Groups

• (Z, +) is a group under usual integer addition. We check:

∀a ∈ Z, a + 0 = a; a + (−a) = 0

• (Z, ·) is not a group, as

3⁻¹ = ? i.e. 3 · x = 1 has no solution in Z

• Z*_p = Z_p \ {0} = {1, 2, . . . , p − 1} is a group under multiplication (mod p) iff p is prime.
• For example, (Z*₅, · (mod 5)) is a group since

1⁻¹ = 1; 2⁻¹ = 3; 3⁻¹ = 2; 4⁻¹ = 4


Structure of Groups

• A group G is cyclic if there exists a generator a of the group s.t.
$$\forall g \in G, \exists i > 0 : g = a^i = \underbrace{a \circ a \circ \cdots \circ a}_{i \text{ times}}.$$

• 2 is a generator of (Z*₅, · (mod 5)) since

2⁰ = 1; 2¹ = 2; 2² = 4; 2³ = 3 (mod 5)

• On the other hand, 4 is not a generator, as

4⁰ = 1; 4¹ = 4; 4² = 1 (mod 5)


Reminder on group theory II


We need the concepts of a subgroup, cosets, and the Lagrange theorem.
• Let G be a group and H ⊂ G. H is called a subgroup of G if H is itself a group.

Definition Let H be a subgroup of G. The subset

a ◦ H = {a ◦ h | h ∈ H}

is called the left coset of H containing a.

Theorem [Lagrange] For a subgroup H of G we have #H | #G.

Proof Show that for a ≠ a′ with a ∉ a′ ◦ H we have (a ◦ H) ∩ (a′ ◦ H) = ∅, and that #(a ◦ H) = #H.


Splitting the group into cosets


A group can be viewed as a union of cosets.

Example Let G = {(00), (10), (01), (11)} be a group with the group operation vector addition mod 2.

Let H = {(00), (10)} ⊂ G. The cosets of H are

H + (00) = H,   H + (01) = {(01), (11)} = H + (11).

Thus G = H ∪ (H + (01)).

• The idea of standard array decoding is to think of C as a subgroup of order q^k in the group V_n(F).
• Splitting V_n(F) into cosets gives a convenient way of decoding any linear code.


Standard array decoding

Notation (vector addition): 0 is the neutral element, the inverse of a is −a.

• A code C of size q^k and length n has t = q^n / q^k = q^{n−k} cosets.
• These cosets are denoted C₀, C₁, . . . , C_{t−1}, where C₀ = C.
• For each C_i let l_i (a coset leader), 0 ≤ i ≤ t − 1, be a vector of minimum weight in C_i.

IDEA: Construct a q^{n−k} × q^k array S where s_{i+1,j+1} = l_i + c_j.

The entries in row i + 1 are the elements of C_i, and the first column contains the coset leaders.


Standard array - example


For the binary (5, 2) code with generator matrix

G = [ 1 0 1 0 1 ]
    [ 0 1 1 1 0 ]

the standard array is given by (coset leaders in the first column, codewords in the first row):

00000 10101 01110 11011
00001 10100 01111 11010
00010 10111 01100 11001
00100 10001 01010 11111
01000 11101 00110 10011
10000 00101 11110 01011
11000 01101 10110 00011
10010 00111 11100 01001
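
The construction is mechanical; a sketch assuming Python (ties between equal-weight leader candidates are broken arbitrarily, so the weight-2 leaders may differ from the table above):

    from itertools import product

    C = ["00000", "10101", "01110", "11011"]
    add = lambda x, y: "".join(str((int(a) + int(b)) % 2) for a, b in zip(x, y))

    rows, seen = [], set()
    # pick a minimum-weight vector not yet covered as the next coset leader
    for v in sorted(product("01", repeat=5), key=lambda v: v.count("1")):
        leader = "".join(v)
        if leader in seen:
            continue
        row = [add(leader, c) for c in C]
        rows.append(row)
        seen.update(row)
    for row in rows:
        print(" ".join(row))  # 8 rows; the first column holds the coset leaders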


Properties of standard array decoding


What about the Maximum Likelihood Decoding (nearest neighbour) strategy?
• Standard array decoding is in accordance with MLD, as
$$d(l_i + c_j, c_j) \le d(l_i + c_j, c_h) \quad \forall c_h.$$
– This means that if r = l_i + c_j is received, then c_j is at least as close to r as any other codeword (see Lemma 3.8).
• Two cases to consider:
1. l_i is the unique vector of least weight in C_i - then c_j is closest to l_i + c_j. OK.
2. l_i is not unique (more than one vector of least weight); still c_j is at least as close to r as any other c_h.


Properties of standard array decoding II

Theorem Let C be a code with w(C) = d. If x is such that
$$w(x) \le \Big\lfloor \frac{d - 1}{2} \Big\rfloor$$
then x is the unique element of minimum weight in its coset and thus a coset leader.

Proof Suppose $w(x) \le \lfloor \frac{d-1}{2} \rfloor$ and there is a y in the same coset with $w(y) \le w(x)$. Since $x - y \in C$ (they are in the same coset) we have
$$w(x - y) \le w(x) + w(y) \le \Big\lfloor \frac{d-1}{2} \Big\rfloor + \Big\lfloor \frac{d-1}{2} \Big\rfloor \le d - 1.$$
This contradicts the fact that w(C) = d, unless x = y.


Standard array decoding - algorithm

Standard array decoding for linear codes

Precomputation: Construct a standard array S.

Let r be a received vector.

1. Find r in the standard array S.
2. Correct r to the codeword at the top of its column.

S will correct any e or fewer errors, and also error patterns of weight e + 1 if the pattern appears as a coset leader.


Standard array decoding - example

Assume in the previous example of a (5, 2) code that r = (10111).

00000 10101 01110 11011
00001 10100 01111 11010
00010 10111 01100 11001
00100 10001 01010 11111
01000 11101 00110 10011
10000 00101 11110 01011
11000 01101 10110 00011
10010 00111 11100 01001

r = (10111) lies in the row of the coset leader (00010), in the column of the codeword (10101), so r is decoded as (10101).


Syndrome decoding

There are a few problems with standard array decoding:

• storing the standard array (e.g. q = 2, n = 40)
• locating the received vector in the table (we cannot sort it)

A more efficient approach is called syndrome decoding.

• The syndrome of x is computed as Hxᵀ. Why do we do that?
• It turns out that all the elements in the same coset of C have the same syndrome!

Theorem Two vectors x, y are in the same coset of C if and only if they have the same syndrome, i.e. Hxᵀ = Hyᵀ.


Syndrome decoding - algorithm


Proof (sketch) x, y ∈ C_k ⇒ x = l_k + c_i, y = l_k + c_j. Then
$$Hx^T = H(l_k + c_i)^T = Hl_k^T = H(l_k + c_j)^T = Hy^T.$$

• The main idea is to establish a 1-1 correspondence between coset leaders and syndromes.

Syndrome decoding for linear codes

Precomputation: a 1-1 correspondence between coset leaders and syndromes.
Let r be a received vector and H the parity check matrix.

1. Compute the syndrome s = Hrᵀ of r.
2. Find the coset leader l associated with s.
3. Correct r to r − l.


Syndrome decoding - example


We follow the same example, our (5, 2) code C, with r = (10111) and

G = [ 1 0 1 0 1 ]        [ 1 1 1 0 0 ]
    [ 0 1 1 1 0 ]    H = [ 0 1 0 1 0 ] ;   s = Hrᵀ = (010)ᵀ
                         [ 1 0 0 0 1 ]

coset leaders                  syndrome
00000 10101 01110 11011        000
00001 10100 01111 11010        001
00010 10111 01100 11001        010
00100 10001 01010 11111        100
01000 11101 00110 10011        110
10000 00101 11110 01011        101
11000 01101 10110 00011        011
10010 00111 11100 01001        111

The syndrome 010 points to the coset leader (00010), so r is corrected to r − (00010) = (10101). The codeword columns of the array are not needed - only the leaders and syndromes!
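
A sketch of the whole procedure, assuming Python with numpy (the table is built by scanning vectors in order of increasing weight, so minimum-weight leaders are found first):

    import numpy as np
    from itertools import product

    H = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 0, 1, 0],
                  [1, 0, 0, 0, 1]])

    # precompute: map each syndrome to a minimum-weight coset leader
    table = {}
    for v in sorted(product([0, 1], repeat=5), key=sum):
        s = tuple(H @ np.array(v) % 2)
        table.setdefault(s, np.array(v))

    r = np.array([1, 0, 1, 1, 1])
    leader = table[tuple(H @ r % 2)]  # syndrome 010 -> leader 00010
    print((r + leader) % 2)           # [1 0 1 0 1], the corrected codeword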

Standard array vs. syndrome decoding

• Suppose C is a binary (70, 50) code; then |C| = 2^50 codewords.
• The number of cosets is 2^70 / 2^50 = 2^20.

– Comparing the two strategies:

                   Standard array          Syndrome
Storage            2^70 vectors            2^20 entries of (70 + 20) bits
Dec. computation   search 2^70 entries     search 2^20 entries

• But we can further improve the decoding storage.

• Only keep the correspondence between syndromes and the weights of the coset leaders!

27 / 44

Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

Step-by-step decoding

For our previous example we would have,

Syndrome   Weight of coset leaders
000        0
001        1
010        1
100        1
011        2
101        1
110        1
111        2

The algorithm processes r by flipping one bit at a time, and checks
whether the flip moves the vector into a coset with a lighter coset
leader.

28 / 44
Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

Step-by-step decoding
Step-by-step decoding for linear codes II

Precomputation: Set up a 1-1 correspondence between
syndromes and weights of coset leaders
Let r be a received vector and H the parity check

1. Set i = 1
2. Compute HrT and the weight w of corresponding coset
leader
3. If w = 0, stop with r as the transmitted codeword
4. If H(r + ei )T has smaller associated weight than HrT ,
set r = r + ei .
5. Set i = i + 1 and go to 2.

Read example 27 in the textbook for further understanding how


the algorithm works.
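A minimal sketch (mine) of the step-by-step decoder for the same
(5, 2) code, reusing the syndrome-to-weight table above:

import numpy as np

H = np.array([[1, 1, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [1, 0, 0, 0, 1]])

# Precomputed: syndrome -> weight of the corresponding coset leader.
leader_weight = {(0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 1, (1, 0, 0): 1,
                 (0, 1, 1): 2, (1, 0, 1): 1, (1, 1, 0): 1, (1, 1, 1): 2}

def step_by_step_decode(r):
    r = r.copy()
    i = 0
    while True:
        w = leader_weight[tuple(H.dot(r) % 2)]
        if w == 0:                       # r is a codeword - stop
            return r
        e = np.zeros(len(r), dtype=int)
        e[i] = 1
        if leader_weight[tuple(H.dot((r + e) % 2) % 2)] < w:
            r = (r + e) % 2              # flipping bit i gave a lighter coset
        i += 1

print(step_by_step_decode(np.array([1, 0, 1, 1, 1])))   # -> [1 0 1 0 1]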
29 / 44

Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

Weight distribution - motivation

• Weight distribution gives a detailed description of the number


of codewords of certain weight in a code.
• For a (non)linear [n, M] code C let

Ai = #{c : w (c) = i, c ∈ C }

• Vector (A0 , A1 , . . . , An ) is called the weight distribution of C .

Two main reasons for studying the weight distribution:

– For determining the probability of incorrectly decoded received


vectors
– For deriving the MacWilliams identity

30 / 44
Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

q-ary symmetric channel

Assumption is that any symbol from A has the same probability of


being transformed into another symbol.

[Figure: transition diagrams of the binary symmetric channel and of
the q-ary symmetric channel. Each symbol is received correctly with
probability 1 − p and is changed into any particular one of the other
q − 1 symbols with probability p/(q − 1).]

31 / 44

Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

The probability of error


Assumption: C is an (n, k)-code over F = GF (q) and the zero
codeword is sent.

The probability that some (specified) codeword of weight i is
received is,

  (p/(q − 1))^i (1 − p)^{n−i} ,    0 ≤ i ≤ n

• Of interest is to compute the probability that an error goes
  undetected (one codeword goes into another codeword),

  Σ_{i=1}^{n} Ai (p/(q − 1))^i (1 − p)^{n−i}

NOTE: Correct the formula in the textbook (there the summation
goes from i = 0).
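As a quick numerical illustration (mine, not from the textbook), the
undetected-error probability for the (6, 3) code with weight
distribution (1, 0, 0, 4, 3, 0, 0) that appears later in this chapter:

def p_undetected(A, p, q=2):
    """Sum over i >= 1 of A_i (p/(q-1))^i (1-p)^(n-i)."""
    n = len(A) - 1                       # A = (A_0, ..., A_n)
    return sum(A[i] * (p / (q - 1))**i * (1 - p)**(n - i)
               for i in range(1, n + 1))

A = (1, 0, 0, 4, 3, 0, 0)
print(p_undetected(A, p=0.01))           # ~ 3.9e-06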
32 / 44
Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

The probability of error II


• If C has distance d = 2t + 1 and incomplete decoding (only
  decode if d(r, c) ≤ t) is used then

  Pb(correct decoding) = Σ_{i=0}^{t} C(n, i) p^i (1 − p)^{n−i}

  where C(n, i) denotes the binomial coefficient.

– What is the probability if both correction and detection are
  used ?

• Define N(i, h, s) as follows:

  – If there are no codewords of weight i then N(i, h, s) = 0; otherwise

    N(i, h, s) = #{x : w (x) = h & d(x, c) = s} for a fixed c : w (c) = i.

  – N(i, h, s) is independent of the given codeword of weight i
    (exercise 98)
33 / 44

Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

Error probability and codeword spheres

[Figure: spheres of radius e = 1 around the codewords of a code with
d = 2e + 1 = 3, drawn for the codewords (00000) and (10111). Black and
blue points are correctly decoded, red points are incorrectly decoded.
For c = (00001): N(1, 2, 1) = 4, realized by {(00011), (00101), (01001),
(10001)}, while N(1, 2, 2) = 0.]
34 / 44
Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

The probability of decoding error


The number of vectors of weight h at distance s of the codewords
of weight i is
Ai N(i, h, s)

• For a received vector to be improperly decoded it must lie in the
  radius-t sphere of a codeword other than the one that was sent.
another codeword of radius t other than that which was sent.
• The probability of receiving a particular vector of weight h is,

  (p/(q − 1))^h (1 − p)^{n−h}

• What does the following expression then relate to ?

  Ai N(i, h, s) (p/(q − 1))^h (1 − p)^{n−h}

35 / 44

Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

The probability of decoding error II

So if the zero codeword is sent, the probability of decoding it as some
codeword of weight i is,

  Σ_{h=0}^{n} Σ_{s=0}^{t} Ai N(i, h, s) (p/(q − 1))^h (1 − p)^{n−h}

• If i ≥ 1 then a decoding error has occurred. Thus the
  probability of a decoding error is,

  Σ_{i=1}^{n} Σ_{h=0}^{n} Σ_{s=0}^{t} Ai N(i, h, s) (p/(q − 1))^h (1 − p)^{n−h}

• Again to compute this probability - need weight distribution !

36 / 44
Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

Weight enumerators

Small codes - list the codewords and find the weight distribution. E.g.

  G = [ 1 1 0 0 ]
      [ 1 1 1 1 ]

Then C = {0000, 1100, 0011, 1111}, thus A0 = 1, A2 = 2, A4 = 1.

For linear codes we can find out the weight distribution of a


code given the weight distribution of its dual (or vice versa)

Definition Let C be an (n, k)-code over F with weight distribution


(A0 , A1 , . . . , An ). The weight enumerator of C is defined as,
  WC (x, y ) = Σ_{i=0}^{n} Ai x^{n−i} y^i .
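For small codes the enumeration is easy to script; a brute-force
sketch (mine), assuming a binary code given by its generator matrix:

import itertools

import numpy as np

def weight_distribution(G):
    """(A_0, ..., A_n) for the binary code spanned by the rows of G."""
    k, n = G.shape
    A = [0] * (n + 1)
    for m in itertools.product([0, 1], repeat=k):
        c = np.array(m).dot(G) % 2
        A[int(c.sum())] += 1
    return A

G = np.array([[1, 1, 0, 0],
              [1, 1, 1, 1]])
print(weight_distribution(G))   # -> [1, 0, 2, 0, 1], i.e. A0=1, A2=2, A4=1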

37 / 44

Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

Weight enumerators II

• For each u ∈ Vn (F ) we define P(u) = x^{n−w(u)} y^{w(u)} . Then,

  Σ_{u∈C} P(u) = Σ_{i=0}^{n} Ai x^{n−i} y^i = WC (x, y )

Example For C = {0000, 1100, 0011, 1111} we can compute

  P(0000) = x^4 ; P(0011) = x^2 y^2 ; P(1100) = x^2 y^2 ; P(1111) = y^4

• This formalism is proved useful for deriving MacWilliams


identity
• Identity is valid for any linear code and if e.g. dual code of C
is of small dimension we get its weight distribution and then
obtain weight distribution of C
38 / 44
Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

MacWilliams identity - preparation (optional)

Only consider q = 2. Easily generalized to A = GF (p^k ).

• Define a function,

  gn (u) = Σ_{v∈Vn} (−1)^{u·v} P(v),    u, v ∈ Vn (GF (2))

Lemma 3.11 If C is a binary (n, k)-code then

  Σ_{u∈C ⊥} P(u) = (1/|C |) Σ_{u∈C} gn (u).

Proof (sketch) Write

  Σ_{u∈C} gn (u) = Σ_{u∈C} Σ_{v∈Vn} (−1)^{u·v} P(v) = Σ_{v∈Vn} P(v) Σ_{u∈C} (−1)^{u·v}

39 / 44

Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

MacWilliams identity - preparation II (optional)

Proof (cont.) Easy to verify that,

  Σ_{u∈C} (−1)^{u·v} = |C |  if v ∈ C ⊥ ,   and = 0  if v ∉ C ⊥

Therefore,

  Σ_{u∈C} gn (u) = |C | Σ_{v∈C ⊥} P(v).

The following result is also needed (Lemma 3.12 in the textbook),

  gn (u) = (x + y )^{n−w(u)} (x − y )^{w(u)} .

Proved by induction on n !

40 / 44
Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

MacWilliams identity

Theorem If C is a binary (n, k) code with dual C ⊥ then,

  WC ⊥ (x, y ) = (1/2^k ) WC (x + y , x − y ).

Proof Let the weight distribution of C be (A0 , A1 , . . . , An ). Then,

  Σ_{u∈C ⊥} P(u) = (1/|C |) Σ_{u∈C} gn (u)                              (Lemma 3.11)
                 = (1/|C |) Σ_{u∈C} (x + y )^{n−w(u)} (x − y )^{w(u)}   (Lemma 3.12)
                 = (1/|C |) Σ_{i=0}^{n} Ai (x + y )^{n−i} (x − y )^i
                 = (1/|C |) WC (x + y , x − y )

41 / 44

Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

MacWilliams identity - example

Assume given is a (6, 3) binary code C with (Ex. 10)

      [ 1 0 0 1 1 0 ]
  G = [ 0 1 0 0 1 1 ]
      [ 0 0 1 1 0 1 ]

The weight distribution of C is (1, 0, 0, 4, 3, 0, 0). What is the
weight distribution of C ⊥ ?

  WC (x + y , x − y ) = (x + y )^6 + 4(x + y )^3 (x − y )^3 + 3(x + y )^2 (x − y )^4
                      = . . . = 8x^6 + 32x^3 y^3 + 24x^2 y^4

Then, by the MacWilliams identity,

  WC ⊥ (x, y ) = (1/8) WC (x + y , x − y ) = x^6 + 4x^3 y^3 + 3x^2 y^4 = WC (x, y )
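One can sanity-check the identity numerically; a sketch (mine) using
sympy for the polynomial algebra:

import itertools

import numpy as np
import sympy as sp

x, y = sp.symbols('x y')

def weight_enumerator(G):
    """W_C(x, y) = sum over codewords c of x^(n - w(c)) y^w(c)."""
    k, n = G.shape
    W = 0
    for m in itertools.product([0, 1], repeat=k):
        w = int((np.array(m).dot(G) % 2).sum())
        W += x**(n - w) * y**w
    return sp.expand(W)

G = np.array([[1, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 1],
              [0, 0, 1, 1, 0, 1]])
W_C = weight_enumerator(G)
W_dual = sp.expand(W_C.subs([(x, x + y), (y, x - y)],
                            simultaneous=True) / 2**3)
print(W_C)      # x**6 + 4*x**3*y**3 + 3*x**2*y**4
print(W_dual)   # the same polynomial, as computed above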
42 / 44
Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

Computing the weight distribution

Assume we have a linear (70, 50) code C .


• Cannot compute the probability of incorrect decoding - need
the weight distribution of C .
• But the dual code is a (70, 20) code and from
G → H → C ⊥ = span(H) we can compute the weight
distribution of C ⊥ .
• MacWilliams gives us the weight distribution of C .

• The main question is how to construct good linear codes


(apart from Hamming codes)
• E.g. the code used in Mariner was a Reed-Muller (32, 6) code
of min. distance 16 !

43 / 44

Reminder Hamming Group theory Standard array Weight distribution MacWilliams identity

Conclusions

• Many nice algebraic properties for linear codes (not always the
case for nonlinear codes)

• Connection to dual code

• General decoding strategies: standard array and syndrome


decoding

• Further decoding optimizations possible

44 / 44
Chapter 5

Coding theory -
Constructing New Codes

Contents of the chapter:


• Constructing new codes
• Basic methods for constructions

• Some bounds
• Other construction methods
• Elias codes

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Hamming codes and perfect codes - reminder

• Introduced Hamming codes as example of perfect codes

• Perfect codes : Spheres around codewords exhaust the whole


space

• Hamming (binary) code has parameters

(n = 2r − 1, 2r − 1 − r , 3) r > 3

1 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Syndrome decoding - reminder

 
  G = [ 1 0 1 0 1 ]
      [ 0 1 1 1 0 ]

coset leaders                          syndrome
00000 10101 01110 11011                000
00001 10100 01111 11010                001
00010 10111 01100 11001                010
00100 10001 01010 11111                100
01000 11101 00110 10011                110
10000 00101 11110 01011                101
11000 01101 10110 00011                011
10010 00111 11100 01001                111

Array not needed !

2 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

MacWilliams identity-reminder

Theorem
If C is a binary (n, k) code with dual C ⊥ then,

1
WC ⊥ (x, y ) = WC (x + y , x − y ).
2k

n
X
WC (x, y ) = Ai x n−i y i .
i=0

Ai weight distribution - number of codewords of weight i.

3 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Introduction

• So far we looked at Hamming codes

• These codes are only defined for some specific lengths, have
certain minimum distance and dimension.

Can we get other codes out of the known ones ?

YES, by using the techniques of puncturing, extending,


taking crossover sections ...

4 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Detecting more errors

• Assume we want to detect 3 errors

• The Hamming (7,4,3) code cannot be used - it can only detect 2
  errors

• Can we construct a new code from (7,4,3) code that detects 3


errors ?

• YES, slightly worse rate 4/8 instead of 4/7, but possible.

5 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Simple extension - example

Take the Hamming (7,4,3) code: 0 ∈ F_2^7, the 7 cyclic shifts of
(1101000), and their complements,

  8 words:        0000000, 1101000, 0110100, . . . , 1010001
  8 complements:  1111111, 0010111, 1001011, . . . , 0101110

Add to these codewords one coordinate (extending) as,

  c_{i,8} = ⊕_{j=1}^{7} c_{i,j}

E.g. (1101000) → (11010001); we get an (8,4) code H̄
6 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Extending codes
Definition
If C is a code of length n over Fq the extended code C̄ is defined
by,

  C̄ := {(c1 , . . . , cn , cn+1 ) | (c1 , . . . , cn ) ∈ C , Σ_{i=1}^{n+1} ci = 0}

• Note that the extended code is linear if C is linear (exercise)

• From the Hamming (7,4,3) code we get an (8,4,4) code, i.e.


n + 1 ← n and d + 1 ← d ! Always possible ?

• How is C̄ specified in terms of its generator and parity check
  matrices ?

7 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Minimum distance of extended codes


– Note that in case of (7,4,3) code we are forced to have both
even and odd weight codewords :

• If only even weights occur then d(C ) ≥ 4 for a (7,4,3) code C .

• We cannot have only odd weight codewords as adding 2


codewords of odd weight gives a codeword of even weight
(exercise)

• Finally note that for odd weight the parity (extended bit) is 1
- all together we get an (8,4,4) code.

Question : Why can't we proceed and get an (n + i, k, d + i) code ?


8 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Another view of the problem


Assume we can do that : What is the consequence on relative
distance,
d
δ= .
n

We would have,

d +i
δ= →1 i → ∞.
n+i
Clearly not possible for arbitrary k and n.

9 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Some intuition

We “know” there is no (9,4,5) code (at least cannot extend (8,4,4)


to get this one)

• Maybe the space was too small.

• But then we should be able to find 16 codewords if the length is
  n = 10, i.e. a (10, 4, 5) code

• Seems logical, doesn't it ?


• Visit http://www.codetables.de/ to find out the answer.

10 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Decomposing generator matrix

• Assume C is a binary (n, k) linear code with min. distance d.

• The generator matrix G is a k × n binary matrix

• IDEA: Split G into 2 parts (decompose) and check whether


you can get required d

11 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Decomposing generator matrix - example

Example
Let us consider the existence of a (9,4,5) code

      [ 1 1 . . . 1 1 | 0 0 . . . 0 0 ]
  G = [      G1       |      G2       ]

G1 is a (k − 1) × d and G2 is a (k − 1) × (n − d) binary matrix.

Let d′ denote the min. distance of the code generated by G2 .

To each codeword of C2 there correspond 2 codewords of C .

At least one of these codewords has weight ≤ (1/2)d on the first d
positions. Therefore d′ ≥ (1/2)d (finish at home for d = 5)

12 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Generator matrix of the extended code

If C is a linear code with generator matrix G then,

  Ḡ = [ G  G_{n+1} ]    with    G_{n+1} + Σ_{i=1}^{n} Gi = 0,

where Gi denotes the i-th column of G (so over GF(2), the appended
column is G_{n+1} = Σ_{i=1}^{n} Gi ).

For instance, for the generator matrix of the (7,4,3) code,

      [ 1 0 0 1 0 1 1 ]               [ 0 ]
  G = [ 0 0 1 1 0 1 0 ]    Σ_i Gi =  [ 1 ]
      [ 0 0 1 0 1 1 1 ]               [ 0 ]
      [ 1 1 0 1 0 0 0 ]               [ 1 ]
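A small sketch (mine) that appends the parity column to G and
verifies that the extension of the (7,4,3) code has minimum distance 4:

import itertools

import numpy as np

def extend(G):
    """Append to each row its parity bit, i.e. the sum of its entries mod 2."""
    parity = G.sum(axis=1, keepdims=True) % 2
    return np.hstack([G, parity])

def min_distance(G):
    k = G.shape[0]
    return min(int((np.array(m).dot(G) % 2).sum())
               for m in itertools.product([0, 1], repeat=k) if any(m))

G = np.array([[1, 0, 0, 1, 0, 1, 1],
              [0, 0, 1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1, 1, 1],
              [1, 1, 0, 1, 0, 0, 0]])
print(min_distance(G), min_distance(extend(G)))   # -> 3 4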

13 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Parity check matrix of the extended code

If H is a parity check of C then the parity check of C is,


 
1 1 1 ··· 1
 0 
 
 H 0 
H :=  
 .. 
 . 
0
T
Check that cH = 0 or HcT = 0 for all c ∈ C !

If C has an odd minimum distance d then C has minimum dis-


tance d + 1 (all weights and distances even).

14 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Augmenting the code

Simply adding more codewords to the original code.

The most common way is to add 1 to the generator matrix (if 1 is


not already in the code)
 
  G^(a) = [ G ]
          [ 1 ]

• Alternatively, for a binary (n, k, d) code C the augmented


code is,
C (a) = C ∪ {1 + C }
What are the general properties of the augmented code ?

15 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Augmenting the code II

Adding the codewords has the following effect:


• The length n is the same
• Number of codewords (dimension of the code) increases
• Minimum distance decreases in general,

  d^(a) = min{d, n − d′}

where d′ is the largest weight of any codeword in C

  8 words:        0000000, 1101000, 0110100, . . . , 1010001
  8 complements:  1111111, 0010111, 1001011, . . . , 0101110
16 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Expurgating the code

Definition
Expurgation: Throwing away the codewords of the code.
CAUTION : It can turn a linear code into a nonlinear one. E.g.
throwing away 5 out of 16 codewords of a (7,4,3) code results in a
nonlinear code.

The most common way is to throw away codewords of odd weight


if the number of odd and even weight codewords is equal.

For which codes we have the above situation ?

17 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Expurgating the codewords of odd weight

Facts
If C is a binary (n, k, d) code containing words of both odd and
even weight then (exercise)

  |{c ∈ C : wt(c) odd }| = |{c ∈ C : wt(c) even }| = 2^{k−1}

Almost always the case (but not exactly) !

18 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Expurgating the code - example

We throw away the odd weight codewords of a (6,3,3) code
generated by,

      [ 1 0 0 1 1 1 ]
  G = [ 0 0 1 1 0 1 ]      C = { 000000, 100111, 001101, 101010,
      [ 0 1 1 0 1 1 ]            011011, 111100, 010110, 110001 }

The minimum distance of the new (6,2) code is d = 4, i.e. we get
a (6,2,4) code !

19 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Puncturing the code


Puncturing : Inverse process to extension
• Delete one coordinate of the code (suitable) to get C ∗ .

Example
From a (3,2,2) code by puncturing (deleting the last coordinate) we
get a (2,2,1) code,

  0 0 0          0 0
  0 1 1          0 1
  1 0 1    →     1 0
  1 1 0          1 1

  (3, 2, 2) code    (2, 2, 1) code

Deleting the coordinate has the following effect:

• The length n drops by 1
• Number of codewords (dimension of the code) unchanged
• Minimum distance drops by one (in general)
20 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Shortening the code by taking a cross-section


The operation of throwing out the codewords that all have the
same value in one coordinate and deleting the coordinate position.

For simplicity, say we shorten c1 = 0 in the example below.


 
      [ 0 0 1 1 ]
  G = [ 0 1 1 0 ]    →    G′ = [ 0 1 1 ]
      [ 1 1 0 0 ]              [ 1 1 0 ]

• From the original code we have thrown out all the codewords
that start with one, i.e. c1 = 1.
• Shortening can be seen as expurgating followed by puncturing.

21 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Shortening as expurgating + puncturing

In the previous example we would delete the codewords starting
with 1,

  C = { 0000, 0011, 0110, 0101,
        1100, 1111, 1010, 1001 }   →   C′ = { 000, 011, 110, 101 }

This is a linear (3, 2, 2) code !

22 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Shortening the code making it nonlinear

What if we keep the codewords that have c1 = 1 ? In the previous
example we would delete the codewords starting with 0,

  C = { 0000, 0011, 0110, 0101,
        1100, 1111, 1010, 1001 }   →   C′ = { 100, 111, 010, 001 }

This is a nonlinear [3, 4, 1] code !

23 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Lengthening the code

Inverse operation to shortening.

Can be viewed as:

1. extending (adding new columns to generator matrix)

2. followed by augmenting (adding rows to the extended


generator matrix)
 
  G = [ 0 1 1 ]    →    G_L = [ 0 0 1 1 ]
      [ 1 1 0 ]               [ 0 1 1 0 ]
                              [ 1 1 0 0 ]

24 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Summary of basic construction methods

Defining the “redundancy” as r = n − k for a binary (n, k) code

• Augmenting: Fix n; increase k; decrease r .


• Expurgating: Fix n; decrease k; increase r .
• Extending: Fix k; increase n; increase r .
• Puncturing: Fix k; decrease n; decrease r .
• Lengthening: Fix r ; increase n; increase k.
• Shortening: Fix r ; decrease n; decrease k.

Apart from these there are several other techniques to construct


new codes from the old ones.

25 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Good and the best codes

For a given alphabet of q symbols and length of the code we can


try to maximize:

• Number of the codewords given a designed minimum distance


(might be a linear or nonlinear code)

• Maximize a dimension of a linear code k for a given minimum


distance (or vice versa)

Even for small parameters hard to find good codes

26 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Good and the best codes -example

Example
A rather complicated construction from the 60’s gave a [10, 38, 4]
code - a good code.

Until 1978 it was believed this was the best possible code for
n = 10, d = 4.

But then [10, 40, 4] was found - the BEST CODE.

27 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Strange language

Example
• Encoding a "strange" language over the binary alphabet: 30
  letters and 10 decimal digits, i.e. 40 symbols

• Can use [10, 40, 4] and correct a single error; detect 3

• Problem implementation : coding, decoding ?

• What about linear codes : only k = 5 - 32 codewords for


n = 10, need n = 11

• Wasting the bandwidth

28 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Existence of codes - example

Example
We can ask a question: Is there a binary (5,3,3) linear code ?

ANSWER: Prove it by hand (no need for decomposition)!


• Need to construct a binary 3 × 5 generator matrix G so that
  any nonzero linear combination of its rows has weight at least 3 !

      [ 1 1 1 0 0 ]                   [ 1 1 1 0 0 ]
  G = [ 1 0 1 1 1 ]    try e.g.  G =  [ 1 0 1 1 1 ]    NO
      [ ? ? ? ? ? ]                   [ 0 1 0 ? ? ]

Any nonzero combination of the 3 rows yields some c s.t. wt(c) < 3.

29 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Finding the best codes - example II

So we know there is no (5,3,3) binary code; it is easy to construct a
(5,2,3) code, e.g.

  G = [ 1 1 1 0 0 ]
      [ 1 0 1 1 1 ]

Can we have more than 4 codewords in a nonlinear code ?

Turns out that we cannot do better, though the upper and lower
bounds say that 4 ≤ M ≤ 6.

Thus, out of the 16 words of wt ≥ 3 we cannot find 5 codewords
s.t. their mutual distance is ≥ 3 !

30 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Lower and upper bounds on codes

Useful measures - know the range of possible :


• number of codewords for given n, d
• upper bound on dimension (for linear codes)
• also lower bound on dimension (number of codewords)

31 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Singleton bound - introduction

Motivation
:

– there is an (n, n, 1) binary code - G = In

– there is an (n, n − 1, 2) binary code - G = [In−1 1]

Can we generalize these results to a (n, n − d + 1, d) code ?

Or why should we only ask for k ≤ n − d + 1 and not better!

The Singleton bound shows that this is indeed the best possible,
over any alphabet.

32 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Singleton bound

Theorem
(Singleton) If C is an (n, k, d) code then d 6 n − k + 1.

Proof.
Use the projection of the codewords to the first (k − 1) coordinates:
• q^k codewords ⇒ ∃c1 , c2 having the same first k − 1 values

• Then d(c1 , c2 ) ≤ n − (k − 1) = n − k + 1, thus d ≤ n − k + 1

33 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Singleton bound -example

Example
For instance, we cannot have (7,4,5) code but can we construct
(7,4,4) code ?

NO; codes having d = n − k + 1 exist only for some special n, k, q.

Quite loose upper bound !

34 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Singleton bound - generalization

Generalization

• Assume we have an [n, M, d] code over Fq and puncture it
  d − 1 times.

• The punctured code is an [n − d + 1, M, 1] code, i.e. the M
  punctured words are still distinct (can we have d > 1 ?).

• Thus,

  M ≤ q^{n−d+1}

The bound is quite loose.

35 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

MDS and perfect codes

Codes that meet the Singleton bound, i.e., satisfy k = n − d + 1,


are called Maximum Distance Separable (MDS) codes.

Perfect Codes codes meet the Hamming bound - e.g. the


Hamming codes and two codes discovered by Golay.

Facts
MDS codes and perfect codes are incomparable:
• there exist perfect codes that are not MDS and
• there exist MDS codes that are not perfect.

Each meets an incomparable optimality criterion. The most


famous class of MDS codes is the Reed-Solomon codes.

36 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Hamming bound

Sometimes called sphere packing bound - generalization of a


sphere-packing condition,
  |C | · Σ_{i=0}^{e} C(n, i) (q − 1)^i = q^n    for a perfect code, d = 2e + 1,
         \________________________/
                = Vq (n, e)

Theorem
(Hamming bound) If q, n, e ∈ N, d = 2e + 1 then,

  |C | ≤ q^n / Vq (n, e).

Proof: The spheres Be (c) are disjoint.


37 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Hamming bound - applications

Could construct (n, n − i, i + 1) codes for i = 0, 1

For n = 7 Singleton bound says - no (7,4,5) code.

It says nothing about (7,4,4) code !


Example
For n = 7 the Hamming bound gives,

  |C | ≤ 2^7 /(1 + 7) = 16

Thus the Hamming (7, 4, 3) code meets the upper bound !

Therefore, no (7,4,4) code !!

38 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Hamming bound - example

Example
• Another example is the UB on M for a [5, M, 3] code
• Applying the Hamming bound we get,

  |C | = M ≤ 2^5 / 6 = 5.33, i.e. M ≤ 5.

Note that the Singleton bound (generalized) gives M ≤ 2^{n−d+1} = 8.

39 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Hamming vs Singleton bound

[Figure: asymptotic upper bounds (rate R versus relative distance δ)
for the binary alphabet - the Hamming and Singleton upper bounds,
with the Gilbert lower bound below them.]

40 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Other construction methods

Apart from already mentioned methods :

• Need something called (u,u + v) construction

• And direct product method


– Why do we need them ?

Commonly used in construction of good codes, and the latter


comes close to Shannon (in the easiest way)

41 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

(u,u + v) construction
In general, let Ci be a binary [n, Mi , di ] code (i = 1, 2) and define,
  C := {(u, u + v) | u ∈ C1 , v ∈ C2 }

Example
Take 2 codes given by

  G1 = [ 1 0 1 1 ]      G2 = [ 1 1 0 0 ]
       [ 0 1 0 1 ]           [ 0 0 1 1 ]

Our codewords (u, u + v) would be, e.g. for u = (1011),

  (1011 0111)    (v = 1100)
  (1011 1000)    (v = 0011)
  ...

The length of the code is clearly 2n ! What about the dimension
and minimum distance ?

42 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

(u,u + v) construction - properties

Theorem
Then C is a [2n, M1 M2 , d] code, where d := min{2d1 , d2 }

Proof.
Consider (u1 , u1 + v1 ) and (u2 , u2 + v2 ).
1. If v1 = v2 and u1 ≠ u2 then d ≥ 2d1
2. If v1 ≠ v2 the distance is (triangle ineq.)

   wt(u1 − u2 ) + wt(u1 − u2 + v1 − v2 ) ≥ wt(v1 − v2 ) ≥ d2
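A quick sketch (mine) that builds the (u, u + v) code from two small
binary codes and checks the resulting parameters:

def u_u_plus_v(C1, C2):
    """All words (u, u + v) with u in C1, v in C2 (binary tuples)."""
    return {tuple(list(u) + [(a + b) % 2 for a, b in zip(u, v)])
            for u in C1 for v in C2}

def min_weight(C):
    return min(sum(c) for c in C if any(c))

C1 = {(0, 0, 0, 0), (1, 0, 1, 1), (0, 1, 0, 1), (1, 1, 1, 0)}   # d1 = 2
C2 = {(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)}   # d2 = 2
C = u_u_plus_v(C1, C2)
print(len(C), min_weight(C))   # -> 16 2, i.e. d = min{2*d1, d2} = 2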

43 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

(u,u + v) construction - example

An abstract justification.
Example
Take for C2 an [8, 20, 3] obtained by puncturing a [9, 20, 4] code

What code to use for C1 with respect to d and M ?

Take an (8,7) even weight code as C1 - to increase M !

The construction gives a [16, 20 · 2^7 , 3] code - at present no better code


is known for n = 16, d = 3.

44 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Direct product codes - Motivation

• The method known as the direct product of codes

Applications include:

– getting good codes

– proving the nonexistence of certain linear codes

– deriving a class of asymptotically good codes

45 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Direct product codes - motivation

• Main idea: Collaboration of two (or more) codes


• Efficiently used in the compact disc application to combat the
burst errors.
• More errors then expected can be corrected (more on this in
the last lecture)

Even more important - asymptotically good codes were


constructed by P. Elias in 1954.

One of a few construction that approaches Shannon

46 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Direct product codes

Notation:

– A and B are binary linear (nA , kA , dA ) and (nB , kB , dB ) codes,


respectively

– R the set of all binary nA × nB matrices over GF (2) - a vector
  space of dimension nA nB

Example
One basis for the vector space of 2 × 2 matrices is:
       
  [ 1 0 ]   [ 0 1 ]   [ 0 0 ]   [ 0 0 ]
  [ 0 0 ]   [ 0 0 ]   [ 1 0 ]   [ 0 1 ]

47 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Direct product codes - definition

Definition
The direct product A ⊗ B is the code consisting of all

nA × nB

matrices with the property that:

– each matrix column is a codeword of A and

– each matrix row is a codeword of B.

Kronecker product - linear algebra

48 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Direct product codes - example


Example

  GA = [ 1 0 1 ]    A = { 000, 101, 011, 110 }
       [ 0 1 1 ]

  GB = [ 1 1 0 0 ]    B = { 0000, 1100, 0011, 1111 }
       [ 0 0 1 1 ]

Then for instance the 3 × 4 matrices M, M(2) ∈ R where,

      [ 1 1 1 1 ]             [ 1 1 1 1 ]
  M = [ 1 1 0 0 ]     M(2) =  [ 0 0 1 1 ]
      [ 0 0 1 1 ]             [ 1 1 0 0 ]

Of the 2^12 matrices in R only 16 = 2^{kA kB} satisfy the
definition. The codewords corresponding to M and M(2) are:

  c = (111111000011)      c(2) = (111100111100)

49 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Direct product code - property

Facts
The product code is “clearly” linear :

– all zero matrix ∈ R

– as A and B are linear then any linear combination of the


columns(rows) of C = A ⊗ B is a valid column(row)

What about the minimum distance - i.e. minimum weight of the


codewords?

50 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Direct product code - min. distance

We can show that dC ≥ dA dB .

Justification
If the matrix M ≠ 0 then it has a nonzero row, and every nonzero
row has weight ≥ dB .

Then M has at least dB nonzero columns, each of weight ≥ dA , so
wt(M) ≥ dA dB .

One can show that aT b ∈ R for a ∈ A, b ∈ B.

If wt(a) = dA and wt(b) = dB then wt(aT b) = dA dB .

Recall c = (111111000011), c(2) = (111100111100).

51 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Direct product code - property II

Example
Let g1A = (101) and g1B = (1100). Then

              [ 1 ]                 [ 1 1 0 0 ]
  g1A T g1B = [ 0 ] · [1 1 0 0] =   [ 0 0 0 0 ]
              [ 1 ]                 [ 1 1 0 0 ]

Corresponding codeword is c = (1100|0000|1100)

The basis of the code C = A ⊗ B is the set,

  { giA T gjB : 1 ≤ i ≤ kA , 1 ≤ j ≤ kB }

giA and gjB are the rows of the generator matrices GA and GB .

52 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Iterative approach
To summarize :

Therefore, C = A ⊗ B is a linear (nA nB , kA kB , dA dB ) code.

Iterative product - a sequence of direct product codes,

C = A(1) ⊗ A(2) ⊗ · · · A(r ) ⊗ . . .

– Idea used by P. Elias utilizing the extended Hamming codes -


simplest approach to get closer to Shannon’s bound - codes when
n→∞

Remark : The part on the construction of Elias codes is optional


- for interested students
53 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Elias codes - preliminaries


Start with extended Hamming codes C1 = H̄_{2^m} and C2 = H̄_{2^{m+1}} of
respective lengths n1 = 2^m and n2 = 2^{m+1} .
Assumption: Codes used on a BSC with bit error probability p, and
n1 p < 1/2.
Definition
• Define: V1 := C1 and V2 = C1 ⊗ C2

• Let Vi be an (ni , ki ) code


• Let Ei be the expected number of errors per block after
decoding.

Continue in this way:


• If Vi has been defined then V_{i+1} = Vi ⊗ H̄_{2^{m+i}}

54 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Properties of recursion

Facts
From the definition of recursion we have:

  n_{i+1} = ni · 2^{m+i}
  k_{i+1} = ki · (2^{m+i} − m − i − 1)

For extended Hamming codes we know that (Example 3.3.4, J. H.
van Lint):

  E_{i+1} ≤ Ei^2    and    E1 ≤ (n1 p)^2 ≤ 1/4
Thus, these codes have the property Ei → 0 when i → ∞.

Can we express ni in terms of m and i ?

55 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Some math - arithmetic sum


The sum of the first 5 integers is 1 + 2 + 3 + 4 + 5 = (5 · 6)/2 .

Recursion

  i = 1    n1 = 2^m
  i = 2    n2 = 2^m · 2^{m+1} = 2^{2m+1}
  i = 3    n3 = 2^{2m+1} · 2^{m+2} = 2^{3m+1+2}
  ...
  ni = 2^{mi+(1+2+...+(i−1))} = 2^{mi + i(i−1)/2}

  ni = 2^{mi + i(i−1)/2} ;    ki = ni · Π_{j=0}^{i−1} ( 1 − (m+j+1)/2^{m+j} )

56 / 58
Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Comments on Elias codes

If Ri = ki /ni denotes the rate of Vi then,

  Ri → Π_{j=0}^{∞} ( 1 − (m+j+1)/2^{m+j} ) > 0    for i → ∞

The Elias sequence of codes has the following properties:

• Length n → ∞ but Ri does not tend to 0 !
• At the same time Ei → 0
• Elias codes have di = 4^i , so di /ni → 0 for i → ∞.

One of a few systematic construction that accomplishes the


Shannon’s result !

57 / 58

Constructing new codes Basic methods for constructions Some bounds Other construction methods Elias codes

Conclusions

• Constructing good (best) codes is commonly a serious


combinatorial challenge
• Methods mentioned in this lecture do not exhaust the
possibilities (of course)

• Many open problems

• Elias codes comprise the basic goal in coding theory :


possibility of errorless transmission over a noisy channel

• Of course, the problem of efficient decoding of long codes is


important

58 / 58
Chapter 6

Coding theory - Bounds on


Codes

Contents of the chapter:


• Shannons theorem revisited
• Lower bounds

• Upper bounds
• Reed-Muller codes

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Linear versus nonlinear codes

• Previous lecture: Several good codes (and a few best ones) were
  mentioned

• How can we claim these were good codes ?

• Ideally, the number of codewords meets the upper bound.

• Very rare situations, even for small n - recall [5, 4, 3] code,


UB=5.

• In this case # codewords same for linear and nonlinear code

1 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Linear versus nonlinear codes II

Example: a linear (12, 5, 5) code does not exist, yet one can find a
nonlinear [12, 32, 5] code

Keep in mind - linear codes used in practice (easy encoding and


decoding); nonlinear codes - “combinatorial challenges”

Apart from encoding “strange” languages :)

2 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Main goals of coding theory

– A is the alphabet of q symbols; k = logq |C |


• Given k, d, q find an [n, k, d]q code that minimizes n

• Given n, d, q find an [n, k, d]q code that maximizes k

• Given n, k, q find an [n, k, d]q code that maximizes d

• Given n, k, d find an [n, k, d]q code that minimizes q

The last one is not obvious, but empirically good codes are
obtained by reducing q

• Rate of the code: R = k/n

• Relative distance: δ = d/n
3 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Some families of codes

Some families of codes (binary) :

• Hamming codes: (2^r − 1, k = n − r , d = 3) - good rate but
  small distance
• Hadamard codes: [n, 2n, n/2] - good distance but (very) small
  rate
• Reed-Muller codes: (2^r , r + 1, 2^{r−1}) - good distance but (very)
  small rate

Need asymptotically good (family) of codes in the Shannon’s sense


- fixed rate PE → 0

4 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Optimal codes

Definition

A(n, d) := max{M : an [n, M, d] code exists }

A code C such that |C | = A(n, d) is called optimal

• Good codes are long (Shannon) - given p of a BSC we can


have Pe → 0, R > 0, when n → ∞
• Number of errors in a received word is np; ⇒ d must grow at
least as fast as 2np to correct errors (d = 2e + 1)
5 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Optimal codes II

– Given the rate R we ask how large δ = d/n is (as a function of n)

Useful notation:

– Vq (n, r ) := |Br (c)| = Σ_{i=0}^{r} C(n, i) (q − 1)^i - the volume of a
  sphere of radius r

6 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Hamming vs Singleton bound - reminder

[Figure: asymptotic upper and lower bounds for the binary alphabet -
the Hamming and Singleton upper bounds and the Gilbert lower bound,
rate R versus relative distance δ.]
7 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Gilbert - Varshamov bound


Almost trivial but powerful lower bound

There is an asymptotic version of the bound concerning n → ∞

• Until 1982 it was believed that R(δ) equals this bound


• Bound was improved for q > 49 using methods of algebraic
geometry (tedious proof)
• Using Shimura curves (modular curves) to construct sequence
of Goppa codes beating the GV bound for α(δ)

– Maximal code - An [n, M, d] code which is not contained in any


[n, M + 1, d] code
8 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Gilbert - Varshamov bound II

Theorem
(GV bound) For n, d ∈ N, d ≤ n, we have,

  A(n, d) ≥ q^n / Vq (n, d − 1).

Proof.
• Let the [n, M, d] code C be maximal, i.e. there is no word in
  A^n with distance ≥ d to all words in C

• That is, the spheres B_{d−1} (c), c ∈ C , cover A^n

• This means the sum of their volumes, |C | Vq (n, d − 1), is at
  least q^n
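Reusing the ball-volume helper from the Hamming bound slide, the GV
guarantee is one line (a sketch, mine):

from math import comb

def V(q, n, r):
    return sum(comb(n, i) * (q - 1)**i for i in range(r + 1))

def gv_lower_bound(q, n, d):
    """GV: A(n, d) >= q^n / V_q(n, d - 1), rounded up."""
    return -(-q**n // V(q, n, d - 1))

print(gv_lower_bound(2, 13, 5))   # -> 8, so some [13, M >= 8, 5] code exists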

9 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Constructing good long codes

• In the previous lecture we took some codes (extended


Hamming) and constructed C = C1 ⊗ C2 ⊗ · · ·

• Length n → ∞ but Ri does not tend to 0 !

• At the same time Ei → 0

• These codes have di = 4^i so di /ni → 0 for i → ∞.

• Method required iteration and usage of direct product codes


(but efficient).

10 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

GV as a construction method

Interpretation of the GV bound :

– Start with any c ∈ An and update An ← An \ Bd−1 (c)

– Take a new codeword from An and update

– Proceed as long as there are vectors in An until the code is


maximal

• Method is constructive but there is no structure in the code.


• Gives an exponential time deterministic algorithm and
nonlinear codes

11 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Gilbert - Varshamov bound for linear codes

Theorem
(GV bound LC) If n, d, k ∈ N satisfy

  Vq (n, d − 1) < q^{n−k+1}

then an (n, k, d) code exists.

Proof.
• Let C_{k−1} be an (n, k − 1, d) code. Since

  |C_{k−1} | · Vq (n, d − 1) = q^{k−1} Vq (n, d − 1) < q^n ,

  C_{k−1} is not maximal, i.e. ∃x ∈ A^n : d(x, c) ≥ d, ∀c ∈ C_{k−1}

• The code spanned by C_{k−1} and x is an (n, k, d) code (exercise)

12 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

GV bound - example II

GV bound for LC is sufficient but not necessary. E.g. we may


want to deduce if there exists a binary (15, 7, 5) code.

Check the GV condition, n = 15, k = 7, d = 5:

  Vq (n, d − 1) < q^{n−k+1}  ⇔  Σ_{i=0}^{4} C(15, i) = 1941 ≮ 2^9 = 512.

Clearly, not satisfied - so the GV bound does not tell us whether


such a code exists !

• There is a linear BCH (cyclic) code with such parameters !

13 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Varshamov construction - linear codes


A randomized polynomial time construction.

Algorithm:

1. Pick a k × n matrix G at random


2. Let C = {xG |x ∈ {0, 1}k }

• Claim: With high probability C has 2k distinct elements.

• Furthermore, their pairwise distance is at least d provided that

  2^k − 1 < 2^n / V2 (n, d − 1).

14 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Few words on probability

Let us consider binary vectors of length 4.

The probability of randomly selecting a vector of weight 1 is

  Pb(wt(c) = 1) = 4/16

What is the probability of then selecting another vector of weight 1 ?

  Pb(wt(c) = 1) = 3/15 < 4/16

Conclusion: we may say,

  Pb(2 vectors of wt 1) < 2 · Pb(1 vector of wt 1)

15 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Varshamov construction - proof


Proof.
1. It suffices to show that for every non-zero x,

   xG ∉ B(0, d − 1)

2. Fix x. Then xG is a random vector. It falls in B(0, d − 1)
   with prob. V2 (n, d − 1)/2^n .

3. By the union bound (Pb(∪_{i=1}^{n} Ai ) ≤ Σ_{i=1}^{n} Pb(Ai )), the
   probability that there is an x such that xG ∈ B(0, d − 1) is at most

   (2^k − 1) · V2 (n, d − 1)/2^n

4. If this quantity is less than 1 then such a code exists. If this
   prob. ≪ 1 then we find such a code with high probability.

16 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Is GV bound tight ?

Previous construction claims that random codes approaches GV


bound (asymptotically).

Are we done ? NO, we cannot check it (long codes)

Dominating belief in coding community: GV bound is tight ! Not


exactly true, many counterexamples:

– Hamming codes beat GV

– Cyclic codes beat GV

17 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Is GV bound tight - example

Example
Hamming codes are specified by n = 2^r − 1, k = 2^r − 1 − r , d = 3.

Need to compute

  V2 (n, d − 1) = Σ_{i=0}^{2} C(2^r − 1, i)

and to compare with 2^{n−k+1} = 2^{r+1} .

  For r = 3:   Σ_{i=0}^{2} C(7, i) = 1 + 7 + (7 · 6)/2 = 29 ≮ 16

The GV bound is not satisfied but the Hamming code nevertheless exists.

18 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Other upper bounds

We are trying to come close to GV bound from above - squeezing


the space in between

Many upper bounds:


• Have seen Singleton and Hamming
• There are Plotkin, Griesmer and plenty of more sophisticated
bounds

• Elias, McEliece et al., Johnson

• Linear Programming bounds (“state-of-the-art” bounds)

19 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Plotkin vs Hamming and Singleton bound

[Figure: asymptotic upper bounds for the binary alphabet - Hamming,
Plotkin and Singleton - together with the Gilbert lower bound.]

20 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Is LP bound tight ?

Example
Back to the same example, we had :
• A(13, 5) ≤ 512 - Singleton bound

• Claimed that there was a nonlinear [13, 64, 5] code
  (construction)

• The LP bound gives A(13, 5) ≤ 64

• This means that A(13, 5) = 64 !

21 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Linear programming bound - asymptotic

[Figure: asymptotic picture of the upper bounds for the binary
alphabet - linear programming (LP), Elias, Hamming, Plotkin and
Singleton - together with the Gilbert lower bound.]

22 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Reed-Muller codes

• Large class of codes

• Unlike Hamming codes, not tied to a fixed minimum distance -
  the minimum distance grows with the length

• Recall that Mariner used an RM code - a (32, 6, 16) code

• Closely related to nonlinearity of Boolean functions

23 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Hamming codes - reminder

• Recall that the Hamming code was defined as a single-error
  correcting code with parameters n = (q^r − 1)/(q − 1) and
  k = n − r.

• The parity check of a Hamming code is an r × n matrix.

• For a binary Hamming code q = 2, so that n = 2^r − 1,
  k = 2^r − 1 − r .

• Its parity check consists of all nonzero vectors of length r
  (over GF(2)).

24 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Introducing Reed-Muller codes II

• Let us define,

  Br = [Hr , 0],   with rows v1 , v2 , . . . , vr

• The size of Br is r × 2^r .

Notation: 1 is the row vector of all-ones of length 2^r .

25 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Reed-Muller code

Definition
The first order Reed-Muller code denoted R(1, r ) is the subspace
generated by the vectors 1, v1 , v2 , . . . , vr .

• Obviously the length of the code is 2r , what are the dimension


and min. distance ?
Note that the generator matrix of R(1, r ) is,

  G = [ 1  ]  =  [   1    ]
      [ Br ]     [ Hr  0 ]

Theorem
R(1, r ) is a (2^r , r + 1, 2^{r−1}) code.

26 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Parameters of 1-st order Reed-Muller code

Proof.
Need to prove the results on dimension and min. distance.

Dimension: one can find the r × r identity as a submatrix of Br , and 1
is clearly independent of the other rows. Thus, k = r + 1.

To show that d = 2^{r−1} one proves that w (c) = 2^{r−1} for any
c ∈ C \ {1, 0}

– The main idea of the proof (see textbook) is to use:

• the fact that [Hr 0] has all distinct r -tuples as its columns;

27 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Parameters of 1-st order RM code

Example

 
  [H3 0] = [ 1 0 0 1 0 1 1 0 ]
           [ 0 1 0 1 1 0 1 0 ]    all vectors of GF (2)^3
           [ 0 0 1 0 1 1 1 0 ]

Can c = (11000000) be in the code ?

No, easily checked. Can c = (01100000) be in the code ?

No, . . .

But if c is in the code then it must be of weight 4.

28 / 30
Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

RM versus Hamming codes

We switched dimensions, approximately :

              Reed-Muller     Hamming
  dimension   r + 1           2^r − r − 1
  length      2^r             2^r − 1
  d           2^{r−1}         3

29 / 30

Shannon’s theorem revisited Lower bounds Upper bounds Reed-Muller codes

Reed-Muller code example

Let us construct a first-order Reed-Muller code R(1, 3) for r = 3.

  H3 = [ 1 0 0 1 0 1 1 ]
       [ 0 1 0 1 1 0 1 ]    all nonzero vectors of GF (2)^3
       [ 0 0 1 0 1 1 1 ]

  B3 = [H3 0] = [ 1 0 0 1 0 1 1 0 ]
                [ 0 1 0 1 1 0 1 0 ]    all vectors of GF (2)^3
                [ 0 0 1 0 1 1 1 0 ]

  G = [    1   ] = [ 1 1 1 1 1 1 1 1 ]
      [ [H3 0] ]   [ 1 0 0 1 0 1 1 0 ]
                   [ 0 1 0 1 1 0 1 0 ]
                   [ 0 0 1 0 1 1 1 0 ]

30 / 30
Chapter 7

Reed-Muller codes

Contents of the chapter:


• Direct product of RM
• Decoding RM

• Hadamard transform

Direct product of RM Decoding RM Hadamard transform

Reed-Muller code example

Let us construct a first-order Reed-Muller code R(1, 3) for r = 3.

  H3 = [ 1 0 0 1 0 1 1 ]
       [ 0 1 0 1 1 0 1 ]    all nonzero vectors of GF (2)^3
       [ 0 0 1 0 1 1 1 ]

  B3 = [H3 0] = [ 1 0 0 1 0 1 1 0 ]
                [ 0 1 0 1 1 0 1 0 ]    all vectors of GF (2)^3
                [ 0 0 1 0 1 1 1 0 ]

  G = [    1   ] = [ 1 1 1 1 1 1 1 1 ]
      [ [H3 0] ]   [ 1 0 0 1 0 1 1 0 ]
                   [ 0 1 0 1 1 0 1 0 ]
                   [ 0 0 1 0 1 1 1 0 ]

1 / 34

Direct product of RM Decoding RM Hadamard transform

Reed-Muller code(reminder)

Definition
The first order Reed-Muller code denoted R(1, r ) is the subspace
generated by the vectors 1, v1 , v2 , . . . , vr .

• Obviously the length of the code is 2r , what are the dimension


and min. distance ?
Note that the generator matrix of R(1, r ) is,

  G = [ 1  ]  =  [   1    ]
      [ Br ]     [ Hr  0 ]

Theorem
R(1, r ) is a (2^r , r + 1, 2^{r−1}) code.

2 / 34
Direct product of RM Decoding RM Hadamard transform

Using Reed-Muller codes in direct product

• The direct product of 2 codes A and B was defined using the
  basis vectors of the 2 codes,

  gA T gB

• If A is an (n1 , k1 , d1 ) and B is an (n2 , k2 , d2 ) code then
  C = A ⊗ B is an (n1 n2 , k1 k2 , d1 d2 ) code (easy to remember).

Example
Want to construct a (16,9) linear code !

Can we use Reed-Muller codes of the form (2^r , r + 1, 2^{r−1}) ?

3 / 34

Direct product of RM Decoding RM Hadamard transform

Using Reed-Muller codes in direct product II


Example
Fits perfectly: for r = 2 we get a (4,3,2) RM code.

Then using two such codes (the same one) in a direct product we get
a linear code,

  (n1 n2 , k1 k2 , d1 d2 ) = (16, 9, 4)

• You may say, so what ?

• Go to our favorite website www.codetables.de and check for


n = 16, k = 9

• No better linear codes than (16,9,4) code for fixed n = 16 and


k=9!
4 / 34
Direct product of RM Decoding RM Hadamard transform

Construction example
Example
A (4,3,2) RM code C is easily constructed using,

      [ 1 1 1 1 ]
  G = [ 1 0 1 0 ]
      [ 0 1 1 0 ]

• Encoding is simple, e.g. for m = (011),

                [ 1 1 1 1 ]
  mG = (011) ·  [ 1 0 1 0 ]  = (1100)
                [ 0 1 1 0 ]

5 / 34

Direct product of RM Decoding RM Hadamard transform

Construction example II

Example
What are the codewords of the big (16,9,4) code V = C ⊗ C ? For
instance c1 = (0110) and c2 = (0101) give,

            [ 0 ]             [ 0 0 0 0 ]
  c1 T c2 = [ 1 ] · (0101) =  [ 0 1 0 1 ]
            [ 1 ]             [ 0 1 0 1 ]
            [ 0 ]             [ 0 0 0 0 ]

• Cannot correct 2 errors with such a code but can correct 3


erasures !

6 / 34
Direct product of RM Decoding RM Hadamard transform

Erasure channel
[Figure: the binary erasure channel - each bit is received correctly
with probability 1 − p and is erased (replaced by the symbol E) with
probability p.]

• The receiver knows where are the possible errors !

• Later we show we can always correct d − 1 erasures.

7 / 34

Direct product of RM Decoding RM Hadamard transform

Decoding erasures
Example
Given is the received word with 3 erasures for a (16,9,4) code,
 
      [ 0 E 0 0 ]
  r = [ 0 1 0 1 ]
      [ 0 1 0 1 ]
      [ 0 E 0 E ]

Decoding strategy:
• Correct erasures in each column using the erasure correction
for a (4,3,2) RM code.

• Correct erasures in each row using the erasure correction for a


(4,3,2) RM code.
8 / 34
Direct product of RM Decoding RM Hadamard transform

Decoding erasures -example

Example
Correcting the erasures in each column gives

      [ 0 E 0 0 ]
  r = [ 0 1 0 1 ]
      [ 0 1 0 1 ]
      [ 0 E 0 0 ]

Cannot correct the 2nd column (two erasures), but now correct the
rows:

      [ 0 0 0 0 ]
  r = [ 0 1 0 1 ]
      [ 0 1 0 1 ]
      [ 0 0 0 0 ]

9 / 34

Direct product of RM Decoding RM Hadamard transform

Decoding Reed-Muller codes

We know how to construct an R-M code but can we efficiently


decode ?

Need to introduce few concepts:

• Proper ordering
• Hadamard matrices
• Hadamard transform

The decoding process turns out to be quite simple - just a matrix


vector multiplication !

10 / 34
Direct product of RM Decoding RM Hadamard transform

Hadamard matrix

Definition
A Hadamard matrix Hn is an n × n matrix with integer entries +1
and -1 whose rows are pairwise orthogonal as real numbers.

Example
The matrix  
1 1
1 −1
is a Hadamard matrix.

11 / 34

Direct product of RM Decoding RM Hadamard transform

Hadamard matrix - history

Dates from the mid-19th century. Many applications:

• Combinatorial theory

• Quantum cryptography (complex Hadamard matrices),


Boolean functions
• Measuring the spectrum of light etc.

12 / 34
Direct product of RM Decoding RM Hadamard transform

Hadamard conjecture

Facts
Hadamard conjectured that such a matrix of size 4k × 4k could
be constructed for any k !

Hard to disprove as the smallest order for which no construction is


known is 668!

Easy case 4k = 2^u - the Sylvester construction:

  [ H  H ]
  [ H −H ]

13 / 34

Direct product of RM Decoding RM Hadamard transform

Properties of Hadamard matrices

Hadamard matrices of order 1, 2, and 4 are,

                  [ 1  1 ]         [ 1  1  1  1 ]
  H1 = [1]   H2 = [ 1 −1 ]    H4 = [ 1 −1  1 −1 ]
                                   [ 1  1 −1 −1 ]
                                   [ 1 −1 −1  1 ]

• Equivalent definition: an n × n ±1 matrix such that,

  Hn Hn T = n In .

14 / 34
Direct product of RM Decoding RM Hadamard transform

Properties of Hadamard matrices II

Example
  H2 H2 T = [ 1  1 ] [ 1  1 ] = [ 2 0 ]
            [ 1 −1 ] [ 1 −1 ]   [ 0 2 ]

Hn has the following properties,

1. Hn T = n Hn^{−1} , thus Hn T Hn = n In - the columns are
   orthogonal as well

2. Permuting rows (columns) gives again a Hadamard matrix

3. Multiplying rows (columns) by (−1) gives again a Hadamard matrix

15 / 34

Direct product of RM Decoding RM Hadamard transform

Sylvester construction

Hadamard matrices of order 2^r are easily constructed recursively,

  H_{2n} = [ Hn  Hn ]
           [ Hn −Hn ]

Consider again H2 and H4 ,

       [ 1  1 ]         [ 1  1  1  1 ]
  H2 = [ 1 −1 ]    H4 = [ 1 −1  1 −1 ]
                        [ 1  1 −1 −1 ]
                        [ 1 −1 −1  1 ]

• Useful in decoding the R-M code; recall that the length of an
  R-M code is 2^r
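The recursion is a few lines of numpy (a sketch, mine):

import numpy as np

def hadamard(n):
    """Sylvester construction of H_n for n a power of 2."""
    if n == 1:
        return np.array([[1]])
    H = hadamard(n // 2)
    return np.block([[H, H], [H, -H]])

H8 = hadamard(8)
assert (H8 @ H8.T == 8 * np.eye(8)).all()   # rows are orthogonal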
16 / 34
Direct product of RM Decoding RM Hadamard transform

Proper ordering of binary vectors

Simply the binary representation of integers with the leftmost bit
as the least significant bit. For example,

  12 = 0011 = 0 · 2^0 + 0 · 2^1 + 1 · 2^2 + 1 · 2^3

Formally the proper ordering Pr of binary r -tuples is defined
recursively for 1 ≤ i ≤ r − 1,

  P1 = [0, 1]
  if Pi = [b1 , . . . , b_{2^i} ] then P_{i+1} = [b1 0, . . . , b_{2^i} 0, b1 1, . . . , b_{2^i} 1]
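A short sketch (mine) of the recursion:

def proper_ordering(r):
    """P_r as bit strings, leftmost bit least significant."""
    P = ['0', '1']
    for _ in range(r - 1):
        P = [b + '0' for b in P] + [b + '1' for b in P]
    return P

print(proper_ordering(3))
# -> ['000', '100', '010', '110', '001', '101', '011', '111']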

17 / 34

Direct product of RM Decoding RM Hadamard transform

Proper ordering - example

Example
• Binary triples would be ordered as,

P3 = [000, 100, 010, 110, 001, 101, 011, 111]

Take n = 2^r and u0 , u1 , . . . , u_{n−1} ∈ GF (2)^r in proper order

• Construct H = [hij ]; 0 ≤ i, j ≤ n − 1, where

  hij = (−1)^{ui · uj}     ("·" being the dot product)

18 / 34
Direct product of RM Decoding RM Hadamard transform

Hadamard matrix - an alternative view

Example
Let r = 2. Then,
 
1 1 1 1
 1 −1 1 −1 
H4 = 
 1

1 −1 −1 
1 −1 −1 1

E.g. h12 = (−1)(100)·(010) = (−1)1·0+0·1+0·0 = −10 = 1

19 / 34

Direct product of RM Decoding RM Hadamard transform

Introducing Hadamard transform


Example
Consider r = 3 and r = (11011100).

Any single u of length r picks out a component of r:

  r(110) = 1   - picks out the 4-th component of r

  r(000) = 1   - takes the 1-st component, etc. (counting from 1)

Then define

  R(110) = (−1)^{r(110)} = (−1)^1 = −1.

Continuing, R = (−1, −1, 1, −1, −1, −1, 1, 1)

20 / 34
Direct product of RM Decoding RM Hadamard transform

Introducing Hadamard transform II


Important tool for encoding/decoding, studying Boolean functions,
etc.

IDEA: From a binary r -tuple obtain a real scalar R(u) using,

  u ∈ F_2^r ,  r ∈ F_2^{2^r}  →  r(u)  →  R(u) = (−1)^{r(u)}

Alternatively, the mapping r → R is defined as

  0 → 1,    1 → −1
21 / 34

Direct product of RM Decoding RM Hadamard transform

Hadamard transform

Definition
The Hadamard transform of the 2^r -tuple R is the 2^r -tuple R̂ where
for any u ∈ F_2^r ,

  R̂(u) = Σ_{v∈F_2^r} (−1)^{u·v} R(v).

Using R(v) = (−1)^{r(v)} we get,

  R̂(u) = Σ_{v∈F_2^r} (−1)^{u·v+r(v)} .

Essentially we measure the distance to the linear (Boolean) functions !

22 / 34
Direct product of RM Decoding RM Hadamard transform

Hadamard transform - example


Example
Given r = (11011100) we want to compute R̂(110), i.e. u = (110):

  R̂(110) = Σ_{v∈F_2^3} (−1)^{(110)·v+r(v)}

         = (−1)^{(110)·(100)+r(100)} + (−1)^{(110)·(010)+r(010)}
         + (−1)^{(110)·(001)+r(001)} + (−1)^{(110)·(110)+r(110)} + · · ·

         = (−1)^{1+1} + (−1)^{1+1} + (−1)^{0+0} + (−1)^{0+1} + · · · = 6

• Need to compute 7 more values for the other vectors u

• Alternatively, R̂ can be defined as (exercise 25), R̂ = RH,
  where H is a Hadamard matrix of order 2^r !

23 / 34

Direct product of RM Decoding RM Hadamard transform

Computing Hadamard transform - example

Example
For r = (11011100) we have computed
R = (−1, −1, 1, −1, −1, −1, 1, 1). Then, with

       [ 1  1  1  1  1  1  1  1 ]
       [ 1 −1  1 −1  1 −1  1 −1 ]
       [ 1  1 −1 −1  1  1 −1 −1 ]
  H8 = [ 1 −1 −1  1  1 −1 −1  1 ]
       [ 1  1  1  1 −1 −1 −1 −1 ]
       [ 1 −1  1 −1 −1  1 −1  1 ]
       [ 1  1 −1 −1 −1 −1  1  1 ]
       [ 1 −1 −1  1 −1  1  1 −1 ]

  R H8 = (−2, 2, −6, −2, −2, 2, 2, −2)

We do not get 6 as in the previous example but −2. The proper
ordering is important !
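In matrix form the transform is a one-liner (a sketch, mine), reusing
the Sylvester construction above; note that for R̂ = RH the components
of r must be indexed in the proper ordering:

import numpy as np

def hadamard(n):
    if n == 1:
        return np.array([[1]])
    H = hadamard(n // 2)
    return np.block([[H, H], [H, -H]])

r = np.array([1, 1, 0, 1, 1, 1, 0, 0])
R = (-1) ** r                  # 0 -> 1, 1 -> -1
print(R @ hadamard(8))         # -> [-2  2 -6 -2 -2  2  2 -2]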

24 / 34
Direct product of RM Decoding RM Hadamard transform

Computing Hadamard transform - explanation

The first computation was performed using the column ordering of

  B3 = [H3 0] = [ 1 0 0 1 0 1 1 0 ]
                [ 0 1 0 1 1 0 1 0 ]    all vectors of GF (2)^3
                [ 0 0 1 0 1 1 1 0 ]

which is not the proper ordering.

25 / 34

Direct product of RM Decoding RM Hadamard transform

Decoding 1-st order RM codes

The ordered columns of Br are used to associate the distinct binary
r -tuples with the coordinate positions in r, R, R̂.

Theorem
(Main theorem) R̂(u) is the number of 0's minus the number of
1's in the binary vector,

  t = r + Σ_{i=1}^{r} ui vi

where u = (u1 , u2 , . . . , ur )T and vi is the i-th row of Br .

26 / 34
Direct product of RM Decoding RM Hadamard transform

Developing the decoding procedure

• The previous result lets us use the Hadamard values to measure
  distances. The number of 0's in t = r + Σ_{i=1}^{r} ui vi is,

  t0 = 2^r − w (t) = 2^r − w (r + Σ_{i=1}^{r} ui vi ) = 2^r − d(r, Σ_{i=1}^{r} ui vi )

• Then obviously t1 = d(r, Σ_{i=1}^{r} ui vi ).

• Now using R̂(u) = t0 − t1 we have,

  R̂(u) = 2^r − 2 d(r, Σ_{i=1}^{r} ui vi )

27 / 34

Direct product of RM Decoding RM Hadamard transform

Developing the decoding procedure II

1. Another way to compute t0 is,

   t0 = w (1 + t) = w (1 + r + Σ_{i=1}^{r} ui vi ) = d(r, 1 + Σ_{i=1}^{r} ui vi )

2. Then t1 = 2^r − d(r, 1 + Σ_{i=1}^{r} ui vi ), so that

   R̂(u) = 2 d(r, 1 + Σ_{i=1}^{r} ui vi ) − 2^r

3. Finally, the decoding formulas:

   d(r, Σ_{i=1}^{r} ui vi ) = (1/2)(2^r − R̂(u)) ;
   d(r, 1 + Σ_{i=1}^{r} ui vi ) = (1/2)(2^r + R̂(u))
28 / 34

Deriving the decoding procedure

Suppose r ∈ F_2^(2^r) is a received vector. Our goal is to decode r to
the codeword closest to r.

Facts
• For any binary r-tuple u = (u1, . . . , ur), uBr = Σ_{i=1}^r ui vi.

• An (r + 1)-bit message tuple can be viewed as m = (u0, u) where
u0 ∈ {0, 1} and u ∈ F_2^r

The transmitted codeword is,

                 [ 1  ]
c = mG = (u0, u) [ Br ] = u0 · 1 + Σ_{i=1}^r ui vi.


Connection to encoding

The decoding formulas consider two cases, u0 = 0 and u0 = 1 !

Finding c closest to r = finding c minimizing d(r, c)

Exactly what is given by the decoding formulas,

d(r, Σ_{i=1}^r ui vi) = (1/2)(2^r − R̂(u)) ;  d(r, 1 + Σ_{i=1}^r ui vi) = (1/2)(2^r + R̂(u)),

for c = Σ_{i=1}^r ui vi and c = 1 + Σ_{i=1}^r ui vi !

Computing R̂(u) for all u ∈ F_2^r ⇔ computing d(r, c) for all c ∈ C

Decoding RM codes

From the decoding formulas, min_{c∈C} d(r, c) is attained by the u which
minimizes,

min_u {2^r − R̂(u), 2^r + R̂(u)}

• We are looking for the u that maximizes the magnitude of R̂(u) !

• Assuming we have found a unique u maximizing |R̂(u)|, we
have 2 cases,

c = Σ_{i=1}^r ui vi        if R̂(u) > 0  (u0 = 0)
c = 1 + Σ_{i=1}^r ui vi    if R̂(u) < 0  (u0 = 1)


Decoding algorithm for RM(1, r )

INPUT: r, a received binary vector of length 2^r; Br with columns
in the proper ordering; H, a Hadamard matrix H = H(2^r).

1. Compute R = (−1)^r and R̂ = RH

2. Find a component R̂(u) of R̂ with max. magnitude, let
u = (u1, . . . , ur)^T

3. If R̂(u) > 0, then decode r as Σ_{i=1}^r ui vi

4. If R̂(u) < 0, then decode r as 1 + Σ_{i=1}^r ui vi

Decoding RM(1, r ) - example

Example Let B3 for an RM(1, 3) code be in proper order,

     [ 0 1 0 1 0 1 0 1 ]   [ v1 ]
B3 = [ 0 0 1 1 0 0 1 1 ] = [ v2 ]   (all vectors of GF(2)^3)
     [ 0 0 0 0 1 1 1 1 ]   [ v3 ]

The corresponding generator matrix is,

    [ 1 1 1 1 1 1 1 1 ]
    [ 0 1 0 1 0 1 0 1 ]
G = [ 0 0 1 1 0 0 1 1 ]
    [ 0 0 0 0 1 1 1 1 ]

For a received r = (01110110) we need to compute R and R̂ !


Decoding RM(1, r ) - example cont.

Easy to compute R = (−1)^r = (1, −1, −1, −1, 1, −1, −1, 1). Also,

R̂ = RH = (−2, 2, 2, 6, −2, 2, 2, −2).

Thus, max |R̂(u)| = 6 and u = (110)^T.

Since R̂(110) = 6 > 0 then,

c = Σ_{i=1}^3 ui vi = 1 · v1 + 1 · v2 + 0 · v3 = (01100110)

Given R̂ and the decoding formulas we can find the distance to each of the
codewords. E.g. R̂(000) = −2, so d(r, 0) = 5 and d(r, 1) = 3.
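The whole procedure fits in a few lines. The following minimal Python
sketch (computing R̂ directly from the definition rather than via a stored
Hadamard matrix, and assuming the proper ordering of the Br columns)
reproduces the example above.

# Decoding RM(1,r) -- a sketch following steps 1-4 of the algorithm,
# with the proper ordering (column j of Br = binary expansion of j).

def decode_rm1(r_bits):
    n = len(r_bits)                                   # n = 2^r
    R = [(-1) ** b for b in r_bits]
    R_hat = [sum(R[v] * (-1) ** bin(u & v).count("1") for v in range(n))
             for u in range(n)]                       # R_hat = R H(2^r)
    u = max(range(n), key=lambda i: abs(R_hat[i]))    # max. magnitude
    # The vector sum_i ui*vi has, at position j, the parity of u AND j
    c = [bin(u & j).count("1") % 2 for j in range(n)]
    if R_hat[u] < 0:                                  # u0 = 1: add all-ones
        c = [1 - bit for bit in c]
    return c

print(decode_rm1([0, 1, 1, 1, 0, 1, 1, 0]))  # -> [0, 1, 1, 0, 0, 1, 1, 0]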
Chapter 8

Fast decoding of RM codes and higher order RM codes

Contents of the chapter:


• Fast Hadamard transform
• RM codes and Boolean functions

• Self-Dual codes


Decoding complexity

Recall that the main decoding step is to compute,

R̂ = RH

Complexity of decoding = computing the vector-matrix product.

Example
Mariner used the RM(1,5) code to correct up to 7 errors.
Received vectors of length 2^5 = 32 are multiplied with H(2^5),
requiring ca. 2^(2r+1) = 2^11 operations (back in the 70’s).

What if r is large, can we do better ?


Decoding complexity II

• YES, we utilize the structure of the Hadamard matrix to
reduce the complexity to r·2^r.

Definition
For A = [aij] and B = [bij] of order m and n respectively, define the
Kronecker product as the mn × mn matrix,

A × B = [aij B]

So we are back again to product codes.


Decoding erasures revisited

• Our (16,9,4) code could only correct 1 error !

• Assume we could correct this error using RM decoding. Then the
number of computations is:

16 × (16 + 15) ≈ 16 × 32 = 512 = 2^9

with 16 multiplications and 15 additions per column.

• Using our decoding approach we only need to check rows (or
columns), which gives,

Number of operations = 4 × (4 × (4 + 3)) = 112


The Kronecker product


The Hadamard matrix H(2^r) can be viewed as,

H(2^r) = H2 × H2 × · · · × H2   (r times)

Example

     [ 1  1 ]        [ 1  1  1  1 ]
H2 = [ 1 −1 ]   H4 = [ 1 −1  1 −1 ]
                     [ 1  1 −1 −1 ]
                     [ 1 −1 −1  1 ]

A useful property of the Kronecker product is (Lemma 4.3),

(A × B)(C × D) = AC × BD
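Both facts are easy to check numerically. A small numpy sketch (using
np.kron for the Kronecker product) builds H(2^r) as an r-fold Kronecker
power and tests the mixed-product property on arbitrary small matrices;
it also recomputes the RH8 example from the previous chapter.

import numpy as np

H2 = np.array([[1, 1], [1, -1]])

def H(r):
    """H(2^r) = H2 x H2 x ... x H2 (r times)."""
    M = np.array([[1]])
    for _ in range(r):
        M = np.kron(M, H2)
    return M

# Mixed-product property (Lemma 4.3): (A x B)(C x D) = AC x BD
A = np.array([[1, 2], [3, 4]]); C = np.array([[0, 1], [1, 0]])
B = np.eye(2, dtype=int);       D = H2
assert np.array_equal(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))

# RH8 from the example in the previous chapter:
R = np.array([-1, -1, 1, -1, -1, -1, 1, 1])
print(R @ H(3))   # -> [-2  2 -6 -2 -2  2  2 -2]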

The fast Hadamard transform


IDEA Split the computation into chunks (the so-called butterfly
structure)

Theorem
For a positive integer r,

H(2^r) = M^(1) M^(2) · · · M^(r),

where M^(i) = I_(2^(r−i)) × H2 × I_(2^(i−1))   (all of order 2^r).

It turns out that fewer operations are required for computing
R M^(1) M^(2) · · · M^(r) than RH(2^r) directly !
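The factorization itself is easy to verify with numpy; a direct check for
r = 3, in the same style as the previous sketch:

import numpy as np

H2 = np.array([[1, 1], [1, -1]])

def M(i, r):
    """M^(i) = I_(2^(r-i)) x H2 x I_(2^(i-1)), of order 2^r."""
    return np.kron(np.kron(np.eye(2 ** (r - i), dtype=int), H2),
                   np.eye(2 ** (i - 1), dtype=int))

r = 3
H8 = np.kron(np.kron(H2, H2), H2)
assert np.array_equal(M(1, r) @ M(2, r) @ M(3, r), H8)   # H(2^3) = M^(1) M^(2) M^(3)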


Decomposition - example

For r = 2 we need to show that,

H4 = M4^(1) M4^(2)

where,

M4^(1) = I2 × H2 × I1 = [ 1 0 ] ⊗ [ 1  1 ] ⊗ [1]
                        [ 0 1 ]   [ 1 −1 ]

            [ 1  1  0  0 ]
I2 × H2 =   [ 1 −1  0  0 ]
            [ 0  0  1  1 ]
            [ 0  0  1 −1 ]


Decomposition - example II

 
                   [ 1  1 ]   [ 1 0 ]   [ 1  0  1  0 ]
M4^(2) = H2 × I2 = [ 1 −1 ] ⊗ [ 0 1 ] = [ 0  1  0  1 ]
                                        [ 1  0 −1  0 ]
                                        [ 0  1  0 −1 ]

Finally we confirm below that M4^(1) M4^(2) = H4:

[ 1  1  0  0 ] [ 1  0  1  0 ]   [ 1  1  1  1 ]
[ 1 −1  0  0 ] [ 0  1  0  1 ]   [ 1 −1  1 −1 ]
[ 0  0  1  1 ] [ 1  0 −1  0 ] = [ 1  1 −1 −1 ]
[ 0  0  1 −1 ] [ 0  1  0 −1 ]   [ 1 −1 −1  1 ]


Computing M matrices

                        [ 1  1  .  .  .  .  .  . ]
                        [ 1 −1  .  .  .  .  .  . ]
                        [ .  .  1  1  .  .  .  . ]
M8^(1) = I4 × H2 × I1 = [ .  .  1 −1  .  .  .  . ]
                        [ .  .  .  .  1  1  .  . ]
                        [ .  .  .  .  1 −1  .  . ]
                        [ .  .  .  .  .  .  1  1 ]
                        [ .  .  .  .  .  .  1 −1 ]

                        [ 1  .  .  .  1  .  .  . ]
                        [ .  1  .  .  .  1  .  . ]
                        [ .  .  1  .  .  .  1  . ]
M8^(3) = H2 × I4 =      [ .  .  .  1  .  .  .  1 ]
                        [ 1  .  .  . −1  .  .  . ]
                        [ .  1  .  .  . −1  .  . ]
                        [ .  .  1  .  .  . −1  . ]
                        [ .  .  .  1  .  .  . −1 ]

(dots denote zero entries)

Computing fast Hadamard transform - example

In the previous example we had
R = (−1)^r = (1, −1, −1, −1, 1, −1, −1, 1) and

R̂ = RH(2^3) = (−2, 2, 2, 6, −2, 2, 2, −2).

Computing via M matrices gives,

RM8^(1) = (0, 2, −2, 0, 0, 2, 0, −2)
(RM8^(1))(M8^(2)) = (−2, 2, 2, 2, 0, 0, 0, 4)
(RM8^(1) M8^(2))M8^(3) = (−2, 2, 2, 6, −2, 2, 2, −2)

• The many zeros in the M matrices yield a more efficient computation.
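Since each M^(i) has only two nonzero entries per column, multiplying by
it amounts to 2^(r−1) "butterfly" operations (one addition and one
subtraction each). A pure-Python sketch of the resulting fast transform,
which prints exactly the intermediate vectors above:

def fast_hadamard(R):
    """In-place fast Hadamard transform: multiply by M^(1), ..., M^(r)."""
    R = list(R)
    n = len(R)
    step = 1                              # step = 2^(i-1) when applying M^(i)
    while step < n:
        for j in range(0, n, 2 * step):
            for k in range(j, j + step):
                a, b = R[k], R[k + step]
                R[k], R[k + step] = a + b, a - b   # one butterfly
        print(R)                          # intermediate result after M^(i)
        step *= 2
    return R

fast_hadamard([1, -1, -1, -1, 1, -1, -1, 1])
# [0, 2, -2, 0, 0, 2, 0, -2]
# [-2, 2, 2, 2, 0, 0, 0, 4]
# [-2, 2, 2, 6, -2, 2, 2, -2]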



Comparing the complexity for Hadamard transform

We compute the total number of operations for the two cases.

                 RH(2^r)                            RM^(1) · · · M^(r)
Mult./column     2^r                                2
Add./column      2^r − 1                            1
Total            2^r(2^r + 2^r − 1) ≈ 2^(2r+1)      3r·2^r

The complexity ratio is thus,

3r·2^r / (2^r(2^r + 2^r − 1)) = 3r / (2^(r+1) − 1)

For the RM(1,5) code (Mariner) the decoding requires 3r·2^r = 480
operations; a standard array would need storage for 2^32/2^6 = 2^26 coset
leaders.


RM codes through Boolean functions

• Boolean functions map n binary inputs to a single binary
output.

• More formally, f : F_2^n → F_2 maps (F_2^n = GF(2)^n)

(x1, . . . , xn) ∈ F_2^n ↦ f(x) ∈ F_2

• f : F_2^n → F_2 can be represented as a polynomial in the ring

F_2[x1, . . . , xn] / <x1^2 = x1, . . . , xn^2 = xn>

• This ring is simply the set of all polynomials with binary
coefficients in n indeterminates, with the property that xi^2 = xi.


Truth table - Example

The truth table of f is the evaluation of f for all possible inputs.


Example
E.g. for f (x1 , x2 , x3 ) = x1 x2 + x2 x3 + x3

x3 x2 x1 f (x)
0 0 0 0
0 0 1 0
0 1 0 0
0 1 1 1
1 0 0 1
1 0 1 1
1 1 0 0
1 1 1 1

Important: the truth tables of 1, x1, x2, x3 span the RM(1,3) code !!
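Generating such a truth table is a one-liner. A small Python sketch
reproducing the f(x) column above (x1 is the fastest-changing variable,
as in the table):

# Truth table of f(x1,x2,x3) = x1x2 + x2x3 + x3 over GF(2).
def f(x1, x2, x3):
    return (x1 * x2 + x2 * x3 + x3) % 2

# Rows ordered as in the table: (x3 x2 x1) counts from 000 to 111.
table = [f(i & 1, (i >> 1) & 1, (i >> 2) & 1) for i in range(8)]
print(table)  # -> [0, 0, 0, 1, 1, 1, 0, 1], the f(x) column above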


Boolean functions-definitions

• This may be formalized further by defining,

f(x) = Σ_{c∈F_2^n} ac x^c = Σ_{c∈F_2^n} ac x1^c1 x2^c2 · · · xn^cn ,   c = (c1, . . . , cn)

• Thus f is specified by the coefficients ac

• There are 2^n different terms x1^c1 x2^c2 · · · xn^cn for different c’s. As
ac is binary, this gives 2^(2^n) different functions in n variables.

Example
For n = 3 there are 2^8 = 256 distinct functions specified by ac,

B3 = {a0·1 ⊕ a1·x1 ⊕ a2·x2 ⊕ a3·x3 ⊕ a4·x1x2 ⊕ a5·x1x3 ⊕ a6·x2x3 ⊕ a7·x1x2x3}



Higher order Reed-Muller codes


All the codewords of RM(1, r) are of weight 2^(r−1), apart from 0
and 1; these are the affine Boolean functions in r variables.

• Generalization: the t-th order Reed-Muller code, spanned by

1, x1, . . . , xr, x1x2, . . . , xr−1xr, . . . , x1···xt, . . . , xr−t+1···xr
   (linear terms)   (quadratic terms)      (terms of degree t)

• The dimension of the basis is,

k = 1 + C(r,1) + C(r,2) + · · · + C(r,t)

(C(r,i) denoting binomial coefficients). All these vectors are linearly
independent.
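As a quick sanity check of the dimension formula, a one-line Python
sketch using math.comb:

from math import comb

def rm_dimension(t, r):
    """k = 1 + C(r,1) + ... + C(r,t)."""
    return sum(comb(r, i) for i in range(t + 1))

print(rm_dimension(1, 3), rm_dimension(2, 3))  # -> 4 7  (cf. the RM(2,3) example)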



Higher order Reed-Muller codes- example

We consider the RM(2,3) code.

x3 x2 x1 |  1  x1x2  x1x3  x2x3
 0  0  0 |  1   0     0     0
 0  0  1 |  1   0     0     0
 0  1  0 |  1   0     0     0
 0  1  1 |  1   1     0     0
 1  0  0 |  1   0     0     0
 1  0  1 |  1   0     1     0
 1  1  0 |  1   0     0     1
 1  1  1 |  1   1     1     1

Seven basis vectors (1, x1, x2, x3, x1x2, x1x3, x2x3): an (8,7,2) code
(recall uniqueness). 128 codewords out of 256 binary vectors of length 8 !


Constructing higher order Reed-Muller codes

Given an RM(t, r) code, how do we construct an RM(t + 1, r + 1) ?

RM(t + 1, r + 1) = {(u, u + v) : u ∈ RM(t + 1, r), v ∈ RM(t, r)}

In terms of generator matrices this is equivalent to:

                  [ G(t + 1, r)   G(t + 1, r) ]
G(t + 1, r + 1) = [      0           G(t, r)  ]

To prove this we need an easy result on Boolean functions,

f(x1, . . . , xr, xr+1) = g(x1, . . . , xr) + xr+1 h(x1, . . . , xr)

for some g, h (decomposition).
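At the generator-matrix level the recursion is immediate to implement.
A small numpy sketch (this (u, u+v) recursion is often called the Plotkin
construction); as a toy case it builds RM(1,2) from RM(1,1) and RM(0,1):

import numpy as np

def plotkin(G_big, G_small):
    """Generator of the (u, u+v) construction: [[G, G], [0, G']]."""
    top = np.hstack([G_big, G_big])
    bottom = np.hstack([np.zeros_like(G_small), G_small])
    return np.vstack([top, bottom]) % 2

G11 = np.array([[1, 1], [0, 1]])   # RM(1,1): all of F_2^2 (rows 1, x1)
G01 = np.array([[1, 1]])           # RM(0,1): the repetition code
print(plotkin(G11, G01))
# [[1 1 1 1]
#  [0 1 0 1]
#  [0 0 1 1]]  -- rows 1, x1, x2: a generator matrix of RM(1,2)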


Constructing higher order Reed-Muller codes II

E.g.,

f(x1, x2, x3) = x1 + x2 + x1x3 + x1x2 + x1x2x3
             = x1 + x2 + x1x2 + x3·(x1 + x1x2)
               (g(x1,x2))       (h(x1,x2))

x3 x2 x1 |  1  x1x2  x1x3  x2x3 | g(x) h(x) f(x)
 0  0  0 |  1   0     0     0   |  0    0    0
 0  0  1 |  1   0     0     0   |  1    1    1
 0  1  0 |  1   0     0     0   |  1    0    1
 0  1  1 |  1   1     0     0   |  1    0    1
 1  0  0 |  1   0     0     0   |  0    0    0
 1  0  1 |  1   0     1     0   |  1    1    0
 1  1  0 |  1   0     0     1   |  1    0    1
 1  1  1 |  1   1     1     1   |  1    0    1


Proving the result on higher order RM

RM(t + 1, r + 1) = {(u, u + v) : u ∈ RM(t + 1, r), v ∈ RM(t, r)}

• A codeword from RM(t + 1, r + 1) is the evaluation of a polynomial
f(x1, . . . , xr+1) of degree ≤ t + 1

• The decomposition f(x1, . . . , xr+1) = g(x1, . . . , xr) + xr+1 h(x1, . . . , xr)
implies deg(g) ≤ t + 1 and deg(h) ≤ t

• We need the association between functions and vectors !


Proving the result on higher order RM II

• g, h are vectors of length 2^r ↔ g(x1, . . . , xr), h(x1, . . . , xr)

Viewed in F_2[x1, . . . , xr+1]:

g(x1, . . . , xr) ↔ (g, g),   xr+1 h(x1, . . . , xr) ↔ (0, h)   (vectors of length 2^(r+1))

Thus f = (g, g) + (0, h) = (g, g + h)


Important results on higher order RM codes

What about the minimum distance of RM(t, r), t > 1 ?

Recall: For t = 1 we had d(C) = 2^(r−1) ! Generalization ?

Theorem RM(t, r) has minimum distance 2^(r−t).

Proof Fix t and use induction on r. What is a typical codeword of
RM(t, r + 1) ? Details of the proof: exercise.

Another important result is given by,

Theorem The dual code of RM(t, r) is RM(r − t − 1, r).


Dual code proof (optional)

Proof Take a ∈ RM(r − t − 1, r) and b ∈ RM(t, r).

Alternatively we may consider a(x1, . . . , xr) with deg(a) ≤ r − t − 1
and b(x1, . . . , xr) with deg(b) ≤ t. Then deg(ab) ≤ r − 1.

Thus, ab ∈ RM(r − 1, r) and has even weight, so a · b ≡ 0 mod 2 !

Therefore, RM(r − t − 1, r) ⊂ RM(t, r)⊥. But,

dim RM(r − t − 1, r) + dim RM(t, r)
= [1 + C(r,1) + · · · + C(r, r − t − 1)] + [1 + C(r,1) + · · · + C(r,t)] = 2^r

So RM(r − t − 1, r) = RM(t, r)⊥   (since dim RM(t, r)⊥ = 2^r − dim RM(t, r))


Self-Dual codes - Motivation

• A very interesting class from both a theoretical and a practical point of
view.

The main properties:

• In some cases easily decodable

• Additional algebraic structure

• The problem of classifying these codes and finding the
minimum weight for a given length

There is no general constructive method for finding these codes; different
constructions exist for each n, e.g. the Golay code.


Self-orthogonal codes

Definition
A linear code C is self-orthogonal if C ⊂ C⊥.
Each codeword is orthogonal to every codeword (including itself) !

Example
The matrix,

G = [ 1 0 0 1 0 ]
    [ 1 0 1 1 1 ]

generates a self-orthogonal code C. One can check that ci · cj = 0 for all
codewords ci, cj.

• But more importantly, GG^T = 0 !! Coincidence ?


Self-orthogonal codes

Theorem
(Lemma 4.5) A linear code C is self-orthogonal iff GG^T = 0.

Proof.
(sketch) Assume C ⊂ C⊥. Let ri be a row of G. Then,

ri ∈ C, C ⊂ C⊥ ⇒ ri ∈ C⊥

Since G is a parity-check matrix of C⊥, we have G ri^T = 0. As this holds
for every ri, GG^T = 0.
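The lemma turns self-orthogonality into a one-line test. A small numpy
sketch, checking both the (5,2) example above and the (8,4) self-dual
example below:

import numpy as np

def is_self_orthogonal(G):
    """C is self-orthogonal iff G G^T = 0 over F_2 (Lemma 4.5)."""
    G = np.array(G)
    return not ((G @ G.T) % 2).any()

G52 = [[1, 0, 0, 1, 0],            # the (5,2) example above
       [1, 0, 1, 1, 1]]
G84 = [[1, 1, 1, 1, 1, 1, 1, 1],   # the self-dual (8,4) example below
       [1, 0, 0, 1, 1, 0, 1, 0],
       [0, 1, 0, 0, 1, 1, 1, 0],
       [0, 0, 1, 1, 0, 1, 1, 0]]
print(is_self_orthogonal(G52), is_self_orthogonal(G84))  # -> True True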


Self-dual codes

Definition
A linear code C is self-dual if C = C⊥

A self-dual code is clearly self-orthogonal, but the converse need
not be true.

E.g. the (5,2) code in the previous example cannot be self-dual. Why ?
(Its dual has dimension 5 − 2 = 3 ≠ 2.)


Self-dual codes - example

The generator matrix,

    [ 1 1 1 1 1 1 1 1 ]
G = [ 1 0 0 1 1 0 1 0 ]
    [ 0 1 0 0 1 1 1 0 ]
    [ 0 0 1 1 0 1 1 0 ]

defines a binary self-dual (8, 4) code C.

For binary codes one only needs to check that GG^T = 0 and n = 2k.

Lemma If G = [Ik B] for a self-dual (n, k) code C, then BB^T = −Ik.
