IntroToCBC_Chapter1
m : a message to be sent
C(·) : an encoder
e : noise introduced in the communication
D(·) : a decoder
m = D (C(m) + e) .
One of the earliest coding schemes for detecting errors in computers is the use of a parity check digit.
For example, the ASCII (American Standard Code for Information Inter-
change) coding system uses a byte (binary 8-tuple) to represent characters
used in early computers.
Although there are 256 (= 2^8) possible values for a byte, there are only 128 (= 2^7) ASCII characters. Only seven bits in a byte are needed. The extra bit in the binary 8-tuple is called a parity check digit, which is used for error detection.
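As a small illustration, the parity check digit can be computed and checked as follows (a minimal Python sketch; the even-parity convention used here is an assumption, since some systems use odd parity instead):

```python
def add_parity(bits7):
    """Append an even-parity check digit to a 7-bit tuple,
    so the resulting byte has an even number of 1s.
    (Even parity is assumed here; odd parity is also used in practice.)"""
    return bits7 + (sum(bits7) % 2,)

def check_parity(byte8):
    """Return True if the byte passes the parity check.
    A single flipped bit (or any odd number of errors) is detected."""
    return sum(byte8) % 2 == 0
```

Note that an even number of bit errors cancels out in the parity sum, so this scheme detects only an odd number of errors and cannot correct any.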
Example 1.1. The ACSII codewords representing A, B and C are
Let v = (v_{1,1} … v_{1,r} ‖ v_{2,1} … v_{2,r} ‖ … ‖ v_{n,1} … v_{n,r}) be the received message with noise introduced during the communication, i.e. v = C(x) + e. For each 1 ≤ i ≤ n, denote by Z_i the number of v_{i,j} such that v_{i,j} = 0, counting over all j's, and by O_i the number of v_{i,j} such that v_{i,j} = 1, counting over all j's. Then w' = D(v) = (w_1, …, w_n) where

    w_i = 1 if O_i > Z_i;
    w_i = 0 if Z_i > O_i.
Remark 1.4. What if O_i = Z_i? In this case the number of errors is beyond the error-correcting capacity of the repetition code (namely ⌊(n−1)/2⌋). More on this will be discussed later.
In general, the decoding for repetition codes is slow and inefficient. The
main attraction of the repetition code is the ease of implementation.
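The encoding and majority-vote decoding described above can be sketched in a few lines of Python (the function names are illustrative, not from the text):

```python
def rep_encode(msg, r):
    """Encode each symbol of a binary message by repeating it r times."""
    return [b for b in msg for _ in range(r)]

def rep_decode(word, r):
    """Majority-vote decoding: each block of r received bits is decoded
    to 1 if ones outnumber zeros, and to 0 otherwise.
    A tie (only possible for even r) exceeds the error-correcting
    capacity, as in Remark 1.4."""
    out = []
    for i in range(0, len(word), r):
        ones = sum(word[i:i + r])
        if 2 * ones == r:
            raise ValueError("tie: beyond the error-correcting capacity")
        out.append(1 if 2 * ones > r else 0)
    return out
```

For r = 3, any single bit error within a block is corrected, since the other two copies still outvote it.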
In general coding theory, we want to construct encoders and decoders for more general families of codes.
Definition 1.8. A q-ary block code (or simply, code) of length n over A is a nonempty subset C of A^n. An element of C is called a codeword of C.
Definition 1.9. The number of codewords in C, denoted by |C|, is called the size of C. The information rate of C is defined to be (log_q |C|)/n. A code of length n and size M is called an (n, M)-code.
Example 1.10. The ASCII code with parity check digit is an (8, 128)-code, with information rate (log_2 128)/8 = 7/8.
Let Z_q ≅ Z/qZ be an additive group using addition modulo q.
Definition 1.12 (Hamming metric). The Hamming norm wt_H(x) of a vector x is defined as the number of nonzero coordinates of x. The Hamming distance between x and y ∈ A^n is the norm of their difference: d_H(x, y) = wt_H(x − y).
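Both quantities are straightforward to compute; the following sketch works over Z_q with coordinatewise subtraction modulo q (function names are illustrative):

```python
def wt_hamming(x):
    """Hamming norm: the number of nonzero coordinates of x."""
    return sum(1 for xi in x if xi != 0)

def d_hamming(x, y, q):
    """Hamming distance: the norm of the coordinatewise difference mod q,
    i.e. the number of positions where x and y differ."""
    return wt_hamming([(xi - yi) % q for xi, yi in zip(x, y)])
```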
Let F_{q^m} be a finite field with q^m elements, where q is a prime power. We write ⟨a_1, …, a_m⟩_{F_q} for the F_q-linear span of the elements a_1, …, a_m. In fact, F_{q^m} is an F_q-linear space of dimension m. In particular, there exist m elements β_1, …, β_m ∈ F_{q^m} such that {β_1, …, β_m} forms a basis for F_{q^m}, i.e. F_{q^m} = ⟨β_1, …, β_m⟩_{F_q}. We will omit the subscript F_q if the context is clear.
Definition 1.17 (Rank Support). Let x = (x_1, …, x_n) ∈ F_{q^m}^n. The support of x, Supp(x), is the F_q-vector space spanned by the elements x_1, …, x_n, i.e. Supp(x) = ⟨x_1, …, x_n⟩.
Definition 1.18 (Rank metric). Let x = (x_1, …, x_n) ∈ F_{q^m}^n. The rank weight of x is defined to be the dimension of the support of x, i.e.

    wt_R(x) = dim(Supp(x)).

Alternatively, let {β_1, …, β_m} be a basis of F_{q^m}. For each 1 ≤ i ≤ n, we can write x_i as an F_q-linear combination of the basis, i.e. there exist c_{ji} ∈ F_q such that

    x_i = Σ_{j=1}^m c_{ji} β_j.
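With coordinates given by their coefficient vectors (c_{1i}, …, c_{mi}), the rank weight is just the rank of the n × m coefficient matrix over F_q. The sketch below computes this by Gaussian elimination for the special case where q = p is prime (an assumption made to keep the field arithmetic elementary):

```python
def rank_weight(coeff_vectors, p=2):
    """Rank weight of x in F_{p^m}^n, where each coordinate x_i is given
    by its coefficient vector over F_p in some fixed basis.
    Computed as the rank over F_p of the matrix whose rows are the
    coefficient vectors (sketch for prime p only)."""
    rows = [list(v) for v in coeff_vectors]
    rank, m = 0, len(rows[0])
    for col in range(m):
        # find a pivot row with a nonzero entry in this column
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python >= 3.8)
        rows[rank] = [(e * inv) % p for e in rows[rank]]
        # eliminate this column from every other row
        for r in range(len(rows)):
            if r != rank and rows[r][col] % p:
                f = rows[r][col]
                rows[r] = [(a - f * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank
```

For example, over F_2 with m = 2, the coordinates with coefficient vectors (1,0), (0,1), (1,1) span the whole 2-dimensional space, so the rank weight is 2 even though the Hamming weight is 3.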
Definition 1.20 (Minimum distance). Let C be a code of length n over A with C ≠ {0}. The minimum distance of C, denoted by d(C), is

    d(C) = min{ d(x, y) : x, y ∈ C, x ≠ y }.

Note that this definition applies to the Hamming, Lee and rank metrics. A code of length n, size M and distance d is referred to as an (n, M, d)-code.
c′ is received. Since c′ is a codeword, one is not aware of the existence of errors. So C is not d-error-detecting. Therefore, C is exactly (d − 1)-error-detecting.
Suppose that a codeword c′ is sent and the word x is received with at most ⌊(d−1)/2⌋ errors, that is, d_H(c′, x) ≤ ⌊(d−1)/2⌋ < d/2. For any c ∈ C such that c ≠ c′,

    d = d_H(C) ≤ d_H(c′, c)
              ≤ d_H(c′, x) + d_H(x, c)
              < d/2 + d_H(x, c)
    ⇒ d_H(x, c) > d − d/2 = d/2 > d_H(c′, x).

By the minimum distance decoding rule, the receiver decodes x back to c′ and hence the errors are corrected. So C is ⌊(d−1)/2⌋-error-correcting.
On the other hand, since d_H(C) = d, there exist codewords c, c′ ∈ C such that d_H(c, c′) = d. Without loss of generality, assume that c and c′ differ in exactly the first d digits, i.e.

    c = c_1 … c_d w_{d+1} … w_n,

where c differs from c′ in the first d digits and agrees with c′ on the remaining digits w_{d+1}, …, w_n. Let v = ⌊(d−1)/2⌋. Note that 2v < d ≤ 2(v + 1). If c is sent, there is a chance that by incurring v + 1 errors, the word x = c′_1 … c′_{v+1} c_{v+2} … c_d w_{d+1} … w_n is received. Then d_H(x, c) = v + 1 while d_H(x, c′) = d − (v + 1) ≤ v + 1, so the minimum distance decoding rule need not decode x back to c. Hence C is not (v + 1)-error-correcting.
If one error occurred during the transmission, the receiver can always correct
it using the minimum distance decoding rule.
If there are two errors, then the receiver can detect the existence of errors but may not be able to decode correctly.
If there are three or more errors, then the receiver may not detect the existence of errors; and even if he could detect them, he would decode wrongly.
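The minimum distance decoding rule itself can be sketched as a brute-force nearest-codeword search (practical only for small codes; the function name is illustrative):

```python
def min_distance_decode(x, codebook):
    """Minimum distance decoding rule: decode x to a nearest codeword.
    Returns (codeword, distance); ties are broken arbitrarily, which is
    exactly the case that lies beyond the error-correcting capacity."""
    dist = lambda u, v: sum(a != b for a, b in zip(u, v))
    return min(((c, dist(x, c)) for c in codebook), key=lambda t: t[1])
```

For the binary repetition code of length 3 (d = 3), a received word with one error is decoded back to the sent codeword, as the discussion above predicts.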
Remark 1.27. The linear code defined in Definition 1.26 is endowed with the Hamming metric. Analogously, the definition can be extended to the rank metric and the Lee metric. We will introduce these definitions whenever necessary.
a_1 c_1 + a_2 c_2 + ⋯ + a_k c_k for some a_1, …, a_k ∈ F_q.
Proof. By Definition 1.20, there exist x, y ∈ C such that d_H(C) = d_H(x, y). By Definition 1.12, d_H(x, y) = wt_H(x − y). Then

    d_H(C) = wt_H(x − y) ≥ wt_H(C),

since x − y ∈ C as C is linear. On the other hand, by Definition 1.28, there exists z ∈ C such that wt_H(C) = wt_H(z). Then

    wt_H(C) = wt_H(z) = d_H(z, 0) ≥ d_H(C),

since 0 ∈ C. Therefore d_H(C) = wt_H(C).
Example 1.30. Determine a basis, the size and the minimum distance for
the following codes:
• The distance of a linear code equals its minimum weight and hence is easier to determine.
• The encoding and decoding for a linear code are simpler and faster than those for an arbitrary code.
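The first point can be exploited directly: to find the minimum distance of a linear code it suffices to enumerate the codewords spanned by a basis and take the smallest nonzero weight (a brute-force sketch, feasible only for small k; names are illustrative):

```python
from itertools import product

def codewords(basis, q):
    """Generate all q^k codewords spanned by the k basis vectors over F_q."""
    n = len(basis[0])
    for coeffs in product(range(q), repeat=len(basis)):
        yield tuple(sum(a * g[i] for a, g in zip(coeffs, basis)) % q
                    for i in range(n))

def min_weight(basis, q):
    """Minimum distance of the linear code = smallest nonzero codeword weight."""
    return min(sum(1 for v in c if v) for c in codewords(basis, q) if any(c))
```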
Definition 1.31 (Dot Product). For x = (x_1, …, x_n), y = (y_1, …, y_n) ∈ F_q^n, the dot product of x and y is defined as

    x · y = x y^T = Σ_{i=1}^n x_i y_i.
Definition 1.32 (Dual Code). Let C ⊆ F_q^n be a linear code. The dual code C⊥ is defined to be

    C⊥ = { x ∈ F_q^n : x · c = 0 for all c ∈ C }.
Remark 1.33. The dot product over a finite field is not a true inner product. It does not satisfy the following axiom of an inner product: u · u ≥ 0, with equality if and only if u = 0. Indeed, there is no ordering relation “<” on a finite field. Also, there exist nonzero vectors u such that u · u = 0. For instance, 111 ∈ F_3^3 and 111 · 111 = 1 + 1 + 1 = 0.
Also, C ∩ C⊥ may be non-trivial and C + C⊥ may not be equal to F_q^n (for example C = span{1100, 0011} ⊆ F_2^4).
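The example C = span{1100, 0011} can be checked by direct enumeration; in fact this code turns out to equal its own dual, so C ∩ C⊥ = C and C + C⊥ = C ≠ F_2^4 (a small verification sketch):

```python
from itertools import product

def dot(u, v, q=2):
    """Dot product over F_q."""
    return sum(a * b for a, b in zip(u, v)) % q

# C = span{1100, 0011} over F_2: all F_2-linear combinations of the basis
basis = [(1, 1, 0, 0), (0, 0, 1, 1)]
C = {tuple((a * basis[0][i] + b * basis[1][i]) % 2 for i in range(4))
     for a, b in product(range(2), repeat=2)}

# C_dual = all vectors of F_2^4 orthogonal to every codeword of C
C_dual = {x for x in product(range(2), repeat=4)
          if all(dot(x, c) == 0 for c in C)}
```

Running this shows C == C_dual, and dot((1,1,1), (1,1,1), 3) == 0 confirms the self-orthogonal vector 111 ∈ F_3^3 from the remark.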
In other words, if C is an [n, k]-linear code over Fq , then the dual code C ⊥ is
an [n, n − k]-linear code over Fq .
On the other hand, for x ∈ F_q^n,

    x ∈ C⊥ ⇔ G x^T = 0,

since G x^T is the column vector (c_1 · x, c_2 · x, …, c_k · x)^T, where c_1, …, c_k are the rows of G.
For x ∈ F_q^n,

    x ∈ C ⇔ x = Σ_{i=1}^k a_i g_i for some a_i ∈ F_q
          ⇔ x = Σ_{i=1}^k a_i (g_{i1}, …, g_{in}) for some a_i ∈ F_q
          ⇔ x = (a_1, …, a_k) [ g_{11} g_{12} … g_{1n} ; g_{21} g_{22} … g_{2n} ; ⋮ ; g_{k1} g_{k2} … g_{kn} ] for some a_i ∈ F_q
          ⇔ x = aG for some a ∈ F_q^k.
Definition 1.34 (Generator Matrix). The matrix G defined above is called
a generator matrix for C.
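Encoding with a generator matrix is exactly the map a ↦ aG, which takes a few lines to sketch in Python (the particular matrix below is a made-up example, not one from the text):

```python
def encode(a, G, q):
    """Encode a message a in F_q^k as the codeword aG in F_q^n,
    where G is a k x n generator matrix with entries in F_q."""
    k, n = len(G), len(G[0])
    return tuple(sum(a[i] * G[i][j] for i in range(k)) % q for j in range(n))

# A hypothetical binary [4, 2] generator matrix in standard form [I_2 | X]
G = [(1, 0, 1, 1),
     (0, 1, 0, 1)]
```

As a increases over all of F_2^2, the map produces the 2^2 = 4 codewords of the code, illustrating that a linear [n, k]-code has size q^k.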
Proof. Since G is a generator matrix for C, for all x ∈ C there exists a ∈ F_q^k such that x = aG. Let H be a generator matrix for C′; for all y ∈ C′, there exists b ∈ F_q^{n−k} such that y = bH. We want to show that C′ = C⊥. Then

    y · x = bH(aG)^T = b [−X^T | I_{n−k}] [ I_k ; X^T ] a^T = b 0_{(n−k)×k} a^T = 0,

where [ I_k ; X^T ] denotes G^T with its blocks stacked vertically.
Let h_1, …, h_{n−k} be the rows of H, and note that {h_1, …, h_{n−k}} is a basis for C⊥. For any x ∈ C, h_i · x = h_i x^T = 0 for all i. Then

    H x^T = ( h_1 ; h_2 ; ⋮ ; h_{n−k} ) x^T = ( h_1 x^T ; h_2 x^T ; ⋮ ; h_{n−k} x^T ) = 0_{(n−k)×1},

so x ∈ null(H), i.e. C ⊆ null(H).
dim(null(H)) = nullity(H)
= n − rk(H)
= n − dim(C ⊥ )
= n − (n − dim(C)) = dim(C).
Therefore C = null(H).
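For a generator matrix in standard form G = [I_k | X] (the assumption used in the proof above), the parity check matrix H = [−X^T | I_{n−k}] can be built mechanically, and the identity H G^T = 0 verified (a sketch with illustrative names):

```python
def parity_check_from_generator(G, q):
    """Given a generator matrix in standard form G = [I_k | X],
    return H = [-X^T | I_{n-k}] with entries reduced mod q."""
    k, n = len(G), len(G[0])
    X = [row[k:] for row in G]
    H = []
    for r in range(n - k):
        row = [(-X[c][r]) % q for c in range(k)]          # the -X^T block
        row += [1 if c == r else 0 for c in range(n - k)]  # the I_{n-k} block
        H.append(row)
    return H

def prod_with_transpose(H, G, q):
    """Compute H G^T mod q; it should be the zero matrix."""
    return [[sum(h[i] * g[i] for i in range(len(g))) % q for g in G]
            for h in H]
```

For example, G = [I_2 | X] with rows (1,0,1,1) and (0,1,0,1) over F_2 yields H with rows (1,0,1,0) and (1,1,0,1), and every entry of H G^T is 0.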
Theorem 1.5. Let C be a linear code over Fq with parity check matrix H.
Then
(ii) dH (C) < d if and only if there exist d−1 columns of H that are linearly
dependent.
Proof. Since (i) and (ii) are logically equivalent, it suffices to show statement (ii).
Let H = [ k_1^T | … | k_n^T ], where k_i^T is the i-th column of H. For any x = (x_1, …, x_n) ∈ F_q^n, suppose that wt_H(x) = t and the nonzero coordinates are at the positions i_1, …, i_t, i.e. x_j = 0 if and only if j ∉ {i_1, …, i_t}. Then

    H x^T = [ k_1^T | … | k_n^T ] (x_1, …, x_n)^T = Σ_{j=1}^n x_j k_j^T = Σ_{j ∈ {i_1,…,i_t}} x_j k_j^T.
(⇒) Suppose that dH (C) = wtH (C) = t < d. Then there exists a codeword
x = x1 . . . xn ∈ C of weight t and the nonzero coordinates are at i1 , . . . , it .
14
Since x ∈ C, we have H x^T = Σ_{j ∈ {i_1,…,i_t}} x_j k_j^T = 0, and hence k_{i_1}^T, …, k_{i_t}^T are linearly dependent. By adding any d − 1 − t other columns of H, we obtain d − 1 linearly dependent columns of H.
(⇐) Suppose that H has d − 1 linearly dependent columns k_{i_1}^T, …, k_{i_{d−1}}^T, i.e.

    Σ_{j=1}^{d−1} a_j k_{i_j}^T = 0_{(n−k)×1},

where not all a_j's are zero. Choose a word x = x_1 … x_n such that x_{i_j} = a_j for 1 ≤ j ≤ d − 1 and x_s = 0 otherwise. Then H x^T = Σ_{j=1}^{d−1} a_j k_{i_j}^T = 0_{(n−k)×1}. Hence x ∈ C by Theorem 1.4. Since wt_H(x) ≤ d − 1, we have d_H(C) = wt_H(C) ≤ wt_H(x) ≤ d − 1 < d.
Corollary 1.6. Let C be a linear code with a parity check matrix H. Then d_H(C) = d if and only if any d − 1 columns of H are linearly independent and there exist d columns of H that are linearly dependent.
Example 1.39. Let C be a binary [5, 2]-linear code with a parity check matrix

    H = [ 1 1 0 0 0
          1 0 1 1 0
          1 0 1 0 1 ].

H has no zero column ⇒ d_H(C) > 1.
No two columns are multiples of each other, which means that any 3 − 1 = 2 columns of H are linearly independent ⇒ d_H(C) ≥ 3.
There exist 4 − 1 = 3 columns of H that are linearly dependent, namely the 3rd, 4th and 5th columns ⇒ d_H(C) < 4.
Therefore dH (C) = 3, C is 1-error-correcting and 2-error-detecting.
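The conclusion of Example 1.39 can also be reached by brute force: enumerate null(H) over F_2^5 and take the smallest nonzero weight (feasible here since 2^5 = 32):

```python
from itertools import product

# The parity check matrix of Example 1.39
H = [(1, 1, 0, 0, 0),
     (1, 0, 1, 1, 0),
     (1, 0, 1, 0, 1)]

# C = null(H): all x in F_2^5 with H x^T = 0
C = [x for x in product(range(2), repeat=5)
     if all(sum(h[i] * x[i] for i in range(5)) % 2 == 0 for h in H)]

# Minimum distance = minimum nonzero weight, since C is linear
d = min(sum(x) for x in C if any(x))
```

The enumeration finds |C| = 2^2 = 4 codewords (as expected for a [5, 2]-code) and d = 3, agreeing with the column argument above.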