coding theory project
by
ARYA FRANCIS
DB20CMSR11
Department Of Mathematics
Don Bosco Arts And Science College
Angadikadavu, Iritty
March 2023
Examiners:
1.
2.
CERTIFICATE
ARYA FRANCIS
DB20CMSR11
ACKNOWLEDGEMENT
I’d also like to thank my friends and parents for their support
and encouragement as I worked on this assignment.
INTRODUCTION
PRELIMINARY
2 LINEAR CODE
2.1 Linear code
2.2 Two Important Subspaces
2.3 Independence, Basis, Dimension
2.4 Matrices
2.5 Bases for C=<S> and C⊥
2.6 Generating Matrices and Encoding
2.7 Parity Check Matrices
2.8 Distance of Linear Code
2.9 Cosets
2.10 MLD for Linear Code
CONCLUSION
BIBLIOGRAPHY
INTRODUCTION
Coding theory is the study of the properties of codes and their re-
spective fitness for specific applications. Codes are used for data
compression, cryptography, error detection and correction, data
transmission and data storage. Codes are studied by various scien-
tific disciplines—such as information theory, electrical engineering,
mathematics, linguistics, and computer science— for the purpose
of designing efficient and reliable data transmission methods. This
typically involves the removal of redundancy and the correction or
detection of errors in the transmitted data.
The binary Golay code was developed in 1949. It is an error-correcting
code capable of correcting up to three errors in each 24-bit word, and
detecting a fourth.
PRELIMINARY
Binary Number
A binary number is a number expressed in the base-2 numeral
system, or binary number system, which uses only two
symbols: typically "0" and "1".
Binary Addition
Binary addition is the sum of two or more binary numbers. The rules
for binary addition of digits (addition modulo 2) are:
0+0=0
0+1=1
1+0=1
1+1=0
Probability
Probability is the likelihood that an event will occur and is calcu-
lated by dividing the number of favourable outcomes by the total
number of possible outcomes.
Linear Combination
Let V be a vector space over a field F and let S be a non-empty subset of V.
A vector x in V is said to be a linear combination of elements of S if there
exist a finite number of elements y1, y2, ..., yn in S and scalars α1, α2, ..., αn
in F such that x = α1y1 + α2y2 + ... + αnyn.
Span
Let S be a non-empty subset of a vector space V. The set of all linear
combinations of elements of S is called the span of S. It is denoted by [S] or Span(S).
Subspace
A subset W of a vector space V over a field F is called a subspace of
V if W is a vector space over F under the operation of addition and
scalar multiplication defined on V.
Subset
A set A is a subset of another set B if all elements of the set A are
elements of the set B.
Dimension
Let β be a basis of a vector space V. If the number of vectors in β is
n, then V is called an n-dimensional vector space, and we
write dim(V) = n.
Rank
The number ’r’ with the following two properties is called the Rank
of the matrix.
Cosets
A coset is a subset of a mathematical group consisting of all the products
obtained by multiplying a fixed element of the group by each of the elements
of a given subgroup, either on the right or on the left. Cosets are a basic tool
in the study of groups.
CHAPTER 1
INTRODUCTION TO CODING
THEORY
The most important part of the diagram is the noise, because without it
there would be no need for coding theory.
011011001, then the words received are, in order, 011, 011, 001.
Let p be the probability that the digit received is the digit sent, so that
1-p is the probability that the digit received is not the digit sent.
The following diagram shows how the BSC operates.
Remarks
1.3 Information Rate
Adding digits to a codeword may improve error correction.
The information rate of a code C of length n is the number
$\frac{1}{n}\log_2 |C|$, which is designed to measure the proportion of
each codeword that carries the message. The information
rate ranges between 0 and 1.
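The definition above can be tried out numerically. The following is a minimal sketch, assuming a code is given as a set of equal-length binary strings; the function name `information_rate` and the two sample codes are illustrative and not taken from the text.

```python
import math


def information_rate(code):
    """Return (1/n) * log2(|C|) for a code given as equal-length binary strings."""
    words = set(code)
    n = len(next(iter(words)))
    if any(len(w) != n for w in words):
        raise ValueError("all codewords must have the same length")
    return math.log2(len(words)) / n


# Hypothetical sample codes: a repetition code and the full space K^2.
rate_repetition = information_rate({"000", "111"})
rate_full_space = information_rate({"00", "01", "10", "11"})
```

The repetition code carries one message digit in three, while a code that uses every word of its length reaches the maximum rate of 1.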
$\frac{11}{10^{8}} \cdot \frac{10^{7}}{11} = 0.1$ words per second
are transmitted incorrectly without being detected. That is one
wrong word every 10 seconds, 6 a minute, 360 an hour, or 8640 a day!
$p = 1 - 10^{-8} \rightarrow \frac{66}{10^{16}}$
Now approximately
$\frac{66}{10^{16}} \cdot \frac{10^{7}}{12} = 5.5 \times 10^{-9}$
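The two products above can be checked with exact rational arithmetic. The sketch below only reproduces the arithmetic of the displayed figures; the variable names are illustrative, and no assumption is made about how the individual factors were derived.

```python
from fractions import Fraction

# 11/10^8 times 10^7/11, the figure quoted as 0.1 words per second.
undetected_per_second = Fraction(11, 10**8) * Fraction(10**7, 11)

# The same rate expressed per day (60 * 60 * 24 seconds).
undetected_per_day = undetected_per_second * 60 * 60 * 24

# 66/10^16 times 10^7/12, the figure quoted as 5.5 x 10^-9.
improved_per_second = Fraction(66, 10**16) * Fraction(10**7, 12)
```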
Example 1.5.1. wt(110101)= 4
Eg: d(01011,00111)=2
Note
The distance between v and w is the same as the weight of the error pattern
u = v + w. That is,
d(v, w) = wt(v+w).
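Weight, distance and the identity in the note can be checked directly. This is a minimal sketch, assuming words are written as strings of '0' and '1'; the helper names `wt`, `add` and `d` are chosen here to mirror the notation of the text.

```python
def wt(v):
    """Hamming weight: the number of 1s in the word v."""
    return v.count("1")


def add(v, w):
    """Digit-wise binary addition (modulo 2) of two words of equal length."""
    return "".join("1" if a != b else "0" for a, b in zip(v, w))


def d(v, w):
    """Hamming distance: the number of positions in which v and w differ."""
    return sum(a != b for a, b in zip(v, w))


weight_example = wt("110101")
distance_example = d("01011", "00111")
```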
2. Decoding: A word w in K^n is received. We now apply MLD to
decide which word v in C was sent.
where L(v) is the set of all words which are closer to v than to any other
codeword. The higher this probability is, the more likely it is that the word
is decoded correctly.
occurred.
Example 1.8.1. Let C = {001, 101, 110}. For the error pattern u = 010,
we calculate v+010 for all v in C:
001+010=011, 101+010=111, 110+010=100
None of the three words 011, 111 or 100 is in C, so C detects the error
pattern 010. On the other hand, for the error pattern u = 100,
001+100=101, 101+100=001, 110+100=010
Since at least one of these sums is in C, C does not detect the
error pattern 100.
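The check carried out in Example 1.8.1 can be written as a short test. This is a sketch, assuming words are strings of '0' and '1'; the function names `add` and `detects` are illustrative.

```python
def add(v, w):
    """Digit-wise binary addition (modulo 2) of two words of equal length."""
    return "".join("1" if a != b else "0" for a, b in zip(v, w))


def detects(code, u):
    """Return True if v + u is not a codeword for every codeword v."""
    return all(add(v, u) not in code for v in code)


C = {"001", "101", "110"}
detects_010 = detects(C, "010")
detects_100 = detects(C, "100")
```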
d(000,v+u)=d(000,010)=1 and
d(111,v+u)=d(111,010)=2
d(000,v+u)=d(000,101)=2
d(111,v+u)=d(111,101)=1
d(000,v+u)=d(000,110)=2 and
d(111,v+u)=d(111,110)=1
Since v+u is not closer to v = 000 than to 111, C does not correct
the error pattern 110.
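The distance comparisons above can be automated. The sketch below assumes the usual criterion, which is not restated in the surviving text: C corrects u if, for every codeword v, the word v+u is strictly closer to v than to any other codeword. The code C = {000, 111} is read off from the distances computed above, and the function names are illustrative.

```python
def add(v, w):
    """Digit-wise binary addition (modulo 2) of two words of equal length."""
    return "".join("1" if a != b else "0" for a, b in zip(v, w))


def d(v, w):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(v, w))


def corrects(code, u):
    """Assumed criterion: v + u is strictly closer to v than to any other codeword."""
    for v in code:
        received = add(v, u)
        for other in code:
            if other != v and d(received, other) <= d(received, v):
                return False
    return True


C = {"000", "111"}
corrects_010 = corrects(C, "010")
corrects_110 = corrects(C, "110")
```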
CHAPTER 2
LINEAR CODE
Example 2.1.1. C = {000, 111} is a linear code, since all four of the sums
000+000=000
000+111=111
111+000=111
111+111=000
are in C. But C1 = {000, 001, 101} is not a linear code, since 001 and
101 are in C1 but 001+101 = 100 is not in C1.
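The closure test used in Example 2.1.1 can be sketched as follows, assuming words are strings of '0' and '1'; the function name `is_linear` is illustrative.

```python
from itertools import product


def add(v, w):
    """Digit-wise binary addition (modulo 2) of two words of equal length."""
    return "".join("1" if a != b else "0" for a, b in zip(v, w))


def is_linear(code):
    """A binary code is linear if the sum of any two codewords is again a codeword."""
    return all(add(v, w) in code for v, w in product(code, repeat=2))


linear_C = is_linear({"000", "111"})
linear_C1 = is_linear({"000", "001", "101"})
```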
If S is empty, we define <S>= {0}.
In linear algebra it is shown that for any subset S of a vector space V,
the linear span <S> is a subspace of V, called the subspace spanned
or generated by S.
Example 2.2.1. Let S = {0100, 0011, 1100}. Then the code C = <S>
generated by S consists of the 8 words
0000, 0100, 0011, 1100, 0111, 1000, 1111, 1011.
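The linear span of a finite set of binary words can be generated mechanically. This is a minimal sketch, assuming words are strings of '0' and '1'; the function name `span` and its explicit length argument `n` (needed so that the empty set yields the zero word) are illustrative choices.

```python
def add(v, w):
    """Digit-wise binary addition (modulo 2) of two words of equal length."""
    return "".join("1" if a != b else "0" for a, b in zip(v, w))


def span(words, n):
    """Return all sums of subsets of `words`; the empty sum is the zero word of length n."""
    result = {"0" * n}
    for s in words:
        # Every word found so far, with s added to it, is also in the span.
        result |= {add(s, v) for v in result}
    return result


code_from_S = span(["0100", "0011", "1100"], 4)
code_from_empty = span([], 4)
```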
a1 v1 +a2 v2 +......+ak vk =0
Note
Any linearly independent set B is automatically a basis for <B>.
Also, since any set S of words that contains a nonzero word always
contains a largest linearly independent subset B, we
can extract from S a basis B for <S>. If S = {0}, then we say that the
basis for <S> is the empty set ∅.
Theorem 2.3.3. A linear code of dimension k has precisely
$\frac{1}{k!}\prod_{i=0}^{k-1}(2^{k} - 2^{i})$ different bases.
Example 2.3.1. The linear code $K^4$ has dimension 4 and hence
$\frac{1}{4!}\prod_{i=0}^{3}(2^{4} - 2^{i}) = \frac{1}{4!}(2^{4}-1)(2^{4}-2)(2^{4}-2^{2})(2^{4}-2^{3}) = 840$ different bases.
Any linear code contained in $K^n$, for $n \geq 4$, which has dimension 4 also
has 840 different bases.
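The count in Theorem 2.3.3 is easy to evaluate for small k. The sketch below assumes Python 3.8 or later (for `math.prod`); the function name `number_of_bases` is illustrative.

```python
from math import factorial, prod


def number_of_bases(k):
    """Evaluate (1/k!) * product over i = 0..k-1 of (2^k - 2^i)."""
    return prod(2**k - 2**i for i in range(k)) // factorial(k)


bases_dim_4 = number_of_bases(4)
bases_dim_2 = number_of_bases(2)
```

For k = 2 the formula gives 3, which matches the three bases {01, 10}, {01, 11} and {10, 11} of K^2.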
2.4 Matrices
An m×n matrix is a rectangular array of scalars with m rows and n columns.
If A is an m×n matrix and B is an n×p matrix, then the product
AB is the m×p matrix which has for its (i,j)th entry the dot product
of the ith row of A with the jth column of B.
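\[
\begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}
\]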
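The matrix product over K can be sketched in a few lines. The matrices A and B below are the 2×4 and 4×3 factors as read from the example above; the function name `mat_mul_gf2` and the list-of-rows representation are illustrative assumptions.

```python
def mat_mul_gf2(A, B):
    """Multiply binary matrices (lists of rows), reducing every entry modulo 2."""
    if len(A[0]) != len(B):
        raise ValueError("number of columns of A must equal number of rows of B")
    # zip(*B) iterates over the columns of B.
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]


A = [[1, 0, 1, 1],
     [0, 1, 0, 1]]
B = [[1, 0, 1],
     [0, 1, 1],
     [1, 0, 1],
     [1, 0, 0]]
product_AB = mat_mul_gf2(A, B)
```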
Two matrices are row equivalent if one can be obtained from the
other by a sequence of elementary row operations.
A 1 in a matrix M (over K) is called a leading 1 if there are no 1s to its
left in the same row, and a column of M is called a leading column
if it contains a leading 1. M is in Row Echelon Form (REF) if the
zero rows of M (if any) are all at the bottom, and each leading 1 is
to the right of the leading 1s in the rows above.
If, further, each leading column contains exactly one 1, M is in
Reduced Row Echelon Form (RREF).
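Example 2.4.1. Find a REF for the matrix M below using elementary
row operations.
\[
M = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}
\Rightarrow
\begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\quad \text{(add row 1 to row 2, row 3 and row 4)}
\]
\[
\Rightarrow
\begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\quad \text{(add row 2 to row 3)}
\]
\[
\Rightarrow
\begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\quad \text{(add row 3 to row 4)}
\]
The last matrix is in REF.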
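The reduction to REF can be sketched as forward elimination over K. This is one possible ordering of the elementary row operations, not necessarily the one used in the example; the function name `ref_gf2` and the list-of-rows representation are illustrative.

```python
def ref_gf2(rows):
    """Return a row echelon form of a binary matrix, using only row swaps and row additions."""
    M = [list(r) for r in rows]
    pivot_row = 0
    for col in range(len(M[0])):
        # Look for a row at or below pivot_row with a 1 in this column.
        pivot = next((r for r in range(pivot_row, len(M)) if M[r][col] == 1), None)
        if pivot is None:
            continue
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        # Clear the 1s below the leading 1 by adding the pivot row (modulo 2).
        for r in range(pivot_row + 1, len(M)):
            if M[r][col] == 1:
                M[r] = [(a + b) % 2 for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == len(M):
            break
    return M


M = [[1, 0, 1, 1],
     [1, 1, 0, 1],
     [1, 1, 1, 1],
     [1, 0, 0, 0]]
ref_of_M = ref_gf2(M)
```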
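Example 2.4.2. Find the RREF for the matrix M below using elementary
row operations.
\[
M = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}
\quad \text{(add row 1 to row 2 and to row 3)}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\quad \text{(interchange rows 2 and 3)}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\quad \text{(add row 3 to row 1)}
\]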
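For the RREF, the same elimination is carried out above as well as below each leading 1. The following is a sketch with an illustrative function name `rref_gf2`; unlike a REF, the RREF of a matrix is unique, so the result can be compared with the example directly.

```python
def rref_gf2(rows):
    """Return the reduced row echelon form of a binary matrix (arithmetic modulo 2)."""
    M = [list(r) for r in rows]
    pivot_row = 0
    for col in range(len(M[0])):
        pivot = next((r for r in range(pivot_row, len(M)) if M[r][col] == 1), None)
        if pivot is None:
            continue
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        # Clear every other 1 in the leading column, above and below the pivot.
        for r in range(len(M)):
            if r != pivot_row and M[r][col] == 1:
                M[r] = [a ^ b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == len(M):
            break
    return M


M = [[1, 0, 1, 1],
     [1, 0, 1, 0],
     [1, 1, 0, 1]]
rref_of_M = rref_gf2(M)
```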
Algorithm 2.5.1. Form the matrix A whose rows are the words in S.
Use elementary row operations to find a REF of A. Then the nonzero
rows of the REF form a basis for C =<S>.
The algorithm works because the rows of A generate C and elemen-
tary row operations simply interchange words or replace one word
(row) with another in C giving a new set of codewords which still gen-
erates C. Clearly the nonzero rows of a matrix in REF are linearly in-
dependent.
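Example 2.5.1. We find a basis for the linear code C = <S> for
S = {11101, 10110, 01011, 11010}.
\[
A = \begin{bmatrix} 1 & 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 & 0 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \end{bmatrix}
\quad \text{(add row 1 to row 2 and to row 4)}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 & 1 \end{bmatrix}
\quad \text{(interchange rows 3 and 4)}
\]
\[
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\quad \text{(add row 2 to row 4)}
\]
This matrix is in REF, so {11101, 01011, 00111} is a basis for C = <S>.
Adding row 3 to row 2 gives another REF of A:
\[
\begin{bmatrix} 1 & 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
So {11101, 01100, 00111} is also a basis for C = <S>. Note that
Algorithm 2.5.1 does not produce a unique basis for <S>, nor are the
words in the basis necessarily in the given set S.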
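Algorithm 2.5.1 and the remark about non-uniqueness can be checked together. The sketch below reduces the words of S to a REF, keeps the nonzero rows, and then compares the codes spanned by S and by both bases of the example. The function names `basis_from_words` and `span` are illustrative, and the order of row operations is one possible choice.

```python
def basis_from_words(words):
    """Algorithm 2.5.1 as a sketch: row-reduce the words to a REF, keep the nonzero rows."""
    rows = [[int(c) for c in w] for w in words]
    r = 0  # number of leading 1s found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        r += 1
    return ["".join(map(str, row)) for row in rows[:r]]


def span(words):
    """All sums (modulo 2) of subsets of a non-empty list of binary words."""
    result = {"0" * len(words[0])}
    for s in words:
        result |= {"".join("1" if a != b else "0" for a, b in zip(s, v))
                   for v in result}
    return result


S = ["11101", "10110", "01011", "11010"]
basis = basis_from_words(S)
```

Both bases generate the same 8 words as S itself, which is what it means for each of them to be a basis for C.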
Algorithm 2.5.2. Form the matrix A whose rows are the words in S.
Use elementary row operations to place A in RREF. Let G be the k×n
matrix consisting of all the nonzero rows of the RREF. Let X be the
k×(n-k) matrix obtained from G by deleting the leading columns of G.
Form an n×(n-k) matrix H as follows:
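Example 2.6.1. We find a generator matrix for the code
C = {0000, 1110, 0111, 1001}. Using Algorithm 2.5.1,
\[
A = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\rightarrow
\begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
so
\[
G = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix}
\]
is a generator matrix for C. By Algorithm 2.5.2, since the RREF of A is
\[
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},
\]
\[
G_1 = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix}
\]
is also a generator matrix for C.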
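That both matrices generate the same code can be confirmed by listing every product uG. The sketch below assumes the standard encoding of a message u of length k as the codeword uG, which is not spelled out in the surviving text; the function names `encode` and `code_from_generator` are illustrative.

```python
from itertools import product


def encode(u, G):
    """Encode the message u (string of k bits) as the word uG, computed modulo 2."""
    n = len(G[0])
    return "".join(
        str(sum(int(ui) * int(row[j]) for ui, row in zip(u, G)) % 2)
        for j in range(n)
    )


def code_from_generator(G):
    """Return the set of all codewords uG as u ranges over the messages of length k."""
    k = len(G)
    return {encode("".join(bits), G) for bits in product("01", repeat=k)}


C = {"0000", "1110", "0111", "1001"}
code_from_G = code_from_generator(["1110", "0111"])
code_from_G1 = code_from_generator(["1001", "0111"])
```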
4. GH=0
Theorem 2.7.4. H is a parity-check matrix of C if and only if $H^T$ is
a generator matrix for C⊥.
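Example 2.7.1. We find a parity check matrix for the code
C = {0000, 1110, 0111, 1001} of Example 2.6.1. There we found that
\[
G_1 = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} I & X \end{bmatrix}
\]
is a generator matrix for C which is in RREF. By Algorithm 2.5.2, we
construct
\[
H = \begin{bmatrix} X \\ I \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix},
\]
which is a parity check matrix for C. Note that vH = 00 for all words v in C.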
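The construction in Example 2.7.1 can be sketched for the special case used there, in which the generator matrix has the form [I X] with the identity in its first k columns. The general placement of rows for other leading columns is not covered by this sketch, and the function names are illustrative.

```python
def parity_check_from_standard(G):
    """Given G = [I X] (k x n, list of rows), return H with the rows of X above the identity."""
    k, n = len(G), len(G[0])
    # This sketch only handles the case where the first k columns form the identity.
    if any(G[i][j] != (1 if i == j else 0) for i in range(k) for j in range(k)):
        raise ValueError("G must have the form [I X]")
    X = [row[k:] for row in G]
    identity = [[1 if i == j else 0 for j in range(n - k)] for i in range(n - k)]
    return X + identity


def word_times_matrix(v, H):
    """Compute vH modulo 2 for a word v given as a list of bits."""
    return [sum(vi * h for vi, h in zip(v, col)) % 2 for col in zip(*H)]


G1 = [[1, 0, 0, 1],
      [0, 1, 1, 1]]
H = parity_check_from_standard(G1)
C = [[0, 0, 0, 0], [1, 1, 1, 0], [0, 1, 1, 1], [1, 0, 0, 1]]
syndromes = [word_times_matrix(v, H) for v in C]
```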
2.9 Cosets
If C is a linear code of length n, and if u is any word of length n,
we define the coset of C determined by u to be the set of all words
of the form v+u as v ranges over all the words in C. We denote this
coset by C+u. Thus,
C + u ={v+u|v ∈ C}.
and
8. The code C itself is one of its cosets.
C= { 0000,1011,0101,1110}
C + 1000 ={ 1000,0011,1101,0110}.
C + 0010 = {0010,1001,0111,1100}
$2^{n-k} = 2^{4-2} = 2^{2} = 4$
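All cosets of a small code can be listed by running through the words of K^n and skipping any word that already lies in a coset found earlier. This is a sketch with an illustrative function name `cosets`, applied to the code C listed above.

```python
from itertools import product


def add(v, w):
    """Digit-wise binary addition (modulo 2) of two words of equal length."""
    return "".join("1" if a != b else "0" for a, b in zip(v, w))


def cosets(code, n):
    """Return the list of distinct cosets C + u as u ranges over all words of length n."""
    seen = set()
    result = []
    for bits in product("01", repeat=n):
        u = "".join(bits)
        if u in seen:
            continue  # u already belongs to a coset found earlier
        coset = {add(v, u) for v in code}
        seen |= coset
        result.append(coset)
    return result


C = {"0000", "1011", "0101", "1110"}
all_cosets = cosets(C, 4)
```

The number of cosets found agrees with the count 2^(n-k) = 4, and together the cosets cover all 16 words of length 4.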
2.10 MLD for Linear Code
Let C be a linear code. Assume the codeword v in C is transmitted
and the word w is received, resulting in the error pattern u = v + w.
Then w + u = v is in C, so the error pattern u and the received word
w are in the same coset of C by (3) of Theorem 2.10.1.
Since error patterns of small weight are the most likely to occur,
here is how MLD works for a linear code C. Upon receiving the word
w, we choose a word u of least weight in the coset C + w (which must
contain w) and conclude that v = w + u was the word sent.
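The decoding rule just described can be sketched directly: form the coset C + w, pick a word u of least weight in it, and return w + u. The code C below is the one used in Example 2.10.1; the function name `decode` is illustrative, and when a coset contains several words of least weight this sketch simply takes the first one it meets.

```python
def add(v, w):
    """Digit-wise binary addition (modulo 2) of two words of equal length."""
    return "".join("1" if a != b else "0" for a, b in zip(v, w))


def wt(v):
    """Hamming weight: the number of 1s in the word v."""
    return v.count("1")


def decode(w, code):
    """Return w + u, where u is a word of least weight in the coset C + w."""
    coset = [add(v, w) for v in code]
    u = min(coset, key=wt)  # ties are broken by position in the list
    return add(w, u)


C = ["0000", "1011", "0101", "1110"]
decoded_1101 = decode("1101", C)
decoded_0111 = decode("0111", C)
```

For w = 1101 the coset C + w is {1101, 0110, 1000, 0011}, whose unique word of least weight is 1000, so the decoder concludes that 0101 was sent.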
Example 2.10.1. Let C={0000, 1011, 0101, 1110}. The cosets of C
(Example 2.10.2) are
Theorem 2.10.1. Let C be a linear code of length n. Let H be a
parity-check matrix for C. Let w and u be words in K^n.
CONCLUSION
Our aim was to survey coding theory in the breadth of its coverage.
Coding theory is the study of the properties of codes and their
respective fitness for specific applications. Codes are used for data
compression, cryptography, error detection and correction, data
transmission and data storage. Codes are studied by various scientific
disciplines, such as information theory, electrical engineering,
mathematics, linguistics and computer science, for the purpose of
designing efficient and reliable data transmission methods. This
typically involves the removal of redundancy and the correction or
detection of errors in the transmitted data. This project work helped
us to learn more about coding theory.
I have much pleasure in conveying my heartfelt thanks to my teachers
and colleagues.
BIBLIOGRAPHY