
CODING THEORY

Project report submitted to


KANNUR UNIVERSITY

for the award of the degree of


BACHELOR OF SCIENCE

by

ARYA FRANCIS
DB20CMSR11

under the guidance of


Ms. Ajeena Joseph

Department Of Mathematics
Don Bosco Arts And Science College
Angadikadavu, Iritty
March 2023

Examiners:
1.
2.
CERTIFICATE

This is to certify that "Coding Theory" is a bona fide project of


ARYA FRANCIS, DB20CMSR11, and that this project has been
carried out under my supervision.

Mrs. Riya Baby, Head of Department
Ms. Ajeena Joseph, Project Supervisor
DECLARATION

I, ARYA FRANCIS, hereby declare that the project "Coding Theory"
is an original record of studies and a bona fide project carried
out by me during the period 2020-2023 under the guidance of
Ms. Ajeena Joseph, Department Of Mathematics, Don Bosco Arts
And Science College, Angadikadavu, Iritty, and that this project has
not been submitted by me elsewhere for the award of any degree,
diploma, title or recognition.

ARYA FRANCIS
DB20CMSR11
ACKNOWLEDGEMENT

I would like to express my sincere gratitude to the several individuals
and organisations who supported me throughout the successful
completion of this project.

First, I wish to express my sincere gratitude to my supervisor,
Ms. Ajeena Joseph, Department Of Mathematics, Don Bosco Arts
And Science College, Angadikadavu, for her enthusiasm, patience,
insightful comments, helpful information, practical advice and
unceasing ideas, which have helped me tremendously at all times in my
research and in the writing of this project. Without her support and
guidance, this project would have seemed an ordeal. I could not have
imagined having a better supervisor for my study.

I also wish to express my sincere thanks to all the faculty members
of the Department Of Mathematics at Don Bosco Arts And
Science College, Angadikkadavu, for their consistent support and
assistance.

Thank you to everyone at Don Bosco Arts And Science College,
Angadikkadavu, including our Principal, Dr. Francis Karackat, the
management, and the teaching and non-teaching staff. It was great
sharing premises with all of you during the last three years.

I’d also like to thank my friends and parents for their support
and encouragement as I worked on this assignment.

I shall always remain indebted to God, the almighty, who has
granted me countless blessings, knowledge, and opportunities, so
that I have finally been able to accomplish this project.

Once again, thanks for all your encouragement.


CONTENTS

INTRODUCTION 1

PRELIMINARY 3

1 INTRODUCTION TO CODING THEORY 6


1.1 Coding Theory . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Basic Assumption . . . . . . . . . . . . . . . . . . . . . 7
1.3 Information Rate . . . . . . . . . . . . . . . . . . . . . . 9
1.4 The Effects Of Error Correction And Detection . . . . 9
1.5 Weight And Distance . . . . . . . . . . . . . . . . . . . . 10
1.6 Maximum Likelihood Decoding . . . . . . . . . . . . . . 11
1.7 Reliability Of MLD . . . . . . . . . . . . . . . . . . . . . 12
1.8 Error Detection and correction . . . . . . . . . . . . . . 12

2 LINEAR CODE 15
2.1 Linear code . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2 Two Important Subspaces . . . . . . . . . . . . . . . 15
2.3 Independence, Basis, Dimension . . . . . . . . . . . . 16
2.4 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.5 Bases for C=<S> and C⊥ . . . . . . . . . . . . . . . . . . 19
2.6 Generating Matrices and Encoding . . . . . . . . . . . 21
2.7 Parity Check Matrices . . . . . . . . . . . . . . . . . . . 22
2.8 Distance of Linear Code . . . . . . . . . . . . . . . . . . 23
2.9 Cosets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.10 MLD for Linear Code . . . . . . . . . . . . . . . . . 26

CONCLUSION 28

BIBLIOGRAPHY 29
INTRODUCTION

Coding theory is the study of the properties of codes and their
respective fitness for specific applications. Codes are used for data
compression, cryptography, error detection and correction, data
transmission and data storage. Codes are studied by various
scientific disciplines, such as information theory, electrical engineering,
mathematics, linguistics, and computer science, for the purpose
of designing efficient and reliable data transmission methods. This
typically involves the removal of redundancy and the correction or
detection of errors in the transmitted data.

Coding theory, sometimes called algebraic coding theory, deals
with the design of error-correcting codes for the reliable transmission
of information across noisy channels. It makes use of classical
and modern algebraic techniques involving finite fields, group theory,
and polynomial algebra. It has connections with other areas of
discrete mathematics, especially number theory and the theory of
experimental designs.

The history of coding theory begins in 1948, when Claude Shannon
published "A Mathematical Theory of Communication", an article in two
parts in the July and October issues of the Bell System Technical
Journal. This work focuses on the problem of how best to encode
the information a sender wants to transmit. In this fundamental
work he used tools from probability theory, developed by Norbert
Wiener, which were then in the nascent stages of being applied to
communication theory. Shannon developed information entropy as a
measure of the uncertainty in a message, essentially inventing the
field of information theory. The binary Golay code was developed in
1949. It is an error-correcting code capable of correcting up to three
errors in each 24-bit word, and detecting a fourth.

In the first chapter, 'Introduction to Coding Theory', we discuss
some basic concepts of coding theory. It includes Basic Assumption,
where some fundamental definitions and assumptions are stated,
Information Rate, The Effects Of Error Correction And Detection,
Weight and Distance, Maximum Likelihood Decoding, Reliability of
MLD, and Error Detection and Correction. In the second chapter,
'Linear Code', we discuss linear codes, their properties, and some
theorems; the linear code is an important concept in coding theory.
The second chapter includes Independence, Basis and Dimension,
Matrices, Finding Bases for C, Generating Matrices, Parity Check
Matrices, Equivalent Codes, Distance of Linear Codes, Cosets, and
MLD for Linear Codes.
PRELIMINARY

Binary Number
A binary number is a number expressed in the base-2 numeral
system, or binary number system, which uses only two symbols:
typically "0" and "1".

Binary Addition
Binary addition is the digit-wise sum of two or more binary numbers,
computed modulo 2 (without carries). The rules of binary addition are:
0+0=0
0+1=1
1+0=1
1+1=0
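These rules coincide with the exclusive-or (XOR) of bits, so addition of binary words can be sketched in a few lines of Python (an illustrative helper, not part of the report; words are assumed to be strings of '0' and '1'):

```python
def add_words(v, w):
    """Add two binary words digit-wise modulo 2 (no carries)."""
    assert len(v) == len(w), "words must have the same length"
    return "".join(str(int(a) ^ int(b)) for a, b in zip(v, w))

# Each position follows the four rules above.
print(add_words("1101", "1011"))  # 0110
```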

Probability
Probability is the likelihood that an event will occur and is calcu-
lated by dividing the number of favourable outcomes by the total
number of possible outcomes.

Linear Combination
Let V be a vector space over a field F and let S be a non-empty
subset of V. A vector x in V is said to be a linear combination of
elements of S if there exist finitely many elements y1, y2, ..., yn in S
and scalars α1, α2, ..., αn in F such that x = α1y1 + α2y2 + ... + αnyn.
Span
Let S be a non-empty subset of a vector space V, the set of all linear
combination of S is called Span of S. It is denoted by [S] or Span(S).

Subspace
A subset W of a vector space V over a field F is called a subspace of
V if W is a vector space over F under the operation of addition and
scalar multiplication defined on V.

Subset
A set A is a subset of another set B if all elements of the set A are
elements of the set B.

Linearly Independent and Dependent

Let S = {u1, u2, ..., un} be a subset of a vector space V, let
α1, α2, ..., αn be scalars, and consider the linear combination
α1u1 + α2u2 + ... + αnun.

The set S is said to be Linearly Independent if

α1u1 + α2u2 + ... + αnun = 0 ⇒ α1 = α2 = ... = αn = 0 (the only solution).

If there exists a non-trivial solution for α1, α2, ..., αn, that is, at
least one αi is not zero, then the set is called Linearly Dependent.

Dimension
Let β be a basis of a vector space V. If the number of vectors in β is
n, then the vector space V is called an n-dimensional vector space,
written dim(V) = n.

Elementary Row Operation

The operations that are performed on the rows of a matrix.
Rank
The number r with the following two properties is called the Rank
of a matrix:

1. There is at least one non-zero minor of order r.

2. Every minor of order (r+1) vanishes.

Cosets
A coset is a subset of a mathematical group consisting of all the
products obtained by multiplying a fixed element of the group by
each of the elements of a given subgroup, either on the right or on
the left. Cosets are a basic tool in the study of groups.
CHAPTER 1

INTRODUCTION TO CODING
THEORY

1.1 Coding Theory


Coding theory is the study of methods for efficient and accurate
transfer of information from one place to another.

Definition 1.1. Channel


The physical medium through which the information is transmitted is
called a channel.
Definition 1.2. Noise
An undesirable disturbance which may cause the information received
to differ from what was transmitted is called noise.

Coding theory deals with the problem of detecting and correcting
transmission errors caused by noise on the channel.

[Figure: rough idea of a general information transmission system.]

The most important part of the diagram is the noise, because
without it there would be no need for coding theory.

1.2 Basic Assumption


We state some fundamental definitions and assumptions which will
be applied in coding theory.

Definition 1.3. Digits


The information to be sent is transmitted as a sequence of 0's and
1's, which are called digits.

Definition 1.4. Word


A word is a sequence of digits.

Definition 1.5. Length of Word


The length of a word is the number of digits in the word.

Definition 1.6. Binary Code


A binary code is a set of words.
E.g.: C = {00, 01, 10, 11}

Definition 1.7. Block Code


A block code is code having all its words of the same length.

Definition 1.8. Codewords


The words that belong to a given code C are called codewords. We
denote the number of codewords in a code C by |C|.

A word is transmitted by sending its digits one after the other across

a binary channel. Each digit is transmitted mechanically, electrically,
magnetically, or otherwise by one of two types of easily differentiated
pulses.

A codeword of length n is received as a word of length n. There

is no difficulty in identifying the beginning of the first word
transmitted. For example, if we are using codewords of length 3 and
receive 011011001, then the words received are, in order, 011, 011, 001.

We assume that noise is scattered randomly, as opposed to occurring

in clumps, called bursts. That is, the probability of any one digit being
affected in transmission is the same as that of any other digit and is
not influenced by errors made in neighbouring digits.

A binary channel is symmetric if 0 and 1 are transmitted with

equal accuracy. The reliability of a Binary Symmetric Channel (BSC)
is a real number p, 0 ≤ p ≤ 1, where p is the probability that the
digit received is the digit sent, and 1−p is the probability that the
digit received is not the digit sent.

Remarks

• The total number of words of length n is 2^n.

• If p = 1, the channel is perfect: there is no chance of a digit
being altered in transmission. If all channels were perfect, there
would be no need for coding theory. But no channel is perfect.

• Any channel with 0 ≤ p ≤ 1/2 can be converted into a channel
with 1/2 ≤ p ≤ 1. We therefore assume a BSC with 1/2 < p < 1.

• A channel with p = 0 is also uninteresting, because it can be
corrected by converting received 0's into 1's and 1's into 0's;
such a channel does not help in the development of coding theory.
1.3 Information Rate
The addition of digits to codewords may improve error detection and
correction, but it reduces the information rate. The information rate
of a code C of length n is the number (1/n) log_2 |C|; it is designed
to measure the proportion of each codeword that carries the message.
The information rate ranges between 0 and 1.
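As a quick check of the formula, the information rate of a block code can be computed directly (an illustrative sketch, not part of the report; codewords are assumed to be bit-strings of equal length):

```python
import math

def information_rate(code):
    """(1/n) * log2|C| for a block code C whose words all have length n."""
    n = len(next(iter(code)))
    return math.log2(len(code)) / n

# The code of all words of length n has rate 1; a small repetition-style
# code like {000, 111} has rate 1/3.
print(information_rate({"00", "01", "10", "11"}))  # 1.0
```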

1.4 The Effects Of Error Correction And Detection

To demonstrate the dramatic effect that the addition of a parity-check
digit to a code can have in recognizing when errors occur, we
consider the following codes.
Suppose that all 2^11 words of length 11 are codewords; then no
error is detected.
Let the reliability of the channel be p = 1 − 10^−8, and suppose that
digits are transmitted at the rate of 10^7 digits per second.
The probability that a word is transmitted incorrectly is approximately
11p^10(1−p), which is about 11/10^8. Since 10^7/11 words are
transmitted per second, about

(11/10^8) · (10^7/11) = 0.1 words per second

are transmitted incorrectly without being detected. That is one
wrong word every 10 seconds, 6 a minute, 360 an hour, or 8640 a day!

Now suppose that a parity-check digit is added to each codeword,

so that the number of 1's in each of the 2048 codewords is even. Then
any single error is always detected, so at least 2 errors must occur if
a word is to be transmitted incorrectly without our knowledge. The
probability of at least 2 errors occurring is 1 − p^12 − 12p^11(1−p),
which is approximated by (12 choose 2) p^10(1−p)^2 = 66p^10(1−p)^2.
For p = 1 − 10^−8 this is about 66/10^16. Since 10^7/12 words are
transmitted per second, now approximately

(66/10^16) · (10^7/12) = 5.5 × 10^−9

words per second are transmitted incorrectly without being detected.

That is about one error every 2000 days!
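The two rates above are easy to verify numerically (a quick sketch, not part of the report):

```python
p = 1 - 1e-8          # channel reliability
digits_per_sec = 1e7  # transmission rate

# Code 1: all 2^11 words of length 11 are codewords; no error is detected.
p_word_wrong = 11 * p**10 * (1 - p)           # ≈ 11/10^8
rate_11 = p_word_wrong * digits_per_sec / 11  # wrong words per second
print(rate_11)  # ≈ 0.1

# Code 2: a parity-check digit is added (length 12); single errors are caught,
# so an undetected error needs at least 2 digit errors.
p_two_errors = 66 * p**10 * (1 - p)**2        # ≈ 66/10^16
rate_12 = p_two_errors * digits_per_sec / 12
print(rate_12)  # ≈ 5.5e-9
```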

So if we are willing to reduce the information rate by lengthening

the code from 11 to 12 digits, we are very likely to know when errors
occur. To decide where these errors have actually occurred, we may
need to request the retransmission of the message. Physically this
means that either transmission must be held up until confirmation
is received, or messages must be stored temporarily until
retransmission is requested; both alternatives may be very costly in time
or in storage space.

Therefore, at the expense of a further increase in word length, it

may well be worth incorporating error-correction capabilities into
the code. Introducing such capabilities may also make encoding
and decoding more difficult, but it will help to avoid the hidden costs
in time or space mentioned above.

One simple scheme to introduce error correction is to form a

repetition code, in which each codeword is transmitted three times in
succession. Then if at most one error is made per 33-digit codeword,
at least two of the three transmissions will be correct; but the
information rate drops to 1/3. In fact we can do better: it turns out that
we need add only 4 extra digits to each 11-digit codeword, producing
a code with information rate 11/15.

So it is our task to design codes with reasonable information

rates, low encoding and decoding costs, and some error-correcting
or error-detecting capabilities that make the need for retransmission
unlikely.

1.5 Weight And Distance

Let v be a word of length n. The Hamming weight, or simply weight,
of v is the number of times the digit 1 occurs in v. We denote the
weight of v by wt(v).

Example 1.5.1. wt(110101) = 4

Let v and w be words of length n. Then the Hamming distance,

or simply distance, between v and w is the number of positions in
which v and w disagree. We denote the distance between v and w
by d(v,w).

E.g.: d(01011, 00111) = 2
Note
The distance between v and w is the same as the weight of the error
pattern u = v + w. That is,

d(v, w) = wt(v + w).

Example 1.5.2. d(v, w) = d(11010, 01101) = 4

wt(v + w) = wt(11010 + 01101) = wt(10111) = 4

The probability that v is sent and w is received, with error pattern
u = v + w, is

φp(v, w) = p^(n−wt(u)) (1−p)^wt(u)
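The definitions above translate directly into code; a minimal sketch (words represented as strings of '0'/'1', an assumption of this illustration):

```python
def wt(v):
    """Hamming weight: number of 1 digits in the word v."""
    return v.count("1")

def d(v, w):
    """Hamming distance: number of positions where v and w disagree."""
    return sum(a != b for a, b in zip(v, w))

print(wt("110101"))         # 4, as in Example 1.5.1
print(d("11010", "01101"))  # 4, equal to wt of the error pattern 10111
```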

1.6 Maximum Likelihood Decoding


There are two basic problems of coding:

1. Encoding: We have to determine a code to use for sending our
messages.

• First we select a positive integer k, the length of each binary
word corresponding to a message. If M is the set of messages,
k must be chosen so that |M| ≤ |K^k| = 2^k.
• Next we decide how many digits we need to add to each
word of length k to ensure that as many errors can be
corrected or detected as we require.
• To transmit a particular message, the transmitter finds
the word of length k assigned to that message, then
transmits the codeword of length n corresponding to that
word of length k.

2. Decoding: A word w in K^n is received. We now proceed with
MLD to decide which word v in C was sent.

(a) Complete Maximum Likelihood Decoding (CMLD): If there is one
and only one word v in C closer to w than any other word in
C, we decode w as v. If there are several words in C closest
to w, then we select arbitrarily one of them and conclude
that it was the codeword sent.
(b) Incomplete MLD (IMLD): If there is a unique word v in C closest to
w, then we decode w as v. But if there are several words
in C at the same distance from w, then we request a
retransmission. In some cases, if the received word w is too
far away from any word in the code, we also ask for a
retransmission.

1.7 Reliability Of MLD


θp(C, v) is the probability that if v is sent over a BSC of reliability p,
then IMLD correctly concludes that v was sent. It is the sum of all
the probabilities φp(v, w) as w ranges over L(v). That is,

θp(C, v) = Σ_{w ∈ L(v)} φp(v, w)

where L(v) is the set of all words closer to v than to any other word
in C. The higher this probability is, the more reliably words can be
decoded.

1.8 Error Detection and Correction

Error Detecting Codes
If v in C is sent and w in K^n is received, then u = v + w is the error
pattern. Any word u in K^n can occur as an error pattern, and we wish
to know which error patterns C will detect.
We say that the code C detects the error pattern u if and only if v+u
is not a codeword, for every v in C. In other words, u is detected if,
for any transmitted codeword v, the decoder, upon receiving v+u, can
recognize that it is not a codeword and hence that some error has
occurred.

Example 1.8.1. Let C={001, 101, 110} and consider the error pattern
u=010. We calculate v+010 for all v in C:
001+010=011, 101+010=111, 110+010=100
None of the three words 011, 111 or 100 is in C, so C detects the error
pattern 010. On the other hand, for the error pattern u=100:
001+100=101, 101+100=001, 110+100=010
Since at least one of these sums is in C, C does not detect the error
pattern 100.
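The detection test of Example 1.8.1 can be automated in a few lines (an illustrative sketch; codewords as bit-strings is an assumption of this example):

```python
def add(v, w):
    """Digit-wise mod-2 sum of two binary words of equal length."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(v, w))

def detects(C, u):
    """C detects error pattern u iff v+u is not a codeword for every v in C."""
    return all(add(v, u) not in C for v in C)

C = {"001", "101", "110"}
print(detects(C, "010"))  # True, as in Example 1.8.1
print(detects(C, "100"))  # False: 101+100 = 001 is in C
```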

Error Correcting Codes

Suppose a word v in a code C is transmitted over a BSC and w is
received, resulting in the error pattern u=v+w. The code C corrects the
error pattern u if, for all v in C, v+u is closer to v than to any other word
in C. Also, a code is said to be a t-error-correcting code if it corrects
all error patterns of weight at most t and does not correct at least
one error pattern of weight t+1.
Example 1.8.2. Let C={000, 111}.
• Take the error pattern u=010. For v=000,

d(000,v+u)=d(000,010)=1 and
d(111,v+u)=d(111,010)=2.

And for v=111,

d(000,v+u)=d(000,101)=2 and
d(111,v+u)=d(111,101)=1.

Thus C corrects the error pattern 010.

• Now take the error pattern u=110. For v=000,

d(000,v+u)=d(000,110)=2 and
d(111,v+u)=d(111,110)=1.

Since v+u is not closer to v=000 than to 111, C does not correct
the error pattern 110.
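The same check can be written directly, reusing the distance function from Section 1.5 (an illustrative sketch, with codewords as bit-strings):

```python
def add(v, w):
    return "".join(str(int(a) ^ int(b)) for a, b in zip(v, w))

def d(v, w):
    return sum(a != b for a, b in zip(v, w))

def corrects(C, u):
    """C corrects u iff for every v in C, v+u is strictly closest to v."""
    return all(
        all(d(add(v, u), v) < d(add(v, u), x) for x in C if x != v)
        for v in C
    )

C = {"000", "111"}
print(corrects(C, "010"))  # True, as in Example 1.8.2
print(corrects(C, "110"))  # False
```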

CHAPTER 2

LINEAR CODE

2.1 Linear code


A code C is called a linear code if v+w is a word in C whenever v and
w are in C. That is, a linear code is a code which is closed under
addition of words.

Example 2.1.1. C = {000, 111} is a linear code, since all four of the sums
000+000=000
000+111=111
111+000=111
111+111=000
are in C. But C1 = {000, 001, 101} is not a linear code, since 001 and
101 are in C1 but 001+101 = 100 is not in C1.

2.2 Two Important Subspaces


The vector w is said to be a linear combination of vectors v1, v2, ..., vk
if there are scalars a1, a2, ..., ak such that
w = a1v1 + a2v2 + ... + akvk.
The set of all linear combinations of the vectors in a given set
S={v1, v2, ..., vk} is called the linear span of S, and is denoted by <S>.
If S is empty, we define <S> = {0}.
In linear algebra it is shown that for any subset S of a vector space V,
the linear span <S> is a subspace of V, called the subspace spanned
or generated by S.

Theorem 2.2.1. For any subset S of K^n, the code C=<S> generated

by S consists precisely of the following words: the zero word, all words
in S, and all sums of two or more words in S.

Example 2.2.1. Let S = {0100, 0011, 1100}. Then the code C =<S>
generated by S consists of

0000, 0100, 0011, 1100, 0100+0011=0111, 0100+1100=1000,

0011+1100=1111, 0100+0011+1100=1011;

that is, C=<S>={0000, 0100, 0011, 1100, 0111, 1000, 1111, 1011}.
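Theorem 2.2.1 suggests a direct way to enumerate <S> over K: starting from the zero word, add each generator to every word found so far. A small sketch (bit-string words, an assumption of the illustration):

```python
def add(v, w):
    return "".join(str(int(a) ^ int(b)) for a, b in zip(v, w))

def span(S):
    """Linear span <S>: close the set {0} under addition of each generator."""
    n = len(next(iter(S)))
    C = {"0" * n}
    for g in S:
        # C is a subspace at each step, so C ∪ (C + g) is again a subspace.
        C |= {add(v, g) for v in C}
    return C

C = span({"0100", "0011", "1100"})
print(sorted(C))  # the 8 codewords of Example 2.2.1
```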

2.3 Independence, Basis, Dimension


The main objective is to find an efficient way to describe a linear
code without having to list all the codewords.
A set S={v1, v2, ..., vk} of vectors is linearly dependent if there are
scalars a1, a2, ..., ak, not all zero, such that

a1v1 + a2v2 + ... + akvk = 0.

Otherwise the set S is linearly independent.

The test for linear independence, then, is to form the vector equation
above with arbitrary scalars. If all the scalars a1, a2, ..., ak must be 0,
then the set S is linearly independent; if at least one ai can be chosen
to be non-zero, then S is linearly dependent.
Any set of vectors containing the zero vector is linearly dependent.
A nonempty subset B of vectors from a vector space V is a basis for
V if both:

1. B spans V (that is, <B>=V), and

2. B is a linearly independent set.

Note
Any linearly independent set B is automatically a basis for <B>.
Also, since any set S of vectors that contains a nonzero word always
contains a largest independent subset B, we can extract from S a
basis B for <S>. If S={0}, then we say that the basis of S is the
empty set ∅.

Theorem 2.3.1. A linear code of dimension k contains precisely 2^k

codewords.

Theorem 2.3.2. Let C=<S> be the linear code generated by a subset

S of K^n. Then (dimension of C) + (dimension of C⊥) = n.

Theorem 2.3.3. A linear code of dimension k has precisely

(1/k!) ∏_{i=0}^{k−1} (2^k − 2^i)

different bases.

Example 2.3.1. The linear code K^4 has dimension 4, and hence

(1/4!) ∏_{i=0}^{3} (2^4 − 2^i) = (1/4!)(2^4−1)(2^4−2)(2^4−2^2)(2^4−2^3) = 840

different bases. Any linear code contained in K^n, for n≥4, which has
dimension 4 also has 840 different bases.
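The count in Theorem 2.3.3 is easy to verify numerically (a quick sketch, not part of the report):

```python
from math import factorial, prod

def num_bases(k):
    """Number of bases of a binary linear code of dimension k."""
    # Ordered bases: (2^k - 1)(2^k - 2)...(2^k - 2^(k-1)); divide by k!
    # to forget the ordering. The division is always exact.
    return prod(2**k - 2**i for i in range(k)) // factorial(k)

print(num_bases(4))  # 840, as in Example 2.3.1
```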

2.4 Matrices
An m×n matrix is a rectangular array of scalars with m rows and n
columns. If A is an m×n matrix and B is an n×p matrix, then the
product AB is the m×p matrix whose (i,j)th entry is the dot product
of the ith row of A with the jth column of B, computed mod 2 over K.
For example (the original product was garbled in extraction; this is a
small correct instance),

[1 0 1]   [1 0]   [1 1]
[0 1 1] × [1 1] = [1 0]
          [0 1]

since, for instance, the (1,1) entry is 1·1 + 0·1 + 1·0 = 1.

There are two types of elementary row operations which may be

performed on a matrix over K. They are:

1. interchanging two rows;

2. replacing a row by itself plus another row.

Two matrices are row equivalent if one can be obtained from the
other by a sequence of elementary row operations.
A 1 in a matrix M (over K) is called a leading 1 if there are no 1s to its
left in the same row, and a column of M is called a leading column
if it contains a leading 1. M is in Row Echelon Form (REF) if the
zero rows of M (if any) are all at the bottom, and each leading 1 is
to the right of the leading 1s in the rows above.
If, further, each leading column contains exactly one 1, M is in
Reduced Row Echelon Form (RREF).

Example 2.4.1. Find the REF of the matrix M below using elementary
row operations.

    [1 0 1 1]
M = [1 1 0 1]
    [1 1 1 1]
    [1 0 0 0]

    [1 0 1 1]
⇒   [0 1 1 0]   (add row 1 to row 2, row 3 and row 4)
    [0 1 0 0]
    [0 0 1 1]

    [1 0 1 1]
⇒   [0 1 1 0]   (add row 2 to row 3)
    [0 0 1 0]
    [0 0 1 1]

    [1 0 1 1]
⇒   [0 1 1 0]   (add row 3 to row 4)
    [0 0 1 0]
    [0 0 0 1]

So the REF of the matrix M is

[1 0 1 1]
[0 1 1 0]
[0 0 1 0]
[0 0 0 1]

Example 2.4.2. Find the RREF of the matrix M below using elementary
row operations.

    [1 0 1 1]
M = [1 0 1 0]
    [1 1 0 1]

    [1 0 1 1]
→   [0 0 0 1]   (add row 1 to row 2 and to row 3)
    [0 1 1 0]

    [1 0 1 1]
→   [0 1 1 0]   (interchange rows 2 and 3)
    [0 0 0 1]

    [1 0 1 0]
→   [0 1 1 0]   (add row 3 to row 1)
    [0 0 0 1]

So the RREF of the matrix M is

[1 0 1 0]
[0 1 1 0]
[0 0 0 1]
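Row reduction over K can be sketched in a few lines (rows as lists of 0/1; an illustrative helper, not from the report):

```python
def rref(rows):
    """Reduced Row Echelon Form of a binary matrix, arithmetic mod 2."""
    rows = [r[:] for r in rows]
    m, n = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n):
        # Find a row at or below pivot_row with a 1 in this column.
        for r in range(pivot_row, m):
            if rows[r][col] == 1:
                rows[pivot_row], rows[r] = rows[r], rows[pivot_row]
                # Clear every other 1 in this column (yields RREF directly).
                for rr in range(m):
                    if rr != pivot_row and rows[rr][col] == 1:
                        rows[rr] = [(a + b) % 2
                                    for a, b in zip(rows[rr], rows[pivot_row])]
                pivot_row += 1
                break
    return rows

M = [[1, 0, 1, 1], [1, 0, 1, 0], [1, 1, 0, 1]]
print(rref(M))  # [[1,0,1,0],[0,1,1,0],[0,0,0,1]], matching Example 2.4.2
```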

2.5 Bases for C=<S> and C⊥


We develop algorithms for finding bases for a linear code and its
dual.
Let S be a nonempty subset of K^n. The first algorithm provides a
basis for C=<S>, the linear code generated by S; the second provides
a basis for the dual code C⊥.

Algorithm 2.5.1. Form the matrix A whose rows are the words in S.
Use elementary row operations to find a REF of A. Then the nonzero
rows of the REF form a basis for C =<S>.

The algorithm works because the rows of A generate C, and
elementary row operations simply interchange words or replace one word
(row) with another word in C, giving a new set of codewords which still
generates C. Clearly the nonzero rows of a matrix in REF are linearly
independent.

Example 2.5.1. We find a basis for the linear code C=<S>, where
S = {11101, 10110, 01011, 11010}.

    [1 1 1 0 1]
A = [1 0 1 1 0]
    [0 1 0 1 1]
    [1 1 0 1 0]

    [1 1 1 0 1]
→   [0 1 0 1 1]   (add row 1 to row 2 and to row 4)
    [0 1 0 1 1]
    [0 0 1 1 1]

    [1 1 1 0 1]
→   [0 1 0 1 1]   (interchange row 3 and row 4)
    [0 0 1 1 1]
    [0 1 0 1 1]

    [1 1 1 0 1]
→   [0 1 0 1 1]   (add row 2 to row 4)
    [0 0 1 1 1]
    [0 0 0 0 0]

The last matrix is a REF of A. By Algorithm 2.5.1, {11101, 01011,

00111} is a basis for C=<S>. Another REF of A is

[1 1 1 0 1]
[0 1 1 0 0]
[0 0 1 1 1]
[0 0 0 0 0]

so {11101, 01100, 00111} is also a basis for C=<S>. Note that
Algorithm 2.5.1 does not produce a unique basis for <S>, nor are the
words in the basis necessarily in the given set S.

Algorithm 2.5.2. Form the matrix A whose rows are the words in S.
Use elementary row operations to place A in RREF. Let G be the k×n
matrix consisting of all the nonzero rows of the RREF. Let X be the
k×(n−k) matrix obtained from G by deleting the leading columns of G.
Form an n×(n−k) matrix H as follows:

1. In the rows of H corresponding to the leading columns of G,
place, in order, the rows of X.

2. In the remaining n−k rows of H, place, in order, the rows of the
(n−k)×(n−k) identity matrix I.

Then the columns of H form a basis for C⊥.
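Algorithm 2.5.2 can be sketched directly, taking G already in RREF as lists of 0/1 rows (an illustrative helper; the example matrix is the RREF generator matrix that appears in Example 2.7.1 below):

```python
def parity_check_from_rref(G):
    """Build H (as a list of n rows of length n-k) per Algorithm 2.5.2."""
    k, n = len(G), len(G[0])
    # Leading columns: column of the first 1 in each row of G.
    leading = [row.index(1) for row in G]
    non_leading = [c for c in range(n) if c not in leading]
    # X: G with its leading columns deleted (k rows of length n-k).
    X = [[row[c] for c in non_leading] for row in G]
    H = [None] * n
    for i, col in enumerate(leading):      # rows of X go to leading positions
        H[col] = X[i]
    for i, col in enumerate(non_leading):  # identity rows fill the rest
        H[col] = [1 if j == i else 0 for j in range(n - k)]
    return H

G = [[1, 0, 0, 1], [0, 1, 1, 1]]
print(parity_check_from_rref(G))  # rows [0,1], [1,1], [1,0], [0,1]
```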

2.6 Generating Matrices and Encoding


The rank of a matrix over K is the number of nonzero rows in any
REF of the matrix. The dimension k of a linear code C is its dimension
as a subspace of K^n. If C has length n, dimension k, and distance d,
then we refer to C as an (n, k, d) linear code.
If C is a linear code of length n and dimension k, then any matrix
whose rows form a basis for C is called a generator matrix for C.
Note
A generator matrix for C must have k rows and n columns, and it
must have rank k.

Theorem 2.6.1. A matrix G is a generator matrix for some linear

code C if and only if the rows of G are linearly independent, that is,
if and only if the rank of G is equal to the number of rows of G.

Theorem 2.6.2. If G is a generator matrix for a linear code C, then

any matrix row equivalent to G is also a generator matrix for C. In
particular, any linear code has a generator matrix in RREF.
Example 2.6.1. We find a generator matrix for the code
C={0000, 1110, 0111, 1001}. Using Algorithm 2.5.1,

    [0 0 0 0]   [1 1 1 0]   [1 1 1 0]   [1 1 1 0]
A = [1 1 1 0] → [0 1 1 1] → [0 1 1 1] → [0 1 1 1]
    [0 1 1 1]   [1 0 0 1]   [0 1 1 1]   [0 0 0 0]
    [1 0 0 1]   [0 0 0 0]   [0 0 0 0]   [0 0 0 0]

so

G = [1 1 1 0]
    [0 1 1 1]

is a generator matrix for C. By Algorithm 2.5.2, since the RREF of A is

[1 0 0 1]
[0 1 1 1]
[0 0 0 0]
[0 0 0 0]

G' = [1 0 0 1]
     [0 1 1 1]

is also a generator matrix for C.

2.7 Parity Check Matrices


A matrix H is called a parity-check matrix for a linear code C if
the columns of H form a basis for the dual code C⊥. If C has length
n and dimension k then, since the sum of the dimensions of C
and C⊥ is n, any parity-check matrix for C must have n rows, n−k
columns, and rank n−k.
Theorem 2.7.1. A matrix H is a parity-check matrix for some linear
code C if and only if the columns of H are linearly independent.

Theorem 2.7.2. If H is a parity-check matrix for a linear code C of
length n, then C consists precisely of all words v in K^n such that vH=0.

Theorem 2.7.3. Matrices G and H are generator and parity-check
matrices, respectively, for some linear code C if and only if

1. the rows of G are linearly independent,

2. the columns of H are linearly independent,

3. the number of rows of G plus the number of columns of H equals
the number of columns of G, which equals the number of rows of
H, and

4. GH=0.
Theorem 2.7.4. H is a parity-check matrix for C if and only if H^T is
a generator matrix for C⊥.

Example 2.7.1. We find a parity-check matrix for the code
C={0000, 1110, 0111, 1001} of Example 2.6.1. There we found that

G1 = [1 0 0 1] = [I | X]
     [0 1 1 1]

is a generator matrix for C which is in RREF, with

X = [0 1]
    [1 1]

By Algorithm 2.5.2, we construct

H = [X] = [0 1]
    [I]   [1 1]
          [1 0]
          [0 1]

which is a parity-check matrix for C. Note that vH = 00 for every word
v in C.

2.8 Distance of Linear Code


The distance of a linear code is the minimum weight of any nonzero
codeword. The distance of a linear code can also be determined
from a parity-check matrix for the code.

Theorem 2.8.1. Let H be a parity-check matrix for a linear code C.
Then C has distance d if and only if any set of d−1 rows of H is linearly
independent, and at least one set of d rows of H is linearly dependent.

Example 2.8.1. Let C be the linear code with parity-check matrix

    [1 1 0]
    [0 1 1]
H = [1 0 0]
    [0 1 0]
    [0 0 1]

By inspection it is seen that no two rows of H sum to 000, so any two
rows of H are linearly independent. But rows 1, 3 and 4, for instance,
sum to 000, and hence are linearly dependent. Therefore d−1=2, so
the distance of C is d = 3.
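Since the distance of a linear code is the minimum weight of a nonzero codeword, it can also be computed by brute force over small codes (an illustrative sketch, with codewords as bit-strings):

```python
def distance(C):
    """Distance of a linear code = minimum weight of a nonzero codeword."""
    return min(w.count("1") for w in C if set(w) != {"0"})

C = {"0000", "1011", "0101", "1110"}
print(distance(C))  # 2: the minimum nonzero weight, attained by 0101
```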

2.9 Cosets
If C is a linear code of length n, and if u is any word of length n,
we define the coset of C determined by u to be the set of all words
of the form v+u as v ranges over all the words in C. We denote this
coset by C+u. Thus,

C + u ={v+u|v ∈ C}.

Example 2.9.1. Let C={000, 111}, and let u=101. Then

C+101 = {000+101, 111+101} = {101, 010}.

Note also that

C+111 = {000+111, 111+111} = {111, 000} = C

and

C+010 = {000+010, 111+010} = {010, 101} = C+101.

Theorem 2.9.1. Let C be a linear code of length n, and let u and v be
words of length n.

1. If u is in the coset C + v, then C + u = C + v; that is, each word
in a coset determines that coset.

2. The word u is in the coset C + u.

3. If u + v is in C, then u and v are in the same coset.

4. If u + v is not in C, then u and v are in different cosets.

5. Every word in K^n is contained in one and only one coset of C;
that is, either C + u = C + v, or C + u and C + v have no words
in common.

6. |C + u| = |C|; that is, the number of words in a coset of C is
equal to the number of words in the code C.

7. If C has dimension k, then there are exactly 2^(n−k) different
cosets of C, and each coset contains exactly 2^k words.

8. The code C itself is one of its cosets.

Example 2.9.2. We list the cosets of the code

C = {0000, 1011, 0101, 1110}.

• C itself is a coset (Theorem 2.9.1(8)).

• Every word in C determines the coset C itself (Theorem 2.9.1(1)
and (5)), so we pick a word u in K^4 not in C. For later use in
decoding, it helps to pick u of the smallest weight possible, so
let's take u = 1000. Then we get the coset

C + 1000 = {1000, 0011, 1101, 0110}.

• Now pick another word of small weight in K^4, but not in C or
C+1000, say 0100, and form another coset:

C + 0100 = {0100, 1111, 0001, 1010}.

• Repeating the process with 0010 yields the coset

C + 0010 = {0010, 1001, 0111, 1100}.

• The code C has dimension k = 2. Then

2^(n−k) = 2^(4−2) = 2^2 = 4.

We have listed 4 cosets with 2^k = 2^2 = 4 words each, and every
word in K^4 is accounted for, appearing in exactly one coset.

• Also observe that 0001 + 1010 = 1011 is in C, so 0001 and
1010 are in the same coset, namely C+0100 (see (3)). On the
other hand, 0100 + 0010 = 0110 is not in C, and 0100 and 0010
are in different cosets (see (4)).
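The coset listing of Example 2.9.2 can be reproduced mechanically (an illustrative sketch, with words as bit-strings):

```python
from itertools import product

def add(v, w):
    return "".join(str(int(a) ^ int(b)) for a, b in zip(v, w))

def cosets(C, n):
    """Partition K^n into the cosets of the linear code C."""
    seen, parts = set(), []
    for bits in product("01", repeat=n):
        u = "".join(bits)
        if u not in seen:           # u starts a coset not yet listed
            coset = {add(v, u) for v in C}
            parts.append(coset)
            seen |= coset
    return parts

C = {"0000", "1011", "0101", "1110"}
for part in cosets(C, 4):
    print(sorted(part))  # 4 cosets of 4 words each, as in Example 2.9.2
```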

2.10 MLD for Linear Code
Let C be a linear code. Assume the codeword v in C is transmitted
and the word w is received, resulting in the error pattern u = v + w.
Then w + u = v is in C, so the error pattern u and the received word
w are in the same coset of C, by Theorem 2.9.1(3).

Since error patterns of small weight are the most likely to occur,
here is how MLD works for a linear code C. Upon receiving the word
w, we choose a word u of least weight in the coset C + w (which must
contain w) and conclude that v = w + u was the word sent.

Example 2.10.1. Let C={0000, 1011, 0101, 1110}, whose cosets were
listed in Example 2.9.2.

Suppose w = 1101 is received. Then

C + w = C + 1101 = {1101, 0110, 1000, 0011}.

The word of least weight in this coset is u = 1000, which we choose
as the error pattern. We conclude that

v = w + u = 1101 + 1000 = 0101

was the most likely codeword sent.

Now suppose w = 1111 is received. Then

C + w = C + 1111 = {1111, 0100, 1010, 0001}.

In the coset C + w containing 1111 there are two words of smallest
weight, 0100 and 0001. Since we are doing CMLD, we arbitrarily
select one of them, say u = 0100, as the error pattern, and conclude
that v = w + u = 1111 + 0100 = 1011 was a most likely codeword sent.
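Coset decoding as described above can be sketched in a few lines (an illustrative helper, not from the report; words as bit-strings, ties broken arbitrarily as in CMLD):

```python
def add(v, w):
    return "".join(str(int(a) ^ int(b)) for a, b in zip(v, w))

def mld_decode(C, w):
    """Choose a least-weight word u in the coset C + w; decode w as w + u."""
    coset = [add(v, w) for v in C]
    u = min(coset, key=lambda x: x.count("1"))  # ties broken arbitrarily
    return add(w, u)

C = {"0000", "1011", "0101", "1110"}
print(mld_decode(C, "1101"))  # 0101, as in Example 2.10.1
```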

Theorem 2.10.1. Let C be a linear code of length n, let H be a
parity-check matrix for C, and let w and u be words in K^n.

1. wH = 0 if and only if w is a codeword in C.

2. wH = uH if and only if w and u lie in the same coset of C.

3. If u is the error pattern in a received word w, then uH is the sum
of the rows of H that correspond to the positions in which errors
occurred in transmission.
CONCLUSION

Our aim was to survey coding theory in its breadth of coverage.
Coding theory is the study of the properties of codes and their
respective fitness for specific applications. Codes are used for data
compression, cryptography, error detection and correction, data
transmission and data storage. Codes are studied by various
scientific disciplines, such as information theory, electrical engineering,
mathematics, linguistics and computer science, for the purpose of
designing efficient and reliable data transmission methods. This
typically involves the removal of redundancy and the correction or
detection of errors in the transmitted data. This project work has
helped us to know more about coding theory.
I have much pleasure in conveying my heartfelt thanks to my teachers
and colleagues.
BIBLIOGRAPHY

1. D.G. Hoffman, D.A. Leonard, C.C. Lindner, K.T. Phelps, C.A.
Rodger and J.R. Wall, Coding Theory: The Essentials, Marcel
Dekker, Inc., 1991.

2. Richard W. Hamming, Coding and Information Theory,
Prentice-Hall, Inc., 1986.

3. Steven Roman, Coding and Information Theory, Springer
Science and Business Media, 1992.

4. Wikipedia, Coding Theory,
<https://en.wikipedia.org/wiki/Coding_theory>

5. Wikipedia, Linear Code,
<https://en.wikipedia.org/wiki/Linear_code>