Information Theory and Source Coding

The document provides an overview of Information Theory and Source Coding, covering key concepts such as entropy, mutual information, and various coding techniques like Shannon-Fano and Huffman coding. It explains the principles of transmitting messages through discrete signals, the properties of information, and the importance of efficient data representation. Additionally, it discusses the classification of information sources and the trade-offs involved in coding for data compression and error correction.

Uploaded by

elisks2020


Information Theory & Source Coding

Contents
Information Theory:
➢ Discrete messages
➢ Concept of amount of information and its properties
➢ Average information
➢ Entropy and its properties
➢ Information rate
➢ Mutual information and its properties
➢ Illustrative Problems

Source Coding:
➢ Introduction
➢ Advantages
➢ Shannon–Hartley theorem
➢ Bandwidth–S/N trade-off
➢ Shannon-Fano coding
➢ Huffman coding
➢ Illustrative Problems

Information Theory

Information theory deals with representation and the transfer of information.

There are two fundamentally different ways to transmit messages: via discrete signals
and via continuous signals. For example, the letters of the English alphabet are commonly
thought of as discrete signals.
Information sources

Definition:

The set of source symbols is called the source alphabet, and the elements of the set are
called the symbols or letters.
The number of possible answers $r$ should be linked to “information.”
“Information” should be additive in some sense.
We define the following measure of information:

$$I(U) = \log_b r$$

where $r$ is the number of all possible outcomes of a random message $U$.

Using this definition we can confirm that it has the wanted property of additivity: for $n$ independent messages, each with $r$ possible outcomes,

$$I(U_1, U_2, \ldots, U_n) = \log_b r^n = n \log_b r = n\, I(U)$$

The base $b$ of the logarithm only changes the units, without changing the
amount of information described.

Classification of information sources

1. Discrete memoryless sources.

2. Sources with memory.

A discrete memoryless source (DMS) can be characterized by “the list of the symbols, the
probability assignment to these symbols, and the specification of the rate of generating these
symbols by the source”.

A measure of information should satisfy two requirements:
1. Information should be proportional to the uncertainty of an outcome.
2. Information contained in independent outcomes should add.
Scope of Information Theory

1. Determine the irreducible limit below which a signal cannot be compressed.

2. Deduce the ultimate transmission rate for reliable communication over a noisy channel.
3. Define Channel Capacity - the intrinsic ability of a channel to convey information.

The basic setup in Information Theory has:
– a source,
– a channel and
– destination.
The output from source is conveyed through the channel and received at the destination.
The source is a random variable $S$ which takes symbols from a finite alphabet of $K$ symbols, i.e.,

$$S = \{s_0, s_1, s_2, \ldots, s_{K-1}\}$$

with probabilities

$$P(S = s_k) = p_k, \quad k = 0, 1, 2, \ldots, K-1$$

and

$$\sum_{k=0}^{K-1} p_k = 1$$

The following assumptions are made about the source

1. Source generates symbols that are statistically independent.

2. Source is memoryless, i.e., the choice of the present symbol does not depend on the previous
choices.

Properties of Information

1. Information conveyed by a deterministic event is zero.

2. Information is always non-negative.
3. Information is never lost.
4. More information is conveyed by a less probable event than by a more probable event.

Entropy:
The entropy $H(S)$ of a source is defined as the average information generated by a
discrete memoryless source.

Information content of a symbol:

Let us consider a discrete memoryless source (DMS) denoted by $X$ and having the
alphabet $\{x_1, x_2, x_3, \ldots, x_m\}$. The information content of the symbol $x_i$, denoted by $I(x_i)$, is
defined as

$$I(x_i) = \log_b \frac{1}{P(x_i)} = -\log_b P(x_i)$$

where $P(x_i)$ is the probability of occurrence of symbol $x_i$.

Units of $I(x_i)$:

For two important and one unimportant special cases of $b$ it has been agreed to use the
following names for these units:
$b = 2$ ($\log_2$): bit,

$b = e$ ($\ln$): nat (natural logarithm),

$b = 10$ ($\log_{10}$): Hartley.

The conversion between these units is given by

$$\log_2 a = \frac{\ln a}{\ln 2} = \frac{\log_{10} a}{\log_{10} 2}$$
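The choice of unit can be illustrated with a minimal Python sketch. The function name `self_information` and the example probability 0.25 are illustrative assumptions, not part of the original notes.

```python
import math

def self_information(p: float, base: float = 2.0) -> float:
    """Return I = -log_b(p) for an outcome of probability p, with 0 < p <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log(p) / math.log(base)

p = 0.25                                   # assumed example probability
bits = self_information(p, 2)              # base 2: bits
nats = self_information(p, math.e)         # base e: nats
hartleys = self_information(p, 10)         # base 10: Hartleys
print(bits, nats, hartleys)
```

The three values describe the same amount of information; they differ only by the constant factors $\ln 2$ and $\log_{10} 2$.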

Uncertainty or Entropy (i.e Average information)

Definition:

The information content of individual symbols can fluctuate widely because of the
randomness involved in the selection of symbols, so a source is described by its average information per symbol.

The uncertainty or entropy of a discrete random variable (RV) $U$ is defined as

$$H(U) = E[I(U)] = -\sum_{u \in \mathrm{supp}(P_U)} P_U(u) \log_b P_U(u)$$

where $P_U(\cdot)$ denotes the probability mass function (PMF) of the RV $U$, and where
the support of $P_U$ is defined as

$$\mathrm{supp}(P_U) = \{u \in \mathcal{U} : P_U(u) > 0\}$$

We will usually neglect to mention the support when we sum over $P_U(u) \log_b P_U(u)$, i.e., we
implicitly assume that we exclude all $u$
with zero probability $P_U(u) = 0$.

Entropy for binary source

It may be noted that for a binary source $U$ which generates independent symbols 0 and 1
with equal probability, the source entropy $H(U)$ is

$$H(U) = -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2} = 1 \text{ bit/symbol}$$
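As a quick numerical check of the entropy definition, the following Python sketch computes $H(U)$ from a list of probabilities. The helper name `entropy` and the example distributions are illustrative assumptions.

```python
import math

def entropy(pmf, base=2.0):
    """H(U) = -sum p * log_b(p), summed over the support (p > 0) only."""
    return -sum(p * math.log(p) for p in pmf if p > 0) / math.log(base)

h_fair = entropy([0.5, 0.5])       # equiprobable binary source
h_fixed = entropy([1.0, 0.0])      # deterministic source
h_uniform = entropy([0.25] * 4)    # uniform source with r = 4 values
print(h_fair, h_fixed, h_uniform)
```

The deterministic source gives zero entropy and the uniform source gives $\log_2 4 = 2$ bits, the two extreme cases for a source with four possible values.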

Bounds on H (U)

If U has r possible values, then 0 ≤ H(U) ≤ log r,

Where

H(U)=0 if, and only if, PU(u)=1 for some u,

H(U)=log r if, and only if, PU(u)= 1/r ∀ u.

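Since $0 < P_U(u) \le 1$ on the support, every term $-P_U(u)\log_2 P_U(u)$ is non-negative. Hence, $H(U) \ge 0$. Equality can only be achieved if $-P_U(u)\log_2 P_U(u) = 0$
for all $u \in \mathrm{supp}(P_U)$, i.e., $P_U(u) = 1$ for all $u \in \mathrm{supp}(P_U)$.

To derive the upper bound we use a trick that is quite common in
information theory: we take the difference $H(U) - \log r$ and try to show that it must be nonpositive, using the IT Inequality $\log_b \xi \le (\xi - 1)\log_b e$.

Equality can only be achieved if

1. In the IT Inequality $\xi = 1$, i.e., if $\frac{1}{r \cdot P_U(u)} = 1 \implies P_U(u) = \frac{1}{r}$, for all $u$;

2. $|\mathrm{supp}(P_U)| = r$.

Note that if Condition 1 is satisfied, Condition 2 is also satisfied.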
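The derivation itself does not survive in the text. A sketch consistent with the two equality conditions above, assuming the IT Inequality $\log_b \xi \le (\xi - 1)\log_b e$ with $\xi = \frac{1}{r \cdot P_U(u)}$, is:

```latex
\begin{align*}
H(U) - \log r
  &= -\sum_{u \in \mathrm{supp}(P_U)} P_U(u) \log P_U(u)
     - \log r \sum_{u \in \mathrm{supp}(P_U)} P_U(u) \\
  &= \sum_{u \in \mathrm{supp}(P_U)} P_U(u) \log \frac{1}{r \cdot P_U(u)} \\
  &\le \sum_{u \in \mathrm{supp}(P_U)} P_U(u)
       \left( \frac{1}{r \cdot P_U(u)} - 1 \right) \log e \\
  &= \left( \frac{|\mathrm{supp}(P_U)|}{r} - 1 \right) \log e \\
  &\le (1 - 1) \log e = 0 .
\end{align*}
```

The first inequality is tight exactly under Condition 1 and the second exactly under Condition 2.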

Conditional Entropy

As with the probability of random vectors, there is nothing really new about conditional
probabilities given that a particular event $Y = y$ has occurred.
The conditional entropy or conditional uncertainty of the RV $X$ given the event $Y = y$ is
defined as

$$H(X \mid Y = y) = -\sum_{x \in \mathrm{supp}(P_{X|Y}(\cdot|y))} P_{X|Y}(x|y) \log P_{X|Y}(x|y)$$

Note that the definition is identical to before, except that everything is conditioned
on the event $Y = y$.

Note that the conditional entropy given the event $Y = y$ is a function of $y$. Since $Y$ is
also a RV, we can now average over all possible events $Y = y$ according to the probabilities
of each event. This leads to the averaged conditional entropy

$$H(X \mid Y) = \sum_{y \in \mathrm{supp}(P_Y)} P_Y(y)\, H(X \mid Y = y)$$
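The averaging over the events $Y = y$ can be sketched in Python. The function name `conditional_entropy` and the joint distribution used here are illustrative assumptions.

```python
import math

def conditional_entropy(joint):
    """H(X|Y) in bits, where joint[y][x] = P(X = x, Y = y)."""
    h = 0.0
    for row in joint:                      # one row per value y
        p_y = sum(row)                     # marginal probability P(Y = y)
        if p_y == 0:
            continue                       # y outside the support of P_Y
        # H(X | Y = y): entropy of the row normalised by P(Y = y)
        h_given_y = -sum((p / p_y) * math.log2(p / p_y) for p in row if p > 0)
        h += p_y * h_given_y               # weight by P(Y = y)
    return h

# Assumed example: X is known exactly when Y = 0 and is a fair coin when Y = 1.
joint = [[0.5, 0.0],
         [0.25, 0.25]]
print(conditional_entropy(joint))
```

In this example $H(X|Y=0) = 0$ and $H(X|Y=1) = 1$ bit, each weighted by probability 0.5.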

Mutual Information

Although conditional entropy can tell us when two variables are completely
independent, it is not an adequate measure of dependence. A small value for $H(Y|X)$ may
imply that $X$ tells us a great deal about $Y$ or that $H(Y)$ is small to begin with. Thus, we
measure dependence using mutual information:

$$I(X,Y) = H(Y) - H(Y|X)$$

Mutual information is a measure of the reduction of randomness of a variable given
knowledge of another variable. Using properties of logarithms, we can derive several equivalent definitions:

$$I(X,Y) = H(X) - H(X|Y)$$

$$I(X,Y) = H(X) + H(Y) - H(X,Y) = I(Y,X)$$

In addition to the definitions above, it is useful to realize that mutual information is a
particular case of the Kullback-Leibler divergence. The KL divergence is defined as:

$$D(p \,\|\, q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$$

KL divergence measures the difference between two distributions. It is sometimes called the
relative entropy. It is always non-negative and zero only when $p = q$; however, it is not a
distance because it is not symmetric.

In terms of KL divergence, mutual information is:

$$I(X,Y) = D\big(P(x,y) \,\|\, P(x)P(y)\big)$$

In other words, mutual information is a measure of the difference between the joint
probability and the product of the individual probabilities. These two distributions are equal
only when X and Y are independent, and diverge as X and Y become more dependent.
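The KL-divergence form of mutual information can be checked numerically with a short Python sketch. The function names and the two example joint distributions are illustrative assumptions.

```python
import math

def kl_divergence(p, q):
    """D(p || q) in bits for two PMFs given as equal-length lists."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mutual_information(joint):
    """I(X,Y) = D(P(x,y) || P(x)P(y)), where joint[x][y] = P(X = x, Y = y)."""
    px = [sum(row) for row in joint]            # marginal of X
    py = [sum(col) for col in zip(*joint)]      # marginal of Y
    p = [joint[i][j] for i in range(len(px)) for j in range(len(py))]
    q = [px[i] * py[j] for i in range(len(px)) for j in range(len(py))]
    return kl_divergence(p, q)

independent = [[0.25, 0.25], [0.25, 0.25]]      # X and Y independent
identical = [[0.5, 0.0], [0.0, 0.5]]            # Y is a copy of X
print(mutual_information(independent), mutual_information(identical))
```

The independent case gives zero, and the fully dependent case gives $H(X) = 1$ bit, which is the largest value possible for a binary variable.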

Source coding
Coding theory is the study of the properties of codes and their respective fitness for
specific applications. Codes are used for data compression, cryptography, error
correction, and networking. Codes are studied by various scientific disciplines (such as
information theory, electrical engineering, mathematics, linguistics, and computer
science) for the purpose of designing efficient and reliable data transmission methods.
This typically involves the removal of redundancy and the correction or detection of
errors in the transmitted data.
The aim of source coding is to take the source data and make it smaller.

All source models in information theory may be viewed as random process or random
sequence models. Let us consider the example of a discrete memoryless source
(DMS), which is a simple random sequence model.

A DMS is a source whose output is a sequence of letters such that each letter is
independently selected from a fixed alphabet consisting of letters, say a1, a2, ..., aK.
The letters in the source output sequence are assumed to be random
and statistically independent of each other. A fixed probability assignment for the occurrence of
each letter is also assumed. Let us consider a small example to appreciate the
importance of the probability assignment of the source letters.

Let us consider a source with four letters a1, a2, a3 and a4 with P(a1)=0.5,
P(a2)=0.25, P(a3)=0.13, P(a4)=0.12. Let us decide to go for binary coding of these
four source letters. While this can be done in multiple ways, two encoded representations
are shown below:

Code Representation #1:

a1: 00, a2: 01, a3: 10, a4: 11

Code Representation #2:

a1: 0, a2: 10, a3: 001, a4: 110

It is easy to see that in method #1 the probability assignment of a source letter has not
been considered and all letters have been represented by two bits each. However, in
the second method only a1 has been encoded in one bit, a2 in two bits and the
remaining two in three bits. It is easy to see that the average number of bits to be used
per source letter for the two methods is not the same: $\bar{n} = 2$ bits per
letter for method #1 and $\bar{n} = 0.5(1) + 0.25(2) + 0.13(3) + 0.12(3) = 1.75 < 2$ bits per letter for method #2. So, if we consider the issue of encoding
a long sequence of letters, we have to transmit fewer bits following the second method. This
is an important aspect of the source coding operation in general. At this point, let us
note:
a) We observe that assignment of a small number of bits to more probable letters and
assignment of a larger number of bits to less probable letters (or symbols) may lead to
an efficient source encoding scheme.

b) However, one has to take additional care while transmitting the encoded letters. A
careful inspection of the binary representation of the symbols in method #2 reveals
that it may lead to confusion (at the decoder end) in deciding the end of the binary
representation of a letter and the beginning of the subsequent letter. For example, the codeword 0 of a1 is the beginning of the codeword 001 of a3.

So a source-encoding scheme should ensure that

1) The average number of coded bits (or letters in general) required per source letter
is as small as possible and
2) The source letters can be fully retrieved from a received encoded sequence.
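Both requirements can be checked for the two code representations of the four-letter example with a small Python sketch. The helper names `average_length` and `is_prefix_free` are illustrative assumptions; the prefix test is a sufficient check for unambiguous decoding, not the only possible one.

```python
probs = {"a1": 0.5, "a2": 0.25, "a3": 0.13, "a4": 0.12}
code1 = {"a1": "00", "a2": "01", "a3": "10", "a4": "11"}
code2 = {"a1": "0", "a2": "10", "a3": "001", "a4": "110"}

def average_length(probs, code):
    """Average number of code bits per source letter."""
    return sum(probs[s] * len(code[s]) for s in probs)

def is_prefix_free(code):
    """True if no codeword is the beginning of another codeword."""
    words = list(code.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)

for name, code in (("#1", code1), ("#2", code2)):
    print(name, average_length(probs, code), is_prefix_free(code))
```

Representation #2 is shorter on average but fails the prefix test, which is the source of the decoding confusion noted in b).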

Shannon-Fano Code

Shannon–Fano coding, named after Claude Elwood Shannon and Robert Fano, is a technique
for constructing a prefix code based on a set of symbols and their probabilities. It is
suboptimal in the sense that it does not achieve the lowest possible expected codeword length
like Huffman coding; however unlike Huffman coding, it does guarantee that all codeword
lengths are within one bit of their theoretical ideal I(x) =−log P(x).

In Shannon–Fano coding, the symbols are arranged in order from most probable to least
probable, and then divided into two sets whose total probabilities are as close as possible to
being equal. All symbols then have the first digits of their codes assigned; symbols in the first
set receive "0" and symbols in the second set receive "1". As long as any sets with more than
one member remain, the same process is repeated on those sets, to determine successive
digits of their codes. When a set has been reduced to one symbol, of course, this means the
symbol's code is complete and will not form the prefix of any other symbol's code.

The algorithm works, and it produces fairly efficient variable-length encodings; when the two
smaller sets produced by a partitioning are in fact of equal probability, the one bit of
information used to distinguish them is used most efficiently. Unfortunately, Shannon–Fano
does not always produce optimal prefix codes.

For this reason, Shannon–Fano is almost never used; Huffman coding is almost as
computationally simple and produces prefix codes that always achieve the lowest expected
code word length. Shannon–Fano coding is used in the IMPLODE compression method,
which is part of the ZIP file format, where it is desired to apply a simple algorithm with high
performance and minimum requirements for programming.

Shannon-Fano Algorithm:
A Shannon–Fano tree is built according to a specification designed to define an
effective code table. The actual algorithm is simple:
1. For a given list of symbols, develop a corresponding list of probabilities or frequency
counts so that each symbol’s relative frequency of occurrence is known.

2. Sort the list of symbols according to frequency, with the most frequently occurring
symbols at the left and the least common at the right.
3. Divide the list into two parts, with the total frequency count of the left part being as
close to the total of the right as possible.
4. The left part of the list is assigned the binary digit 0, and the right part is assigned
the digit 1. This means that the codes for the symbols in the first part will all start
with 0, and the codes in the second part will all start with 1.

5. Recursively apply steps 3 and 4 to each of the two halves, subdividing groups
and adding bits to the codes until each symbol has become a corresponding code leaf
on the tree.

Example:
The source of information A generates the symbols {A0, A1, A2, A3, A4} with the
corresponding probabilities {0.4, 0.3, 0.15, 0.1, 0.05}. Encoding the source symbols
using a fixed-length binary encoder and a Shannon-Fano encoder gives

Source Symbol   Pi     Binary Code   Shannon-Fano
A0              0.4    000           0
A1              0.3    001           10
A2              0.15   010           110
A3              0.1    011           1110
A4              0.05   100           1111

Entropy H = 2.0087 bits/symbol; average length Lavg = 3 bits/symbol for the binary code and 2.05 bits/symbol for the Shannon-Fano code.
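The recursive splitting procedure can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `shannon_fano` is an assumption, and ties between equally good split points are resolved by taking the leftmost one.

```python
def shannon_fano(symbols):
    """symbols: list of (name, probability), sorted by descending probability.
    Returns a dict mapping each name to its binary codeword."""
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(p for _, p in symbols)
    running, best_split, best_diff = 0.0, 1, float("inf")
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs(2 * running - total)      # |left total - right total|
        if diff < best_diff:                 # strict: keep the leftmost best split
            best_split, best_diff = i, diff
    left = shannon_fano(symbols[:best_split])    # this part gets prefix 0
    right = shannon_fano(symbols[best_split:])   # this part gets prefix 1
    codes = {s: "0" + c for s, c in left.items()}
    codes.update({s: "1" + c for s, c in right.items()})
    return codes

source = [("A0", 0.4), ("A1", 0.3), ("A2", 0.15), ("A3", 0.1), ("A4", 0.05)]
codes = shannon_fano(source)
l_avg = sum(p * len(codes[s]) for s, p in source)
print(codes, l_avg)
```

For the example source this reproduces the codewords in the Shannon-Fano column and the average length of 2.05 bits/symbol.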

Shannon-Fano coding is a top-down approach: the code tree is constructed from the root towards the leaves, and the codewords of the example above are read off along its branches.

Binary Huffman Coding (an optimum variable-length source coding scheme)
In Binary Huffman Coding each source letter is converted into a binary code
word. It is a prefix condition code ensuring minimum average length per source letter in
bits.
Let the source letters a1, a2, ..., aK have probabilities P(a1), P(a2), ...,
P(aK) and let us assume that P(a1) ≥ P(a2) ≥ P(a3) ≥ ... ≥ P(aK).

We now consider a simple example to illustrate the steps for Huffman coding.

Steps to calculate Huffman Coding

Example: Let us consider a discrete memoryless source with six letters having

P(a1)=0.3, P(a2)=0.2, P(a3)=0.15, P(a4)=0.15, P(a5)=0.12 and P(a6)=0.08.

(1) Arrange the letters in descending order of their probability (here they are already
arranged).
(2) Consider the last two probabilities and tie them up. Assign, say, 0
to the last digit of the representation for the least probable letter (a6) and 1 to the last
digit of the representation for the second least probable letter (a5). That is, assign ‘1’
to the upper arm of the tree and ‘0’ to the lower arm.

(3) Now, add the two probabilities and imagine a new letter, say b1, substituting for a6
and a5. So P(b1)=0.2. Check whether a4 and b1 are the least likely letters. If not,
reorder the letters as per Step #1 and add the probabilities of the two least likely letters.
For our example, it leads to:
P(a1)=0.3, P(a2)=0.2, P(b1)=0.2, P(a3)=0.15 and P(a4)=0.15

(4) Now go to Step #2 and start with the reduced ensemble consisting of a1, a2, a3,
a4 and b1. In our example the two least likely letters are now a3 and a4.
Here we imagine another letter b2, with P(b2)=0.3.

(5) Continue till the first digits of the most reduced ensemble of two letters are
assigned a ‘1’ and a ‘0’.

Again go back to Step #2: P(a1)=0.3, P(b2)=0.3, P(a2)=0.2 and P(b1)=0.2.
Now we consider the last two probabilities, those of a2 and b1, and combine them into b3.
So, P(b3)=0.4. Following Step #2 again, we get P(b3)=0.4, P(a1)=0.3 and
P(b2)=0.3.
The next two probabilities, those of a1 and b2, combine into b4
with P(b4)=0.6. Finally we get only two probabilities, P(b4)=0.6 and P(b3)=0.4.

(6) Now, read the code tree inward, starting from the root, and construct the
code words. The first digit of a codeword appears first while reading the code tree
inward.

Hence, the final representation is: a1=11, a2=01, a3=101, a4=100, a5=001, a6=000.
A few observations on the preceding example

1. The event with maximum probability has the least number of bits.

2. The prefix condition is satisfied: no representation of one letter is a prefix of another.

The prefix condition says that the representation of any letter should not be the beginning (prefix) of the representation of any
other letter.

3. Average length per letter (in bits) after coding is

$$\bar{n} = \sum_i P(a_i)\, n_i = 2.5 \text{ bits/letter}$$

where $n_i$ is the length of the codeword for $a_i$.

4. Note that the entropy of the source is H(X)=2.465 bits/symbol. The average length
per source letter after Huffman coding is a little more than, but close to, the source
entropy. In fact, a celebrated theorem due to C. E. Shannon (the source coding theorem) sets the
limiting value of the average length of code words from a DMS: the average length of a uniquely decodable binary code cannot be smaller than H(X), and an optimum prefix code satisfies $H(X) \le \bar{n} < H(X) + 1$.
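The merging procedure and the observations above can be reproduced with a short Python sketch built on the standard-library `heapq` module. The function name `huffman` is an assumption, and ties between equal probabilities are broken by insertion order, so the individual codewords may differ from the ones in the worked example while the codeword lengths and the average length agree.

```python
import heapq
import math
from itertools import count

def huffman(probs):
    """Binary Huffman code for a dict {letter: probability}."""
    tiebreak = count()                       # avoids comparing dicts on ties
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, low = heapq.heappop(heap)     # least probable group: digit 0
        p1, _, high = heapq.heappop(heap)    # second least probable: digit 1
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

probs = {"a1": 0.3, "a2": 0.2, "a3": 0.15, "a4": 0.15, "a5": 0.12, "a6": 0.08}
code = huffman(probs)
n_bar = sum(probs[s] * len(code[s]) for s in probs)
h_x = -sum(p * math.log2(p) for p in probs.values())
print(code, n_bar, h_x)
```

The average length comes out at 2.5 bits/letter against an entropy of about 2.465 bits/symbol, consistent with observations 3 and 4.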

Shannon–Hartley theorem

In information theory, the Shannon–Hartley theorem tells the maximum rate at which
information can be transmitted over a communications channel of a specified bandwidth in
the presence of noise. It is an application of the noisy-channel coding theorem to the
archetypal case of a continuous-time analog communications channel subject to Gaussian
noise. The theorem establishes Shannon's channel capacity for such a communication link, a
bound on the maximum amount of error-free information per time unit that can be transmitted
with a specified bandwidth in the presence of the noise interference, assuming that the signal
power is bounded, and that the Gaussian noise process is characterized by a known power or
power spectral density.
The law is named after Claude Shannon and Ralph Hartley.

Hartley Shannon Law

The theory behind designing and analyzing channel codes rests on Shannon’s noisy
channel coding theorem. It puts an upper limit on the amount of information you can
send over a noisy channel using a perfect channel code. This is given by the following
equation:

$$C = B \log_2(1 + \mathrm{SNR})$$

where C is the upper bound on the capacity of the channel (bit/s), B is the
bandwidth of the channel (Hz) and SNR is the signal-to-noise ratio (unitless, as a linear power ratio).

Bandwidth-S/N Tradeoff

The expression of the channel capacity of the Gaussian channel makes intuitive
sense:

1. As the bandwidth of the channel increases, it is possible to make faster
changes in the information signal, thereby increasing the information rate.

2. As S/N increases, one can increase the information rate while still preventing errors
due to noise.

3. For no noise, S/N tends to infinity and an infinite information rate is
possible irrespective of bandwidth.

Thus we may trade off bandwidth for SNR. For example, if S/N = 7 and B = 4 kHz,
then the channel capacity is C = 4000 log2(1 + 7) = 12 × 10^3 bits/s. If the SNR increases to S/N = 15 and B
is decreased to 3 kHz, the channel capacity remains the same, since 3000 log2(1 + 15) = 12 × 10^3 bits/s. However, as B tends to
infinity, the channel capacity does not become infinite since, with an increase in bandwidth,
the noise power also increases. If the noise power spectral density is η/2, then the total
noise power is N = ηB, so the Shannon-Hartley law becomes

$$C = B \log_2\left(1 + \frac{S}{\eta B}\right)$$
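The trade-off and the finite limit for large bandwidth can be checked numerically with a Python sketch. The function names are assumptions, and the values S = 1 and η = 1 in the second part are arbitrary illustrative units, not data from the notes.

```python
import math

def capacity(bandwidth_hz, snr):
    """Shannon-Hartley capacity C = B * log2(1 + S/N) in bit/s."""
    return bandwidth_hz * math.log2(1 + snr)

def capacity_with_psd(bandwidth_hz, signal_power, eta):
    """C = B * log2(1 + S / (eta * B)), with noise power N = eta * B."""
    x = signal_power / (eta * bandwidth_hz)
    return bandwidth_hz * math.log1p(x) / math.log(2)   # log1p is accurate for small x

print(capacity(4000, 7), capacity(3000, 15))            # same capacity, different B and S/N

# Assumed units S = 1, eta = 1: the capacity saturates as B grows.
for b in (1, 10, 1e3, 1e6):
    print(b, capacity_with_psd(b, 1.0, 1.0))
print((1.0 / 1.0) * math.log2(math.e))                  # limiting value (S/eta) * log2(e)
```

As B grows, the capacity approaches (S/η) log2 e, roughly 1.44 S/η, instead of growing without bound.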

Linear Block Codes

Introduction
Coding theory is concerned with the transmission of data
across noisy channels and the recovery of corrupted messages. It has found
widespread applications in electrical engineering, digital communication,
mathematics and computer science. The transmission of data over the channel depends
upon two parameters: transmitted power and channel bandwidth. The power spectral
density of the channel noise and these two parameters determine the signal-to-noise power ratio.

The signal-to-noise power ratio determines the probability of error of the modulation
scheme. Errors are introduced in the data when it passes through the channel, because the channel
noise interferes with the signal and the signal power is reduced. For a given signal-to-noise ratio, the
error probability can be reduced further by using coding techniques. Coding techniques
also reduce the signal-to-noise power ratio required for a fixed probability of error.

Principle of block coding

For a block of k message bits, (n-k) parity bits or check bits are added. Hence the
total number of bits at the output of the channel encoder is n. Such codes are called (n,k) block
codes. The figure illustrates this concept.

[Figure: Functional block diagram of block coder. A message block of k bits enters the channel encoder; the output code block of n bits consists of the k message bits and (n-k) check bits.]

Types are

Systematic codes:

In a systematic block code, the message bits appear at the beginning of the code
word. The message bits are transmitted first, followed by the check bits, within a block.

Nonsystematic codes:

In a nonsystematic block code it is not possible to identify the message bits and
check bits separately. They are mixed in the block.

In what follows we consider binary codes, i.e., all the transmitted digits are binary.

Linear Block Codes

A code is linear if the sum of any two code vectors produces another code vector.
This shows that any code vector can be expressed as a linear combination of other code
vectors. Consider a particular code vector consisting of message bits m1, m2, m3, ..., mk and
check bits c1, c2, c3, ..., cq. Then this code vector can be written as

X = (m1, m2, m3, ..., mk, c1, c2, c3, ..., cq)

Here q = n-k,

where q is the number of redundant bits added by the encoder.

The code vector can also be written as

X = (M | C)

where M = k-bit message vector

C = q-bit check vector

The main aim of a linear block code is to generate check bits, which are
mainly used for error detection and correction.

Example :

The (7, 4) linear code has the following matrix as a generator matrix

If u = (1 1 0 1) is the message to be encoded, its corresponding code word would be

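A linear systematic (n, k) code is completely specified by a k × n matrix G of the
following form

$$G = [\,P \mid I_k\,], \qquad P = [p_{ij}], \quad 0 \le i \le k-1, \; 0 \le j \le n-k-1$$

where $I_k$ is the k × k identity matrix and P is the k × (n-k) parity submatrix.

Let u = (u0, u1, … , uk-1) be the message to be encoded. The corresponding code word
is

$$v = (v_0, v_1, \ldots, v_{n-1}) = u \cdot G$$

The components of v are

$$v_{n-k+i} = u_i \quad \text{for } 0 \le i \le k-1$$
$$v_j = u_0 p_{0j} + u_1 p_{1j} + \cdots + u_{k-1} p_{k-1,j} \quad \text{for } 0 \le j \le n-k-1$$

The n – k equations given by the second relation are called the parity-check equations of the
code.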

Example for Codeword

The matrix G given by

Let u = (u0, u1, u2, u3) be the message to be encoded and v = (v0, v1, v2, v3, v4, v5,v6) be
the corresponding code word

Solution :

By matrix multiplication, the digits of the code word v can be determined.

If the generator matrix of an (n, k) linear code is in the systematic form $G = [\,P \mid I_k\,]$, with P the k × (n-k) parity submatrix, the parity-check
matrix may take the following form

$$H = [\,I_{n-k} \mid P^T\,]$$
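The generator matrix of the worked example does not survive in the text, so the following Python sketch uses an assumed parity submatrix `P` for a (7, 4) code with G = [P | I_4]; the matrix values and the function names are hypothetical choices made for illustration. The sketch encodes the message u = (1 1 0 1) and verifies that the code word satisfies all parity checks.

```python
# Assumed parity submatrix P (k x (n-k)) of a (7, 4) code; G = [P | I_4].
P = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
    [1, 0, 1],
]

def encode(u, P):
    """Systematic encoding v = u . G with G = [P | I_k], arithmetic modulo 2."""
    k, q = len(P), len(P[0])
    parity = [sum(u[i] * P[i][j] for i in range(k)) % 2 for j in range(q)]
    return parity + list(u)          # (v_0 .. v_{n-k-1}, u_0 .. u_{k-1})

def parity_check_matrix(P):
    """H = [I_{n-k} | P^T], returned as a list of n-k rows."""
    k, q = len(P), len(P[0])
    return [[int(i == j) for j in range(q)] + [P[r][i] for r in range(k)]
            for i in range(q)]

v = encode([1, 1, 0, 1], P)
H = parity_check_matrix(P)
checks = [sum(a * b for a, b in zip(v, row)) % 2 for row in H]
print(v, checks)
```

With this assumed P the message bits appear unchanged in the last four positions of v, and every parity check evaluates to zero.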

Encoding circuit for a linear systematic (n,k) code is shown below.

Figure: Encoding Circuit

For the block of k=4 message bits, (n-k) parity bits or check bits are added. Hence
the total bits at the output of channel encoder are n=7. The encoding circuit for (7, 4)
systematic code is shown below.

Figure: Encoding Circuit for (7,4) code

Syndrome and Error Detection

Let v = (v0, v1, …, vn-1) be a code word that was transmitted over a noisy channel. Let
r = (r0, r1, …, rn-1) be the received vector at the output of the channel. Then

e = r + v = (e0, e1, …, en-1)

is an n-tuple called the
error vector (or error pattern), with
ei = 1 for ri ≠ vi
ei = 0 for ri = vi

Upon receiving r, the decoder must first determine whether r contains transmission
errors. If the presence of errors is detected, the decoder will either take actions to locate and
correct the errors (FEC) or request a retransmission of v (ARQ).
When r is received, the decoder computes the following (n – k)-tuple:

$$s = r \cdot H^T = (s_0, s_1, \ldots, s_{n-k-1})$$

where s is called the syndrome of r.

The syndrome is not a function of the transmitted codeword but only of the error
pattern, since $s = r \cdot H^T = (v + e) \cdot H^T = e \cdot H^T$. So we can construct a table of error patterns with their corresponding
syndromes.

We have s = 0 if and only if r is a code word, in which case the receiver accepts r as the
transmitted code word. We have s ≠ 0 if and only if r is not a code word, in which case the presence
of errors has been detected. When the error pattern e is identical to a nonzero code word (i.e.,
r contains errors but $s = r \cdot H^T = 0$), the errors go unnoticed; error patterns of this kind are called undetectable error
patterns. Since there are $2^k - 1$ nonzero code words, there are $2^k - 1$ undetectable error
patterns. The syndrome digits are as follows:

$$s_0 = r_0 + r_{n-k}\,p_{00} + r_{n-k+1}\,p_{10} + \cdots + r_{n-1}\,p_{k-1,0}$$
$$s_1 = r_1 + r_{n-k}\,p_{01} + r_{n-k+1}\,p_{11} + \cdots + r_{n-1}\,p_{k-1,1}$$
$$\vdots$$
$$s_{n-k-1} = r_{n-k-1} + r_{n-k}\,p_{0,n-k-1} + r_{n-k+1}\,p_{1,n-k-1} + \cdots + r_{n-1}\,p_{k-1,n-k-1}$$

The syndrome s is the vector sum of the received parity digits (r0, r1, …, rn-k-1) and the parity-check
digits recomputed from the received information digits (rn-k, rn-k+1, …, rn-1).
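The syndrome digits can be computed directly from these equations. The Python sketch below uses an assumed (7, 4) parity submatrix `P`, an assumed code word and an assumed single-bit error; all three are hypothetical illustration values.

```python
# Assumed parity submatrix P = [p_ij] of a (7, 4) systematic code.
P = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
    [1, 0, 1],
]

def syndrome(r, P):
    """s_j = r_j + r_{n-k} p_{0j} + ... + r_{n-1} p_{k-1,j} (modulo 2)."""
    k, q = len(P), len(P[0])
    return [(r[j] + sum(r[q + i] * P[i][j] for i in range(k))) % 2
            for j in range(q)]

v = [0, 0, 0, 1, 1, 0, 1]                 # a code word of the assumed code
e = [0, 0, 0, 0, 1, 0, 0]                 # assumed single error in position 4
r = [a ^ b for a, b in zip(v, e)]         # received vector r = v + e

print(syndrome(v, P), syndrome(r, P), syndrome(e, P))
```

The code word has an all-zero syndrome, while r and e share the same syndrome, which illustrates that the syndrome depends only on the error pattern.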

The below figure shows the syndrome circuit for a linear systematic (n, k) code.

Figure: Syndrome Circuit

Error detection and error correction capabilities of linear block codes:

If the minimum distance of a block code C is $d_{min}$, any two distinct code vectors of C
differ in at least $d_{min}$ places. A block code with minimum distance $d_{min}$ is capable of detecting
all error patterns of $d_{min} - 1$ or fewer errors.
However, it cannot detect all error patterns of $d_{min}$ errors because there exists at least
one pair of code vectors that differ in $d_{min}$ places and there is an error pattern of $d_{min}$ errors
that will carry one into the other. The random-error-detecting capability of a block code with
minimum distance $d_{min}$ is $d_{min} - 1$.

An (n, k) linear code is capable of detecting $2^n - 2^k$ error patterns of length n.

Among the $2^n - 1$ possible nonzero error patterns, there are $2^k - 1$ error patterns that are
identical to the $2^k - 1$ nonzero code words. If any of these $2^k - 1$ error patterns occurs, it
alters the transmitted code word v into another code word w, thus w will be received and its
syndrome is zero.

If an error pattern is not identical to a nonzero code word, the received vector r will
not be a code word and the syndrome will not be zero.

Hamming Codes:

These codes and their variations have been widely used for error control
in digital communication and data storage systems.

For any positive integer m ≥ 3, there exists a Hamming code with the following parameters:
Code length: $n = 2^m - 1$
Number of information symbols: $k = 2^m - m - 1$
Number of parity-check symbols: n – k = m
Error-correcting capability: t = 1 ($d_{min} = 3$)

The parity-check matrix H of this code consists of all the $2^m - 1$ nonzero m-tuples as its columns.

In systematic form, the columns of H are arranged in the following form

$$H = [\,I_m \;\; Q\,]$$
where $I_m$ is an m × m identity matrix.
The submatrix Q consists of $2^m - m - 1$ columns which are the m-tuples of weight 2 or
more. The columns of Q may be arranged in any order without affecting the distance property
and weight distribution of the code.

In systematic form, the generator matrix of the code is

$$G = [\,Q^T \;\; I_{2^m - m - 1}\,]$$
where $Q^T$ is the transpose of Q and $I_{2^m - m - 1}$ is a $(2^m - m - 1) \times (2^m - m - 1)$
identity matrix.
Since the columns of H are nonzero and distinct, no two columns add to zero. Since H
consists of all the nonzero m-tuples as its columns, the vector sum of any two columns, say $h_i$
and $h_j$, must also be a column in H, say $h_l$, so that $h_i + h_j + h_l = 0$. Hence the minimum distance of a Hamming
code is exactly 3.
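Because every nonzero m-tuple is a column of H, the syndrome of a single error equals the column at the error position. The Python sketch below illustrates this for m = 3; the function names, the column ordering inside Q and the test code word are assumptions.

```python
from itertools import product

def hamming_columns(m):
    """Columns of H = [I_m | Q]: unit vectors, then all m-tuples of weight >= 2."""
    identity = [tuple(int(i == j) for i in range(m)) for j in range(m)]
    q_cols = [c for c in product((0, 1), repeat=m) if sum(c) >= 2]
    return identity + q_cols                 # n = 2^m - 1 columns in total

def correct_single_error(r, columns):
    """Flip the bit whose column of H equals the syndrome s = r . H^T."""
    m = len(columns[0])
    s = tuple(sum(r[i] * columns[i][j] for i in range(len(r))) % 2
              for j in range(m))
    if not any(s):
        return list(r)                       # zero syndrome: accept r as it is
    fixed = list(r)
    fixed[columns.index(s)] ^= 1             # error position found from the syndrome
    return fixed

cols = hamming_columns(3)                    # the (7, 4) Hamming code
codeword = [0, 1, 1, 1, 0, 0, 0]             # parity (0,1,1) for message (1,0,0,0)
received = list(codeword)
received[5] ^= 1                             # assumed single error in position 5
print(correct_single_error(received, cols))
```

Any single flipped bit is restored this way; two errors would produce the column of some third position and be miscorrected, in line with $d_{min} = 3$.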

If l columns are deleted from the parity-check matrix H, an m × ($2^m - l - 1$) matrix H' is obtained. Using H' as a parity-check matrix, a shortened Hamming code can be obtained with
the following parameters:
Code length: $n = 2^m - l - 1$
Number of information symbols: $k = 2^m - m - l - 1$
Number of parity-check symbols: n – k = m
Minimum distance: $d_{min} \ge 3$
If the deleted columns are chosen so that every column of H' has odd weight, the minimum distance becomes 4. In that case, when a single error occurs during the transmission of a code vector, the resultant
syndrome is nonzero and it contains an odd number of 1’s ($e \cdot H'^T$ corresponds to a column
in H'). When a double error occurs, the syndrome is nonzero, but it contains an even number of
1’s.
Decoding can be accomplished in the following manner:
i) If the syndrome s is zero, we assume that no error occurred.
ii) If s is nonzero and it contains an odd number of 1’s, assume that a single error
occurred. The error pattern of a single error that corresponds to s is added to the received
vector for error correction.
iii) If s is nonzero and it contains an even number of 1’s, an uncorrectable error
pattern has been detected.

Problems:

Binary Cyclic codes:
Cyclic codes are a subclass of linear block codes.

Cyclic codes can be in systematic or nonsystematic form.

Definition:

A linear code is called a cyclic code if every cyclic shift of a code vector produces
another code vector.

Properties of cyclic codes:

(i) Linearity (ii) Cyclic

Linearity: This property states that the sum of any two code words is also a valid code word.

X1 + X2 = X3

Cyclic: Every cyclic shift of a valid code vector produces another valid code vector.

Consider an n-bit code vector

$X = \{x_{n-1}, x_{n-2}, \ldots, x_1, x_0\}$

Here $x_{n-1}, x_{n-2}, \ldots, x_1, x_0$ represent the individual bits of the code vector X.

If the above code vector is cyclically shifted to the left, one cyclic shift of X gives

$X' = \{x_{n-2}, \ldots, x_1, x_0, x_{n-1}\}$

Every bit is shifted to the left by one position, and the leftmost bit wraps around to the rightmost position.

Algebraic Structures of Cyclic Codes:

The code words can be represented by a polynomial. For example, consider the n-bit code
word $X = \{x_{n-1}, x_{n-2}, \ldots, x_1, x_0\}$.

This code word can be represented by a polynomial of degree less than or equal to (n-1),
i.e.,

$$X(p) = x_{n-1}p^{n-1} + x_{n-2}p^{n-2} + \cdots + x_1 p + x_0$$

Here X(p) is a polynomial of degree at most (n-1) and

p is the arbitrary variable of the polynomial.

The power of p represents the position of the codeword bit, i.e.,

$p^{n-1}$: MSB

$p^0$: LSB

$p^1$: second bit from the LSB side

The polynomial representation is used for the following reasons:

(i) These are algebraic codes, so algebraic operations such as addition,
multiplication, division and subtraction become very simple.
(ii) The positions of the bits are represented with the help of the powers of p in the
polynomial.

Generation of code words in Non-systematic form:

Let $M = \{m_{k-1}, m_{k-2}, \ldots, m_1, m_0\}$ be the k bits of the message vector. Then it can be
represented by the polynomial

$$M(p) = m_{k-1}p^{k-1} + m_{k-2}p^{k-2} + \cdots + m_1 p + m_0$$

Let X(p) be the code word polynomial:

$$X(p) = M(p)\,G(p)$$

G(p) is the generating polynomial of degree q.

For (n,k) cyclic codes, q = n-k represents the number of parity bits.

The generating polynomial is given as

$$G(p) = p^q + g_{q-1}p^{q-1} + \cdots + g_1 p + 1$$

where $g_{q-1}, g_{q-2}, \ldots, g_1$ are the binary coefficients of the generating polynomial.

If M1, M2, M3, etc. are other message vectors, then the corresponding
code vectors can be calculated as

X1(p) = M1(p) G(p)

X2(p) = M2(p) G(p)

X3(p) = M3(p) G(p)
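Nonsystematic encoding is a polynomial multiplication with modulo-2 coefficients. The Python sketch below stores a polynomial as an integer whose bit i is the coefficient of p^i. The generator G(p) = p^3 + p + 1 of a (7, 4) cyclic code and the helper names are assumptions chosen for illustration.

```python
def gf2_mul(a, b):
    """Multiply two binary polynomials (bit i = coefficient of p^i), modulo 2."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add (XOR) the shifted copy of a
        a <<= 1
        b >>= 1
    return result

def cyclic_shift(x, n):
    """One cyclic left shift of an n-bit code vector."""
    return ((x << 1) & ((1 << n) - 1)) | (x >> (n - 1))

N, K = 7, 4
G = 0b1011                                   # assumed G(p) = p^3 + p + 1
codewords = {gf2_mul(m, G) for m in range(1 << K)}

x = gf2_mul(0b1001, G)                       # M(p) = p^3 + 1
print(format(x, "07b"))
print(all(cyclic_shift(c, N) in codewords for c in codewords))
```

The second line printed checks the defining property: every cyclic shift of a code vector is again one of the 16 code vectors generated by this G(p).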

Generation of Code vectors in systematic form:

$X = (k \text{ message bits} : (n-k) \text{ check bits}) = (m_{k-1}, m_{k-2}, \ldots, m_1, m_0 : c_{q-1}, c_{q-2}, \ldots, c_1, c_0)$

$$C(p) = c_{q-1}p^{q-1} + c_{q-2}p^{q-2} + \cdots + c_1 p + c_0$$

The check bit polynomial is obtained by

$$C(p) = \mathrm{rem}\left[\frac{p^q M(p)}{G(p)}\right]$$
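The remainder computation can be sketched in Python with polynomials stored as integers (bit i is the coefficient of p^i). The generator G(p) = p^3 + p + 1, the message 1001 and the helper names are assumptions for illustration.

```python
def gf2_mod(a, g):
    """Remainder of the binary polynomial a divided by g (modulo-2 long division)."""
    dg = g.bit_length() - 1                  # degree of g
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)  # cancel the leading term of a
    return a

def encode_systematic(m, g):
    """X = (message bits : check bits) with C(p) = rem[p^q M(p) / G(p)]."""
    q = g.bit_length() - 1
    shifted = m << q                         # p^q * M(p)
    return shifted | gf2_mod(shifted, g)     # append the q check bits

G = 0b1011                                   # assumed G(p) = p^3 + p + 1
x = encode_systematic(0b1001, G)
print(format(x, "07b"), gf2_mod(x, G))
```

The message bits appear unchanged in the four leading positions, and the resulting code vector leaves a zero remainder when divided by G(p).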

Generator and Parity Check Matrices of cyclic codes:

Non systematic form of generator matrix:

Since cyclic codes are a subclass of linear block codes, generator and parity check matrices
can also be defined for cyclic codes.

The generator matrix has size k × n.

Let the generator polynomial be given by

$$G(p) = p^q + g_{q-1}p^{q-1} + \cdots + g_1 p + 1$$

Multiply both sides of this polynomial by $p^i$, i.e.,

$$p^i G(p) = p^{i+q} + g_{q-1}p^{i+q-1} + \cdots + g_1 p^{i+1} + p^i, \quad i = (k-1), (k-2), \ldots, 2, 1, 0$$

The coefficients of these k polynomials form the rows of the nonsystematic generator matrix.

Systematic form of generator matrix:

The systematic form of the generator matrix is given by

$G = [I_k : P_{k \times q}]_{k \times n}$

The t-th row of this matrix is represented in polynomial form as follows:

t-th row of G $= p^{n-t} + R_t(p)$

where $t = 1, 2, 3, \ldots, k$

Let us divide $p^{n-t}$ by the generator polynomial G(p) and express the result of this division in terms of quotient and remainder, i.e.,

$\dfrac{p^{n-t}}{G(p)} = \text{Quotient} + \dfrac{\text{Remainder}}{G(p)}$

Here the remainder is a polynomial of degree less than q, since the degree of G(p) is q.

The degree of the quotient depends on the value of t.

Let us write Remainder $= R_t(p)$ and Quotient $= Q_t(p)$. Then

$\dfrac{p^{n-t}}{G(p)} = Q_t(p) + \dfrac{R_t(p)}{G(p)}$

$p^{n-t} = Q_t(p)G(p) + R_t(p)$

with $t = 1, 2, \ldots, k$. Since addition and subtraction are the same operation in modulo-2 arithmetic,

$p^{n-t} + R_t(p) = Q_t(p)G(p)$

which represents the t-th row of the systematic generator matrix.

Parity check matrix: $H = [P^{T} : I_q]_{q \times n}$
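The row formula p^(n-t) + R_t(p) and the parity check matrix H = [P^T : I_q] can be checked numerically. The following is a sketch under stated assumptions: a (7,4) code with G(p) = p^3 + p + 1, polynomials stored as integers (bit i = coefficient of p^i), and hypothetical helper names.

```python
def gf2_mod(dividend: int, divisor: int) -> int:
    """Remainder of GF(2) polynomial division (bit i = coefficient of p^i)."""
    d = divisor.bit_length()
    while dividend.bit_length() >= d:
        dividend ^= divisor << (dividend.bit_length() - d)
    return dividend


def to_bits(poly: int, n: int) -> list:
    """Coefficient list of length n, highest power first."""
    return [(poly >> j) & 1 for j in range(n - 1, -1, -1)]


def systematic_matrices(g: int, n: int):
    q = g.bit_length() - 1
    k = n - q
    # t-th row of G is p^(n-t) + R_t(p), with R_t(p) = rem[p^(n-t) / G(p)]
    G = [to_bits((1 << (n - t)) ^ gf2_mod(1 << (n - t), g), n) for t in range(1, k + 1)]
    P = [row[k:] for row in G]                       # k x q parity sub-matrix
    # H = [P^T : I_q]
    H = [[P[r][c] for r in range(k)] + [int(j == c) for j in range(q)] for c in range(q)]
    return G, H


G_mat, H_mat = systematic_matrices(0b1011, 7)        # assumed G(p) = p^3 + p + 1
for row in G_mat:
    print(row)
```

A quick sanity check is that every row of G is orthogonal (modulo 2) to every row of H.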

Encoding using an (n-k) Bit Shift Register:

The feedback switch is first closed and the output switch is connected to the message input. All the shift register stages are initialized to the zero state. The k message bits are shifted to the transmitter and, at the same time, into the register.

After the k message bits have been shifted in, the register contains the q check bits. The feedback switch is now opened and the output switch is connected to the check-bit position. With every further shift, the check bits are shifted to the transmitter.

The circuit performs the division operation and generates the remainder. The remainder is held in the shift register after all message bits have been shifted out.
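The behaviour of the shift-register encoder can be simulated in software. The sketch below is one common arrangement of such a division circuit, not necessarily the exact circuit in the notes' figure: the feedback bit is the incoming message bit added modulo 2 to the last stage, and the taps follow the coefficients g_0 ... g_(q-1). The (7,4) code with G(p) = p^3 + p + 1 and the function name are assumptions.

```python
def lfsr_encode(message_bits, g_coeffs):
    """Systematic cyclic encoding with a q-stage feedback shift register.

    message_bits: message, highest-order bit first.
    g_coeffs: [g0, g1, ..., g_(q-1)]; the leading coefficient g_q = 1 is implicit.
    """
    q = len(g_coeffs)
    reg = [0] * q                          # all stages start in the zero state
    out = []
    for m in message_bits:                 # feedback switch closed
        feedback = m ^ reg[q - 1]
        for i in range(q - 1, 0, -1):
            reg[i] = reg[i - 1] ^ (g_coeffs[i] & feedback)
        reg[0] = g_coeffs[0] & feedback
        out.append(m)                      # message bit goes straight to the output
    out.extend(reversed(reg))              # feedback switch open: shift out c_(q-1) ... c_0
    return out


# Assumed G(p) = p^3 + p + 1, so g0 = 1, g1 = 1, g2 = 0
print(lfsr_encode([1, 0, 0, 1], [1, 1, 0]))   # [1, 0, 0, 1, 1, 1, 0]
```

The three bits left in the register after the four message shifts are the remainder of p^q M(p) divided by G(p), which is what the circuit description above states.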

Syndrome Decoding, Error Detection and Error Correction:

Errors may also occur during the transmission of cyclic codes. Syndrome decoding can be used to correct those errors.

Let Y denote the received code vector.

If E represents the error vector, then the correct code vector can be obtained as

X = Y + E or Y = X + E

The two forms are equivalent because the addition is modulo 2.

In polynomial form the above equation can be written as

$Y(p) = X(p) + E(p)$

$X(p) = M(p)G(p)$

$Y(p) = M(p)G(p) + E(p)$

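Dividing the received polynomial by the generator polynomial gives

$\dfrac{Y(p)}{G(p)} = \text{Quotient} + \dfrac{\text{Remainder}}{G(p)}$

If Y(p) = X(p), i.e. no error has occurred, the remainder is zero, because X(p) = M(p)G(p) is a multiple of G(p).

In general,

$\dfrac{Y(p)}{G(p)} = Q(p) + \dfrac{R(p)}{G(p)}$

$Y(p) = Q(p)G(p) + R(p)$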

Clearly R(p) is a polynomial of degree less than or equal to q-1.

$Y(p) = Q(p)G(p) + R(p)$

$M(p)G(p) + E(p) = Q(p)G(p) + R(p)$

$E(p) = M(p)G(p) + Q(p)G(p) + R(p)$

$E(p) = [M(p) + Q(p)]G(p) + R(p)$

This equation shows that, for a fixed message vector and generator polynomial, the error pattern (error vector) E depends on the remainder R.

For every remainder R there is a specific error vector. The remainder vector R can therefore be called the syndrome vector S, i.e. R(p) = S(p). Therefore

$\dfrac{Y(p)}{G(p)} = Q(p) + \dfrac{S(p)}{G(p)}$

Thus the syndrome vector is obtained by dividing the received vector Y(p) by G(p), i.e.,

$S(p) = \mathrm{rem}\left[\dfrac{Y(p)}{G(p)}\right]$
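The statement that the syndrome depends only on the error pattern can be illustrated with a small sketch. It assumes a (7,4) code with G(p) = p^3 + p + 1, the code word 1001110 and a single error in the p^2 position; polynomials are stored as integers (bit i = coefficient of p^i) and `gf2_mod` is a hypothetical helper name.

```python
def gf2_mod(dividend: int, divisor: int) -> int:
    """Remainder of GF(2) polynomial division (bit i = coefficient of p^i)."""
    d = divisor.bit_length()
    while dividend.bit_length() >= d:
        dividend ^= divisor << (dividend.bit_length() - d)
    return dividend


G = 0b1011            # assumed G(p) = p^3 + p + 1
X = 0b1001110         # a valid code word, X(p) = M(p)G(p)
E = 0b0000100         # assumed error pattern: one error in the p^2 position
Y = X ^ E             # received vector Y = X + E (modulo 2)

print(gf2_mod(X, G))  # 0: an error-free code word leaves no remainder
print(gf2_mod(Y, G))  # 4 (binary 100): the syndrome S(p) = rem[Y(p)/G(p)]
print(gf2_mod(E, G))  # 4: the same value, since X(p) contributes nothing to the remainder
```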

Block Diagram of Syndrome Calculator:

The syndrome calculator uses a q-stage shift register to generate the q-bit syndrome vector. Initially all the shift register contents are zero and the switch is closed in position 1.

The received vector Y is shifted bit by bit into the shift register. The contents of the flip-flops keep changing according to the input bits of Y and the values of g1, g2, etc.

After all the bits of Y have been shifted in, the q flip-flops of the shift register contain the q-bit syndrome vector. The switch is then closed to position 2 and clocks are applied to the shift register. The output is the syndrome vector $S = (s_{q-1}, s_{q-2}, \ldots, s_1, s_0)$.
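A software model of the syndrome calculator might look as follows. This is a sketch of one standard division register in which the received bits enter at the first stage and the last stage is fed back through the taps g_0 ... g_(q-1); the actual wiring in the notes' block diagram may differ. The (7,4) code with G(p) = p^3 + p + 1 and the function name are assumptions.

```python
def syndrome_register(received_bits, g_coeffs):
    """Shift the received vector (highest-order bit first) through a q-stage register.

    g_coeffs: [g0, g1, ..., g_(q-1)]; the leading coefficient g_q = 1 is implicit.
    Returns the syndrome as (s_(q-1), ..., s_1, s_0).
    """
    q = len(g_coeffs)
    reg = [0] * q                              # register contents start at zero
    for y in received_bits:                    # switch in position 1
        feedback = reg[q - 1]
        for i in range(q - 1, 0, -1):
            reg[i] = reg[i - 1] ^ (g_coeffs[i] & feedback)
        reg[0] = y ^ (g_coeffs[0] & feedback)
    return list(reversed(reg))                 # switch in position 2: read out the syndrome


g = [1, 1, 0]                                          # assumed G(p) = p^3 + p + 1
print(syndrome_register([1, 0, 0, 1, 1, 1, 0], g))     # [0, 0, 0] for a valid code word
print(syndrome_register([1, 0, 0, 1, 0, 1, 0], g))     # [1, 0, 0] with an error in the p^2 position
```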

Decoder of Cyclic Codes:

Once the syndrome is calculated, the error pattern corresponding to that particular syndrome is determined. Adding this error vector to the received code vector Y gives the corrected code vector at the output.

The switch Sout is opened and Sin is closed. The bits of the received vector Y are shifted into the buffer register and, at the same time, into the syndrome calculator. When all n bits of the received vector Y have been shifted into the buffer register and the syndrome calculator, the syndrome register holds the syndrome vector.

The syndrome vector is passed to the error pattern detector. Each syndrome identifies a specific error pattern.

Sin is then opened and Sout is closed. Shifts are applied to the flip-flops of the buffer register, the error register and the syndrome register.

The error pattern is added bit by bit to the received vector. The output is the corrected, error-free vector.
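The decoding procedure (compute the syndrome, look up the matching error pattern, add it to the received vector) can be sketched with a lookup table in place of the error pattern detector. The example assumes a (7,4) code with G(p) = p^3 + p + 1 and corrects single-bit errors only; the helper names and the bit-mask representation (bit i = coefficient of p^i) are illustrative choices.

```python
def gf2_mod(dividend: int, divisor: int) -> int:
    """Remainder of GF(2) polynomial division (bit i = coefficient of p^i)."""
    d = divisor.bit_length()
    while dividend.bit_length() >= d:
        dividend ^= divisor << (dividend.bit_length() - d)
    return dividend


def build_error_table(g: int, n: int) -> dict:
    """Map the syndrome of each single-bit error pattern to that pattern."""
    return {gf2_mod(1 << i, g): 1 << i for i in range(n)}


def correct(y: int, g: int, table: dict) -> int:
    """Add the error pattern selected by the syndrome to the received vector."""
    s = gf2_mod(y, g)
    return y ^ table.get(s, 0)       # zero syndrome: nothing is changed


G = 0b1011                           # assumed G(p) = p^3 + p + 1
TABLE = build_error_table(G, 7)
received = 0b1001110 ^ 0b0010000     # code word 1001110 with an error in the p^4 position
print(format(correct(received, G, TABLE), "07b"))   # 1001110
```

For this assumed code the seven single-bit error patterns all have distinct non-zero syndromes, which is why the table lookup is unambiguous. Error patterns with more than one bit in error are outside what this sketch handles.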

Convolutional codes

Decoding methods of convolutional codes:

1. Viterbi decoding

2. Sequential decoding

3. Feedback decoding

Viterbi algorithm for decoding of convolutional codes (maximum likelihood decoding):

Let y represent the received signal.

Convolutional encoding operates continuously on the input data.

Hence there are no code vectors or blocks as such.

Metric: the discrepancy between the received signal y and the decoded signal at a particular node. This metric can be added over the nodes of a particular path.

Surviving path: the path of the decoded signal with the minimum metric.

In Viterbi decoding a metric is assigned to each surviving path.

The metric of a particular path is obtained by adding the individual metrics on the nodes along that path.

y is decoded as the surviving path with the smallest metric.
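The ideas of metric and surviving path can be made concrete with a small hard-decision Viterbi decoder. The notes do not specify an encoder here, so this sketch assumes a rate-1/2 code with constraint length 3 and generators 111 and 101, an encoder that starts in the all-zero state, and the Hamming distance as the metric. All function names are hypothetical.

```python
def conv_encode(bits, gens=(0b111, 0b101), K=3):
    """Rate-1/len(gens) convolutional encoder; state holds the K-1 previous input bits."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state                      # newest bit in the top position
        out.extend(bin(reg & g).count("1") & 1 for g in gens)
        state = reg >> 1
    return out


def viterbi_decode(received, gens=(0b111, 0b101), K=3):
    """Return (decoded bits, metric) of the surviving path with the smallest metric."""
    n_out, n_states = len(gens), 1 << (K - 1)
    inf = float("inf")
    metrics = [0] + [inf] * (n_states - 1)                # encoder starts in state 0
    paths = [[] for _ in range(n_states)]
    for t in range(0, len(received), n_out):
        symbol = received[t:t + n_out]
        new_metrics = [inf] * n_states
        new_paths = [None] * n_states
        for state in range(n_states):
            if metrics[state] == inf:
                continue                                  # state not reachable yet
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                expected = [bin(reg & g).count("1") & 1 for g in gens]
                # branch metric: discrepancy between received and expected bits
                branch = sum(e != r for e, r in zip(expected, symbol))
                nxt = reg >> 1
                cand = metrics[state] + branch
                if cand < new_metrics[nxt]:               # keep only the survivor per state
                    new_metrics[nxt] = cand
                    new_paths[nxt] = paths[state] + [b]
        metrics, paths = new_metrics, new_paths
    best = min(range(n_states), key=lambda s: metrics[s])
    return paths[best], metrics[best]


message = [1, 0, 1, 1, 0, 0]          # two trailing zeros return the encoder to state 0
noisy = conv_encode(message)
noisy[3] ^= 1                         # introduce a single channel error
decoded, metric = viterbi_decode(noisy)
print(decoded, metric)
```

At each step only one path per state is kept, so the work grows linearly with the length of the received sequence rather than with the number of possible paths.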


