Information Theory and Coding
Information theory deals with the mathematical modelling and analysis of a communication system rather than with physical sources (camera, keyboard, microphone, etc.) and physical channels (wires, cables, fibre, satellite, radio, etc.).
Note: a lot of mathematics is involved in the manufacture of things we use in daily life, such as pens, paper, flash memory, mobile phones, antennas, garments, etc.
Information theory addresses and answers the following fundamental questions of communication theory:
• Data compression
• Data transmission
• Data storage
• Error detection and correction codes
• The ultimate data rate over a noisy channel (for reliable communication)
Every morning one reads the newspaper to receive information. A message is said to convey information if two key elements are present in it.

S.No  Message                                                        Remark
2     Gopal will get the Nobel Prize in physics                      Not a frequent event; it is a rare event, so it carries a lot of information
3     Scattered rain                                                 Rare event, so a lot of information
4     Cyclone storm                                                  Rare event, so a lot of information
5     A son is born to his wife, who is a mother of two daughters    Rare event, so a lot of information
6     COVID-19 pandemic disease                                      Rare event, so a lot of information

The messages in the table above have different probabilities of occurrence and hence contain different amounts of information.
Information sources can be classified as either memory or memoryless. A memoryless source is one for which each symbol produced is independent of the previous symbols. A memory source is one for which the current symbol depends on the previous symbols.
An information source is an object that produces an event. A practical source in a communication system is a device that produces messages. It can be either analog or digital (discrete). Analog sources can be transformed into discrete sources through the use of sampling and quantization techniques.
A discrete information source is a source that has only a finite set of symbols as possible outputs.
Let the source X have the alphabet {x1, x2, …, xm}. Note that the set of source symbols is called the source alphabet.
A DMS (discrete memoryless source) is described by the list of symbols, the probability assignment to these symbols, and the rate at which the source generates these symbols. A discrete information source consists of a discrete set of letters or symbols. In general, any message emitted by the source consists of a string or sequence of symbols.
Example: a source generates a sequence of symbols drawn from a set called its alphabet; different information sources produce different types of messages.
Measure of information
The amount of information in a message depends only upon the uncertainty of the event. The amount of information received from the knowledge of occurrence of an event may be related to the probability of occurrence of that event. Some messages received from different sources contain more information than others. Information should be proportional to the uncertainty (doubtfulness) of an outcome.
According to Hartley, the information in a message is a logarithmic function of its probability. Let m1 be any message; then the information content of this message is given as:

$I(m_1) = \log_2 \frac{1}{p(m_1)}$
Information must be a non-negative quantity since each message contains some information. For a symbol $x_i$ with probability $P(x_i)$:

$I(x_i) = \log_2 \frac{1}{P(x_i)}$
Properties of I(xi)
i. $I(x_i) = 0$ if $p(x_i) = 1$
ii. $I(x_i) \ge 0$
iii. $I(x_1, x_2) = I(x_1) + I(x_2)$ if $x_1$ and $x_2$ are independent
iv. $I(x_1) > I(x_2)$ if $p(x_1) < p(x_2)$
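These properties are easy to check numerically. The short Python sketch below (the helper name self_information is my own, not from these notes) computes $I(x) = \log_2(1/p)$ and verifies properties i, iii and iv.

```python
import math

def self_information(p: float) -> float:
    """Information content I(x) = log2(1/p) in bits, for a symbol of probability p."""
    return math.log2(1.0 / p)

# Property i: a certain event carries no information
assert self_information(1.0) == 0.0

# Property iii: information of independent outcomes adds
p1, p2 = 0.5, 0.25
joint = p1 * p2                      # independence: p(x1, x2) = p(x1) * p(x2)
assert math.isclose(self_information(joint),
                    self_information(p1) + self_information(p2))

# Property iv: rarer events carry more information
assert self_information(0.1) > self_information(0.9)

print(self_information(0.5))   # 1.0 bit
print(self_information(0.25))  # 2.0 bits
```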
Q: A source generates one of 5 possible messages during each message interval. The probabilities of these messages are:
$p_1 = \frac{1}{2}, \; p_2 = \frac{1}{16}, \; p_3 = \frac{1}{8}, \; p_4 = \frac{1}{4}, \; p_5 = \frac{1}{16}$. Find the information content of each message.

$I(m_1) = \log_2 \frac{1}{p_1} = \log_2 2 = 1 \text{ bit}$    (information in message m1)
$I(m_2) = \log_2 \frac{1}{p_2} = \log_2 16 = 4 \text{ bits}$   (information in message m2)
$I(m_3) = \log_2 \frac{1}{p_3} = \log_2 8 = 3 \text{ bits}$    (information in message m3)
$I(m_4) = \log_2 \frac{1}{p_4} = \log_2 4 = 2 \text{ bits}$    (information in message m4)
$I(m_5) = \log_2 \frac{1}{p_5} = \log_2 16 = 4 \text{ bits}$   (information in message m5)
Q: In a binary PCM, if 0 occurs with probability 1/4 and 1 occurs with probability 3/4, then calculate the amount of information carried by each bit.

$I(\text{bit } 0) = \log_2 \frac{1}{P(x_1)} = \log_2 \frac{1}{1/4} = \log_2 4 = \log_2 2^2 = 2\log_2 2 = 2 \text{ bits}$

$I(\text{bit } 1) = \log_2 \frac{1}{P(x_2)} = \log_2 \frac{1}{3/4} = \log_2 \frac{4}{3} = \log_2 1.33 = \frac{\log_{10} 1.33}{\log_{10} 2} = \frac{0.125}{0.301} = 0.415 \text{ bits}$

Bit 0 has probability 1/4 and carries 2 bits of information.
Bit 1 has probability 3/4 and carries 0.415 bits of information.
Symbol = message. For two independent symbols $x_1$ and $x_2$, the joint probability is $p(x_1, x_2) = p(x_1)\,p(x_2)$, so

$I(x_1, x_2) = \log_2 \frac{1}{p(x_1, x_2)} = \log_2 \frac{1}{p(x_1)\,p(x_2)} = \log_2 \frac{1}{p(x_1)} + \log_2 \frac{1}{p(x_2)} = I(x_1) + I(x_2)$

Thus proved: information contained in independent outcomes adds. The information given by two independent messages is the sum of the information contained in the individual messages.
ENTROPY
$H(X) = \sum_{i=1}^{l} p(x_i)\, I(x_i) = \sum_{i=1}^{l} p(x_i) \log_2 \frac{1}{p(x_i)}$, where X = source and l = size of the alphabet.
Q: A sample space of 5 messages with probabilities is given as: P(s) = {0.25, 0.25, 0.25, 0.125, 0.125}. Find the entropy of the source.
$H(X) = \sum_{i=1}^{5} P(x_i)\log_2\frac{1}{P(x_i)} = P(x_1)\log_2\frac{1}{P(x_1)} + P(x_2)\log_2\frac{1}{P(x_2)} + P(x_3)\log_2\frac{1}{P(x_3)} + P(x_4)\log_2\frac{1}{P(x_4)} + P(x_5)\log_2\frac{1}{P(x_5)}$
$= 0.25\log_2\frac{1}{0.25} + 0.25\log_2\frac{1}{0.25} + 0.25\log_2\frac{1}{0.25} + 0.125\log_2\frac{1}{0.125} + 0.125\log_2\frac{1}{0.125}$
$= 3 \times 0.25 \times \log_2 4 + 2 \times 0.125 \times \log_2 8$
$= 3 \times 0.25 \times 2 + 2 \times 0.125 \times 3$
$= 2.25 \text{ bits/symbol}$
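The same result can be checked numerically; the sketch below is only a verification of the computation above (the helper name entropy is my own).

```python
import math

def entropy(probs):
    """Average information H(X) = sum p * log2(1/p), in bits/symbol."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy([0.25, 0.25, 0.25, 0.125, 0.125]))  # 2.25 bits/symbol
```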
Q: An unfair die with four faces has p(1) = 1/2, p(2) = 1/4, p(3) = p(4) = 1/8. Find the entropy H (answer = 7/4).
Q: A sample space p(s) = {0.5, 0.125, 0.125, 0.125, 0.125} is given. Find the entropy of the source.
Q: A sample space p(s) = {0.75, 0.0625, 0.0625, 0.0625, 0.0625} is given. Find the entropy of the source.
Q: Find the entropy of a binary source
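As a worked hint for this question (writing P(0) = p and P(1) = 1 - p, my own notation), the entropy of a binary source is

$H(X) = p\log_2\frac{1}{p} + (1-p)\log_2\frac{1}{1-p}$

which attains its maximum value of 1 bit/symbol when p = 1/2 (equally likely symbols).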
If X and Y are independent sources, then the joint probability $p(x, y) = p(x)\,p(y)$, so the mutual information is

$I(X;Y) = \sum_{x,y} p(x,y) \log_2 \frac{p(x,y)}{p(x)\,p(y)} = \sum_{x,y} p(x,y) \log_2 \frac{p(x)\,p(y)}{p(x)\,p(y)} = \sum_{x,y} p(x,y) \log_2 1 = 0$
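A quick numerical sketch of this fact, assuming a small hypothetical pair of independent distributions (the probability values below are only illustrative):

```python
import math

# Hypothetical independent marginals
p_x = {0: 0.3, 1: 0.7}
p_y = {0: 0.6, 1: 0.4}

# Joint distribution of independent sources: p(x, y) = p(x) * p(y)
p_xy = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}

# Mutual information I(X;Y) = sum p(x,y) * log2( p(x,y) / (p(x) p(y)) )
mi = sum(p_xy[(x, y)] * math.log2(p_xy[(x, y)] / (p_x[x] * p_y[y]))
         for x in p_x for y in p_y)
print(mi)  # 0.0 for independent X and Y
```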
• The basic goal of a communication system is to transmit some information from source
to the destination.
• Message = Information
• Information consists of letters, digits, symbols, sequence of letters, digits, symbols etc.
• Information theory gives an idea about what can and cannot be achieved in a communication system.
A practical source in a communication system is a device that produces messages. It can be either analog or digital (discrete). So, there is a source and there is a destination, and messages are transferred from one point to the other.
We deal mainly with discrete sources, since analog sources can be transformed into discrete sources through the use of sampling and quantization techniques.
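A minimal sketch of this analog-to-discrete conversion, with illustrative numbers of my own (a 250 Hz tone, Nyquist sampling for a 1000 Hz bandwidth, 4 quantization levels):

```python
import math

f_tone = 250.0        # frequency of the example tone (Hz), illustrative
fs = 2 * 1000.0       # Nyquist sampling rate for a 1000 Hz bandwidth (samples/sec)
levels = 4            # number of quantization levels

def sample_and_quantize(num_samples=8):
    """Sample the tone at fs and map each sample to one of `levels` discrete symbols."""
    symbols = []
    step = 2.0 / levels                                  # quantizer step for values in [-1, 1]
    for n in range(num_samples):
        x = math.sin(2 * math.pi * f_tone * n / fs)      # sampled analog value
        q = min(int((x + 1.0) / step), levels - 1)       # discrete level 0..levels-1
        symbols.append(q)
    return symbols

print(sample_and_quantize())  # e.g. [2, 3, 3, 3, 2, 0, 0, 0]
```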
Coding theory
Coding is the most important application of information theory. The DMS output is converted into binary codewords; the device that performs this conversion is called the source encoder.
Various source coding techniques are Huffman coding, Shannon-Fano coding, Lempel-Ziv coding, PCM, DPCM, DM and adaptive DPCM (ADPCM).
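As a concrete illustration of source coding, the sketch below builds a Huffman prefix code in Python from a set of symbol probabilities (a simplified sketch of the general idea, not a procedure taken from these notes; the probabilities reuse the earlier five-message example).

```python
import heapq

def huffman_code(probs):
    """Build a binary Huffman code for a dict {symbol: probability}.

    Returns a dict {symbol: codeword string}. Lower-probability symbols
    end up with longer codewords.
    """
    # Each heap entry: (probability, tie-breaker, {symbol: partial codeword})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)   # two least probable subtrees
        p2, _, codes2 = heapq.heappop(heap)
        # Prefix '0' to one subtree's codewords and '1' to the other's
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

code = huffman_code({"m1": 0.5, "m2": 0.0625, "m3": 0.125, "m4": 0.25, "m5": 0.0625})
print(code)  # e.g. m1 -> '0', m4 -> '10', m3 -> '110', m2/m5 -> 4-bit codewords
```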
Assume that 1 bit of information is transmitted from source to destination. If bit 0 is transmitted, bit 0 must be received.
Transmitter Receiver Remark
0 0 Good
0 1 Error
1 0 Error
1 1 Good
An error-correcting code adds just the right kind of redundancy needed to detect and correct errors while still transmitting the data efficiently across a noisy channel.
Source coding: also known as entropy coding. Here the information generated by the source is compressed. Note that during compression no significant information is lost. Entropy sets the minimum average number of bits per symbol needed to represent the source. Source coding is done at the transmitter.
The source encoder transforms the information from the source into a stream of information bits while implementing data compression. Various source coding techniques are Huffman coding, Lempel-Ziv coding, PCM, DPCM, DM and adaptive DPCM (ADPCM).
Channel coding: also known as error-control coding. The encoder adds additional (redundant) bits to the actual information in order to detect and correct transmission errors. Channel encoding is done at the transmitter, and the corresponding decoding is done at the receiver. Various channel coding techniques are: block coding, convolutional coding, turbo coding, etc.
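A minimal sketch of the error-control idea, using the simple 3-repetition code as an illustration (chosen here only for brevity; it is not one of the techniques listed above): each information bit is transmitted three times and the decoder takes a majority vote, correcting any single bit error per codeword.

```python
def encode(bits):
    """Channel encoder: repeat every information bit three times."""
    return [b for b in bits for _ in range(3)]

def decode(received):
    """Channel decoder: majority vote over each group of three received bits."""
    out = []
    for i in range(0, len(received), 3):
        group = received[i:i + 3]
        out.append(1 if sum(group) >= 2 else 0)
    return out

info = [1, 0, 1, 1]
tx = encode(info)            # [1,1,1, 0,0,0, 1,1,1, 1,1,1]
rx = tx.copy()
rx[4] = 1                    # channel flips one bit in the second codeword
print(decode(rx) == info)    # True: the single error is corrected
```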
For a source with four symbols having probabilities {0.4, 0.3, 0.2, 0.1}, the entropy is

$H = 0.4\log_2\frac{1}{0.4} + 0.3\log_2\frac{1}{0.3} + 0.2\log_2\frac{1}{0.2} + 0.1\log_2\frac{1}{0.1} = 1.85 \text{ bits/symbol}$
Q: A source produces one of 4 possible symbols during each interval having probabilities:
$p_1 = \frac{1}{2}, \; p_2 = \frac{1}{4}, \; p_3 = p_4 = \frac{1}{8}$. Obtain the information content of each of these symbols.
Q: A source emits independent sequences of symbols from a source alphabet containing five
symbols with probabilities 0.4, 0.2, 0.2, 0.1 and 0.1. Compute the entropy of the source.
Substituting into the entropy formula, we get

$H = -[p_1\log_2 p_1 + p_2\log_2 p_2 + p_3\log_2 p_3 + p_4\log_2 p_4 + p_5\log_2 p_5]$
$= -[0.4\log_2 0.4 + 0.2\log_2 0.2 + 0.2\log_2 0.2 + 0.1\log_2 0.1 + 0.1\log_2 0.1]$
$H = 2.12 \text{ bits/symbol}$
Q: An analog signal is band-limited to 1000 Hz and sampled at the Nyquist rate. The samples are quantized into 4 levels, and each level represents one symbol. The probabilities of the 4 levels (symbols) are:
p(x1) = p(x4) = 1/8 and p(x2) = p(x3) = 3/8. Obtain the information rate of the source.
$\text{Entropy, } H(X) = \sum_{i=1}^{4} P(x_i)\log_2\frac{1}{P(x_i)} = P(x_1)\log_2\frac{1}{P(x_1)} + P(x_2)\log_2\frac{1}{P(x_2)} + P(x_3)\log_2\frac{1}{P(x_3)} + P(x_4)\log_2\frac{1}{P(x_4)}$

$= \frac{1}{8}\log_2 8 + \frac{3}{8}\log_2\frac{8}{3} + \frac{3}{8}\log_2\frac{8}{3} + \frac{1}{8}\log_2 8 \approx 1.8 \text{ bits/symbol}$
Information rate (R)
R = r H(X)
where R = information rate (bits/sec)
r = number of symbols generated by the source per second (symbols/sec)
H(X) = entropy or average information (bits/symbol)
fm = 1000 Hz (given)
Nyquist rate fs = 2fm = 2000 Hz, i.e. 2000 samples (symbols)/sec. This is nothing but the number of symbols generated by the source per second, which is r.
Therefore R = r H(X) = 2000 x 1.8 ≈ 3600 bits/sec.
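A quick numerical check of this information-rate example (the helper name entropy is my own):

```python
import math

def entropy(probs):
    """H(X) = sum p * log2(1/p), in bits/symbol."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

probs = [1/8, 3/8, 3/8, 1/8]      # the four quantization levels
H = entropy(probs)                # about 1.81 bits/symbol
r = 2 * 1000                      # Nyquist rate: 2 * fm symbols/sec
R = r * H                         # information rate in bits/sec
print(round(H, 2), round(R))      # 1.81  3623
```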
Every communication channel has a speed limit, measured in bits per second. This famous Shannon limit, the capacity of a communication channel, is given by the formula

$C = B \log_2\left(1 + \frac{S}{N}\right)$

where C is the channel capacity (bits/sec), B is the channel bandwidth (Hz) and S/N is the signal-to-noise ratio.

Bad news: reliable (error-free) communication is not possible at any rate above the channel capacity C, no matter how much coding is used.
Good news: below the Shannon limit, it is possible to transmit information with arbitrarily small error. Shannon mathematically proved that the use of suitable encoding techniques allows us to approach the maximum limit of channel capacity without errors, regardless of the amount of noise.
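As an illustration with assumed numbers (a 3 kHz channel with a 30 dB signal-to-noise ratio; these values are not from these notes), the capacity works out to roughly 30 kbps:

```python
import math

def channel_capacity(bandwidth_hz, snr_db):
    """Shannon capacity C = B * log2(1 + S/N), with S/N given in dB."""
    snr_linear = 10 ** (snr_db / 10)            # convert dB to a linear power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)

# Example: a 3 kHz channel with a 30 dB signal-to-noise ratio
print(round(channel_capacity(3000, 30)))  # about 29902 bits/sec
```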