
Digital Communication

Chapter 3
Information Theory and
Coding
Information Source
• Information refers to any new knowledge about something.

• The role of a communication-system designer is to make sure that the information is transmitted to the receiver correctly.

• An information source conducts random experiments producing some events.

• The outcome of each event is picked from the sample space (set of possible outcomes) of the experiment.

• The sample space is also called the source alphabet.

• Each element of the source alphabet is called a symbol and has some probability of occurrence.
Discrete Memoryless Source (DMS)
• A source is discrete if the number of symbols present in the alphabet is discrete/finite.
• A source is memoryless if the occurrence of one event is independent of the occurrence of previous events.
Ex 1: Binary symmetric source:
• It is described by the alphabet 𝓜 = {0, 1}. In this source, there are two symbols, 𝑋1 = 0 and 𝑋2 = 1.

• The probability of occurrence of 𝑋1 is 𝑃(𝑋1) and that of 𝑋2 is 𝑃(𝑋2), with

𝑃(𝑋1) + 𝑃(𝑋2) = 1
Ex 2: QPSK Transmitter
• It transmits one of the 4 symbols in each symbol duration.

• Here the source alphabet is 𝓜 = {𝑠1, 𝑠2, 𝑠3, 𝑠4}.

• Corresponding probabilities: 𝑃(𝑠1), 𝑃(𝑠2), 𝑃(𝑠3), 𝑃(𝑠4), with

𝑃(𝑠1) + 𝑃(𝑠2) + 𝑃(𝑠3) + 𝑃(𝑠4) = 1

Measure of Information
• Information is a measure of uncertainty of an outcome
• The lower the probability of occurrence of an outcome, the higher the information contained in it.
• Let a DMS have source alphabet 𝓜 = {𝑠1, 𝑠2, 𝑠3, …, 𝑠𝑚}
• The probability of the i-th symbol 𝑠𝑖 is 𝑃(𝑠𝑖)
• Then, the information contained in 𝑠𝑖 is

𝐼(𝑠𝑖) = log₂(1/𝑃(𝑠𝑖)) = −log₂ 𝑃(𝑠𝑖) bits
Properties of Information

• For 𝑃(𝑠𝑖) = 1 (a certain event), 𝐼(𝑠𝑖) = 0 (it contains no information)

• For 𝑃(𝑠𝑖) = 0 (a nonexistent event), 𝐼(𝑠𝑖) = ∞ (it contains infinite information)

• 0 < 𝑃(𝑠𝑖) < 1  →  ∞ > 𝐼(𝑠𝑖) > 0

• If 𝑃(𝑠𝑖) < 𝑃(𝑠𝑗), then 𝐼(𝑠𝑖) > 𝐼(𝑠𝑗) (the higher the probability, the lower the information)
Ex: A DMS has source alphabet 𝓜 = {𝑠1, 𝑠2, 𝑠3, 𝑠4} with 𝑃(𝑠1) = 0.5, 𝑃(𝑠2) = 0.25 and 𝑃(𝑠3) = 𝑃(𝑠4) = 0.125. Find the information contained in each symbol.

Ans:
• 𝑃(𝑠1) = 0.5, so 𝐼(𝑠1) = −log₂ 𝑃(𝑠1) = −log₂ 0.5 = 1 bit

• 𝑃(𝑠2) = 0.25, so 𝐼(𝑠2) = −log₂ 𝑃(𝑠2) = −log₂ 0.25 = 2 bits

• 𝑃(𝑠3) = 0.125, so 𝐼(𝑠3) = −log₂ 𝑃(𝑠3) = −log₂ 0.125 = 3 bits

• 𝑃(𝑠4) = 0.125, so 𝐼(𝑠4) = −log₂ 𝑃(𝑠4) = −log₂ 0.125 = 3 bits
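A minimal Python sketch (not part of the original slides) that reproduces these values directly from the definition 𝐼(𝑠𝑖) = −log₂ 𝑃(𝑠𝑖):

import math

# Symbol probabilities from the example above
probs = {"s1": 0.5, "s2": 0.25, "s3": 0.125, "s4": 0.125}

for symbol, p in probs.items():
    info = -math.log2(p)              # I(s_i) = -log2 P(s_i), in bits
    print(f"I({symbol}) = {info:g} bits")
# Prints 1, 2, 3 and 3 bits for s1, s2, s3, s4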


Average Information (Entropy)
• When an information source is transmitting a long sequence of symbols, the average information transmitted by the source is of more interest than the information contained in individual symbols.

• The average information is called the entropy of the source, given by:

𝐻(𝑆) = 𝐸[𝐼(𝑠𝑖)] = Σᵢ 𝑃(𝑠𝑖)𝐼(𝑠𝑖) = −Σᵢ 𝑃(𝑠𝑖) log₂ 𝑃(𝑠𝑖) bits/symbol   (sum over i = 1, …, m)
Ex: Find the entropy of the DMS of the previous problem.

Ans:

𝐻(𝑆) = −Σᵢ 𝑃(𝑠𝑖) log₂ 𝑃(𝑠𝑖) = Σᵢ 𝑃(𝑠𝑖)𝐼(𝑠𝑖)

= 0.5 × 1 + 0.25 × 2 + 0.125 × 3 + 0.125 × 3

= 1.75 bits/symbol

• The average information contained in one symbol transmitted by the source is 1.75 bits.

• Range of H(S): 0 ≤ 𝐻(𝑆) ≤ log₂ 𝑀, where 𝑀 is the number of symbols in the source alphabet.

For 𝑀 = 4, 𝐻(𝑆) ≤ 2 bits/symbol
Information Rate (R)
• If the source S is emitting r symbols per second, then the information rate of the source is given by

𝑅 = 𝑟𝐻(𝑆)   (Unit: symbols/sec × bits/symbol = bits/sec)

i.e., the source S is emitting R bits/sec.

Ex: Find the information rate of the previous problem if the source emits 1000 symbols/sec.

Ans:

Information rate: 𝑅 = 𝑟𝐻(𝑆) = 1000 × 1.75 = 1750 bits/sec = 1.75 kbps
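As a quick check, the entropy and information rate of this example can be computed with a short Python sketch (illustrative only; the symbol rate r = 1000 symbols/sec is taken from the example above):

import math

probs = [0.5, 0.25, 0.125, 0.125]     # P(s_i) of the example DMS
r = 1000                              # symbol rate, symbols/sec

H = -sum(p * math.log2(p) for p in probs if p > 0)   # H(S), bits/symbol
R = r * H                                            # R = r * H(S), bits/sec

print(f"H(S) = {H:.2f} bits/symbol")  # 1.75 bits/symbol
print(f"R    = {R / 1000:.2f} kbps")  # 1.75 kbps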


Entropy Function of a binary memoryless source:
• Let S be a binary memoryless source with two symbols 𝑠1 and 𝑠2, where

𝑃(𝑠1) = 𝑝 and 𝑃(𝑠2) = 1 − 𝑝

• 𝐻(𝑆) = −𝑝 log₂ 𝑝 − (1 − 𝑝) log₂(1 − 𝑝)

• H(S) attains its maximum value when the symbols are equiprobable. Setting

d𝐻(𝑆)/d𝑝 = 0   ⇒   log₂((1 − 𝑝)/𝑝) = 0   ⇒   (1 − 𝑝)/𝑝 = 1   ⇒   𝑝 = 1/2
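The maximization can be checked numerically. The sketch below (an illustration, not part of the slides) evaluates the binary entropy function on a grid of p values and locates its maximum:

import numpy as np

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), for 0 < p < 1."""
    p = np.asarray(p, dtype=float)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

p = np.linspace(0.01, 0.99, 99)       # grid of probabilities
H = binary_entropy(p)

print(f"maximum H = {H.max():.3f} bit at p = {p[np.argmax(H)]:.2f}")
# maximum H = 1.000 bit at p = 0.50, i.e. when both symbols are equiprobable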
Discrete Memoryless Channel (DMC)
• The channel is a path through which the information flows from
transmitter to receiver.

• In each signaling interval, the channel accepts an input symbol from the source alphabet (X).

• In response, it generates a symbol from the destination alphabet (Y).

• If the numbers of symbols present in X and Y are finite, it is called a discrete channel.

• If the channel output depends only on the present input of the channel, then the channel is memoryless.
Channel Transition Probability
• Each input-output path is represented by its channel transition probability.
• The transition probability between the j-th source symbol 𝑥𝑗 and the i-th destination symbol 𝑦𝑖 is the conditional probability 𝑃(𝑦𝑖|𝑥𝑗).
• It gives the probability of occurrence of 𝒚𝒊 when 𝒙𝒋 is definitely transmitted.

Channel Transition Matrix
The complete set of channel transition probabilities forms the channel transition matrix 𝑃(𝑌|𝑋), whose (j, i)-th element is 𝑃(𝑦𝑖|𝑥𝑗).
Binary Symmetric Channel (BSC)

Ex: If 𝑃(𝑦1|𝑥1) = 0.9 and 𝑃(𝑦2|𝑥2) = 0.8, find the channel transition matrix.

𝑃(𝑦1|𝑥1) = 0.9, i.e., when 𝑥1 is transmitted, 𝑦1 is received 90% of the time

𝑃(𝑦2|𝑥1) = 1 − 0.9 = 0.1

𝑃(𝑦2|𝑥2) = 0.8, i.e., when 𝑥2 is transmitted, 𝑦2 is received 80% of the time

𝑃(𝑦1|𝑥2) = 1 − 0.8 = 0.2

𝑃(𝑌|𝑋) = [𝑃(𝑦1|𝑥1)  𝑃(𝑦2|𝑥1); 𝑃(𝑦1|𝑥2)  𝑃(𝑦2|𝑥2)] = [0.9  0.1; 0.2  0.8]
• Ex: If 𝑃(𝑥1) = 0.6 and 𝑃(𝑥2) = 0.4, find the probabilities of the destination symbols for the previous channel.
• Ans:

𝑃(𝑋) = [𝑃(𝑥1)  𝑃(𝑥2)] = [0.6  0.4]

𝑃(𝑌) = 𝑃(𝑋)𝑃(𝑌|𝑋) = [0.6  0.4][0.9  0.1; 0.2  0.8]
     = [0.6 × 0.9 + 0.4 × 0.2   0.6 × 0.1 + 0.4 × 0.8]
     = [0.62  0.38]

So, 𝑃(𝑦1) = 0.62 and 𝑃(𝑦2) = 0.38
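The same result can be obtained as a one-line matrix product; a minimal NumPy sketch (illustrative only) with the numbers of this example:

import numpy as np

P_X   = np.array([0.6, 0.4])          # input probabilities [P(x1) P(x2)]
P_YgX = np.array([[0.9, 0.1],         # channel transition matrix P(Y|X),
                  [0.2, 0.8]])        # rows indexed by x_j, columns by y_i

P_Y = P_X @ P_YgX                     # P(Y) = P(X) P(Y|X)
print(P_Y)                            # [0.62 0.38] -> P(y1) = 0.62, P(y2) = 0.38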
Conditional Entropy
• Let the symbol 𝑦𝑗 be observed at the destination.

• The symbol may have been caused by the transmission of any symbol 𝑥1, 𝑥2, 𝑥3, …, 𝑥𝑚 from the input source alphabet.

• The conditional entropy 𝐻(𝑋|𝑦𝑗) is the uncertainty about the transmitted symbol 𝑋 after observing the output 𝑦𝑗:

𝐻(𝑋|𝑦𝑗) = −Σᵢ 𝑃(𝑥𝑖|𝑦𝑗) log₂ 𝑃(𝑥𝑖|𝑦𝑗)   (sum over i = 1, …, m)

• 𝐻(𝑋|𝑌) is the average uncertainty about the channel input after the channel output is observed:

𝐻(𝑋|𝑌) = Σⱼ 𝑃(𝑦𝑗) 𝐻(𝑋|𝑦𝑗) = −Σⱼ Σᵢ 𝑃(𝑥𝑖, 𝑦𝑗) log₂ 𝑃(𝑥𝑖|𝑦𝑗)   (j = 1, …, n; i = 1, …, m)
Mutual Information
• It is the difference between the uncertainty about the transmitted symbol 𝑋 before and after observing the channel output.

• 𝐻(𝑋): uncertainty about the channel input before the channel output is observed

• 𝐻(𝑋|𝑌): uncertainty about the channel input after the channel output is observed

𝐼(𝑋; 𝑌) = 𝐻(𝑋) − 𝐻(𝑋|𝑌)

• It is the amount of uncertainty about the transmitted symbol 𝑿 that is resolved by observing the received symbol 𝒀.
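For the channel example above (P(x1) = 0.6, P(x2) = 0.4), the mutual information can be evaluated numerically. The following Python sketch is illustrative only and follows I(X;Y) = H(X) − H(X|Y):

import numpy as np

P_X   = np.array([0.6, 0.4])              # P(x_j)
P_YgX = np.array([[0.9, 0.1],             # P(y_i | x_j)
                  [0.2, 0.8]])

P_XY  = P_X[:, None] * P_YgX              # joint probabilities P(x_j, y_i)
P_Y   = P_XY.sum(axis=0)                  # P(y_i)
P_XgY = P_XY / P_Y                        # P(x_j | y_i)

H_X   = -np.sum(P_X * np.log2(P_X))       # H(X): uncertainty before observing Y
H_XgY = -np.sum(P_XY * np.log2(P_XgY))    # H(X|Y): uncertainty after observing Y
I_XY  = H_X - H_XgY                       # mutual information I(X;Y)

print(f"H(X)   = {H_X:.3f} bits")         # ≈ 0.971
print(f"H(X|Y) = {H_XgY:.3f} bits")       # ≈ 0.583
print(f"I(X;Y) = {I_XY:.3f} bits")        # ≈ 0.388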
Channel Capacity
• It is defined as the maximum value of the mutual information, taken over all input probability distributions 𝑝(𝑋):

𝐶𝑠 = max over 𝑝(𝑋) of 𝐼(𝑋; 𝑌)   bits/symbol

• Channel capacity per second:

𝐶 = 𝑟𝐶𝑠 bits/sec

• Channel-coding theorem

Communication over a discrete memoryless channel with arbitrarily small probability of error is possible provided the information rate 𝑅 satisfies

𝑅 ≤ 𝐶
Shannon’s formula for channel capacity of an AWGN channel
• Ex: Consider an AWGN channel with 4 kHz bandwidth and noise power spectral density 𝜂/2 = 10⁻¹² Watt/Hz. If the signal power at the receiver is 0.1 mW, find the maximum rate at which information can be transmitted with arbitrarily small probability of error.

• Ans:
Signal power: 𝑆 = 0.1 × 10⁻³ = 10⁻⁴ W
Channel bandwidth: 𝐵 = 4000 Hz
Noise power: 𝑁 = 𝜂𝐵 = 2 × 10⁻¹² × 4000 = 8 × 10⁻⁹ W
SNR: 𝑆/𝑁 = 10⁻⁴ / (8 × 10⁻⁹) = 1.25 × 10⁴
Channel capacity: 𝐶 = 𝐵 log₂(1 + 𝑆/𝑁) = 4000 × log₂(1 + 1.25 × 10⁴) ≈ 54.44 kbps
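The same calculation in a short Python sketch (illustrative only, using the numbers of this example):

import math

S   = 0.1e-3            # received signal power, W
eta = 2e-12             # noise PSD eta, W/Hz  (eta/2 = 1e-12 W/Hz)
B   = 4000              # channel bandwidth, Hz

N   = eta * B                       # noise power, W
C   = B * math.log2(1 + S / N)      # Shannon capacity, bits/sec

print(f"SNR = {S / N:.3g}")         # 1.25e+04
print(f"C   = {C / 1000:.2f} kbps") # ≈ 54.44 kbps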
Bandwidth-SNR Tradeoff

𝐶 = 𝐵 log₂(1 + 𝑆/𝑁) = 𝐵 log₂(1 + 𝑆/(𝜂𝐵))

  = (𝑆/𝜂)(𝜂𝐵/𝑆) log₂(1 + 𝑆/(𝜂𝐵)) = (𝑆/𝜂) log₂[(1 + 𝑆/(𝜂𝐵))^(𝜂𝐵/𝑆)]

Channel capacity of a channel with infinite bandwidth:

𝐶∞ = lim (𝐵→∞) 𝐶 = (𝑆/𝜂) lim (𝐵→∞) log₂[(1 + 𝑆/(𝜂𝐵))^(𝜂𝐵/𝑆)] = (𝑆/𝜂) log₂ 𝑒 ≈ 1.44 𝑆/𝜂
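This limit can be checked numerically; the short sketch below (illustrative only, with S = η = 1) shows C/(S/η) approaching log₂ e ≈ 1.44 as the bandwidth grows:

import numpy as np

S, eta = 1.0, 1.0                     # any positive values give the same limit
B = np.logspace(0, 8, 9)              # bandwidths from 1 Hz to 1e8 Hz

C = B * np.log2(1 + S / (eta * B))    # capacity for each bandwidth
print(C / (S / eta))                  # tends to log2(e) ≈ 1.4427 as B -> infinity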
Forward Error Correction (FEC) Code
• Some redundant data is added to the original information at the transmitting end.
• It is used at the receiver to detect and correct errors in the received information.
• The receiver does not need to ask the transmitter for any additional information.
• Codeword: a unit of bits that can be decoded independently.
• The number of bits in a codeword is the code length.
• If k data digits are transmitted as a codeword of n digits, the number of redundant (check) digits is

𝑚 = 𝑛 − 𝑘

• Code rate: 𝑅 = 𝑘/𝑛

• It is also called an (𝒏, 𝒌) code.

• So, the data vector (𝑘-dimensional):

𝐝 = (𝑑1, 𝑑2, 𝑑3, … 𝑑𝑘)

• And the code vector (𝑛-dimensional):

𝐜 = (𝑐1, 𝑐2, 𝑐3, … 𝑐𝑛)
Types of FEC codes

1. Block Codes
• A block of k data bits is accumulated and then coded into n bits (n > k).
• A unique sequence of k bits generates a unique codeword that does not depend on previous values.

2. Convolutional Codes
• The coded sequence of n bits depends on the k data digits as well as on the previous N data digits.
• The coded data is not unique; coding is done on a continuously running basis rather than on blocks of k data digits.
Linear Block Codes
• All the n-digit codewords are formed by linear combinations of the k data digits.
• Systematic code: the leading k digits of the codeword are the data/information digits and the remaining 𝑚 = 𝑛 − 𝑘 digits are parity-check digits.
• The parity-check digits are formed by linear combinations of the data digits 𝑑1, 𝑑2, 𝑑3, … 𝑑𝑘.
• Hence, the code:
𝑐1 = 𝑑1
𝑐2 = 𝑑2
⋮
𝑐𝑘 = 𝑑𝑘
𝑐𝑘+1 = ℎ11𝑑1 + ℎ12𝑑2 + ℎ13𝑑3 + ⋯ + ℎ1𝑘𝑑𝑘
𝑐𝑘+2 = ℎ21𝑑1 + ℎ22𝑑2 + ℎ23𝑑3 + ⋯ + ℎ2𝑘𝑑𝑘
⋮
𝑐𝑛 = 𝑐𝑘+𝑚 = ℎ𝑚1𝑑1 + ℎ𝑚2𝑑2 + ℎ𝑚3𝑑3 + ⋯ + ℎ𝑚𝑘𝑑𝑘
• So, 𝐜 = 𝐝𝐆

• 𝐆: generator matrix (𝑘 × 𝑛)

𝐆 = [ 1 0 … 0   ℎ11 ℎ21 … ℎ𝑚1 ]
    [ 0 1 … 0   ℎ12 ℎ22 … ℎ𝑚2 ]
    [ ⋮          ⋮              ]
    [ 0 0 … 1   ℎ1𝑘 ℎ2𝑘 … ℎ𝑚𝑘 ]

• 𝐆 can be partitioned into a (𝑘 × 𝑘) identity matrix 𝐈𝑘 and a (𝑘 × 𝑚) matrix 𝐏.

• The elements of 𝐏 are either 1 or 0.

𝐜 = 𝐝𝐆 = 𝐝[𝐈𝑘  𝐏] = [𝐝  𝐝𝐏]
Example: (6,3) block code

In the calculations, use the modulo-2 adder:

1+1=0
1+0=1
0+1=1
0+0=0
Find all codewords for all possible data words.
How many bit errors can the code detect and correct?

Data word    Code word
111          111000
110          110110
101          101011
100          100101
011          011101
010          010011
001          001110
000          000000

Minimum Hamming distance: ℎ𝑚𝑖𝑛 = 3

Number of correctable errors: 𝑡 = ⌊(ℎ𝑚𝑖𝑛 − 1)/2⌋ = 1
Number of detectable errors: ℎ𝑚𝑖𝑛 − 1 = 2

Hence, the code can correct any single-bit error and detect up to two bit errors.
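The table can be reproduced with modulo-2 matrix arithmetic. The Python sketch below is illustrative; the parity matrix P is not shown on the slide, so the one used here is an assumption inferred from the codeword table (it regenerates exactly these codewords):

import numpy as np
from itertools import product

P = np.array([[1, 0, 1],                      # inferred parity matrix (assumption)
              [0, 1, 1],
              [1, 1, 0]])
G = np.hstack([np.eye(3, dtype=int), P])      # G = [I_k  P]

for d in product([1, 0], repeat=3):           # all 3-bit data words
    c = (np.array(d) @ G) % 2                 # c = dG, modulo-2 arithmetic
    print("".join(map(str, d)), "->", "".join(map(str, c)))

# The minimum Hamming distance of a linear code equals the minimum
# weight of a non-zero codeword, which is 3 for this code.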
Cyclic Codes:
• It is a subclass of linear block codes.
• It is capable of correcting more than one error in the codeword.
• Its implementation is simple using shift registers.
• The codewords are cyclic shifts of one another.
• Ex: If 𝒄 = (𝑐1, 𝑐2, 𝑐3, …, 𝑐𝑛−1, 𝑐𝑛) is a codeword, then (𝑐2, 𝑐3, …, 𝑐𝑛−1, 𝑐𝑛, 𝑐1), (𝑐3, 𝑐4, …, 𝑐𝑛−1, 𝑐𝑛, 𝑐1, 𝑐2), and so on are also codewords.

• Let 𝒄^(𝑖) denote 𝒄 shifted cyclically 𝑖 places to the left:

𝒄^(𝑖) = (𝑐𝑖+1, 𝑐𝑖+2, 𝑐𝑖+3, …, 𝑐𝑛, 𝑐1, 𝑐2, … 𝑐𝑖)


Polynomial representation of cyclic codes:
• Cyclic codes can be represented in polynomial form.

• For 𝒄 = (𝑐1, 𝑐2, 𝑐3, …, 𝑐𝑛−1, 𝑐𝑛), the polynomial is:

𝒄(𝑥) = 𝑐1 𝑥^(𝑛−1) + 𝑐2 𝑥^(𝑛−2) + 𝑐3 𝑥^(𝑛−3) + ⋯ + 𝑐𝑛−1 𝑥 + 𝑐𝑛

• For 𝒄^(𝑖) = (𝑐𝑖+1, 𝑐𝑖+2, 𝑐𝑖+3, …, 𝑐𝑛, 𝑐1, 𝑐2, … 𝑐𝑖), the polynomial is:

𝒄^(𝑖)(𝑥) = 𝑐𝑖+1 𝑥^(𝑛−1) + 𝑐𝑖+2 𝑥^(𝑛−2) + 𝑐𝑖+3 𝑥^(𝑛−3) + ⋯ + 𝑐𝑖−1 𝑥 + 𝑐𝑖

• Coefficients of the polynomials are either 1 or 0.
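A short Python sketch (illustrative only, using an arbitrary length-7 coefficient vector) verifying that multiplying 𝒄(𝑥) by 𝑥 modulo (𝑥⁷ + 1) over GF(2) gives exactly the coefficients of the cyclic left shift 𝒄^(1):

def poly_mod_gf2(c, m):
    """Remainder of c(x) modulo m(x) over GF(2); coefficients highest degree first."""
    c = list(c)
    while len(c) >= len(m):
        if c[0] == 1:                                  # leading term present: subtract m(x)
            c = [a ^ b for a, b in zip(c, m)] + c[len(m):]
        c.pop(0)                                       # drop the (now zero) leading coefficient
    return c

n = 7
c = [1, 0, 1, 1, 0, 0, 0]                # c(x) = x^6 + x^4 + x^3
modulus = [1] + [0] * (n - 1) + [1]      # x^7 + 1

print(poly_mod_gf2(c + [0], modulus))    # x * c(x) mod (x^7 + 1) -> [0, 1, 1, 0, 0, 0, 1]
print(c[1:] + c[:1])                     # cyclic left shift c^(1)  -> same coefficient vector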


• Consider an (𝑛, 𝑘) cyclic code.

• Length of the codeword: 𝑛

• Length of the data word: 𝑘

• Data polynomial: 𝒅(𝑥) = 𝑑1 𝑥^(𝑘−1) + 𝑑2 𝑥^(𝑘−2) + 𝑑3 𝑥^(𝑘−3) + ⋯ + 𝑑𝑘−1 𝑥 + 𝑑𝑘

• Length of the redundant word: 𝑛 − 𝑘

• Generator polynomial: a polynomial 𝒈(𝑥) of degree (𝑛 − 𝑘) used to convert the data polynomial into the code polynomial:

𝒄(𝑥) = 𝒅(𝑥) 𝒈(𝑥)
• Example: Find a generator polynomial 𝒈(𝑥) for a (7,4) cyclic code. Find the codewords for the following data words: 1010, 1111, 0001, 1000.

• Ans:

𝑛 = 7, 𝑘 = 4, so 𝑛 − 𝑘 = 3

So, the generator polynomial 𝒈(𝑥) must be of degree 3.

(𝑥⁷ + 1) = (𝑥 + 1)(𝑥³ + 𝑥² + 1)(𝑥³ + 𝑥 + 1)

So, there are two possible generator polynomials: (𝑥³ + 𝑥² + 1) and (𝑥³ + 𝑥 + 1).

Let the generator polynomial be 𝒈(𝑥) = 𝑥³ + 𝑥² + 1.

Data: 1010. So 𝒅(𝑥) = 1·𝑥³ + 0·𝑥² + 1·𝑥 + 0 = 𝑥³ + 𝑥

Code polynomial: 𝒄(𝑥) = 𝒅(𝑥)𝒈(𝑥) = (𝑥³ + 𝑥)(𝑥³ + 𝑥² + 1) = 𝑥⁶ + 𝑥⁵ + 𝑥⁴ + 𝑥
(the two 𝑥³ terms cancel under modulo-2 addition)

So, codeword: 1110010

Similarly, for d = 1111, c = 1001011
d = 0001, c = 0001101
d = 1000, c = 1101000
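These codewords can be reproduced by multiplying the data and generator polynomials over GF(2). The sketch below is illustrative only (gf2_polymul is a hypothetical helper name); coefficients are listed highest degree first:

import numpy as np

def gf2_polymul(a, b):
    """Multiply two binary polynomials; coefficients highest degree first."""
    return np.convolve(a, b) % 2          # modulo-2 reduction of the coefficients

g = [1, 1, 0, 1]                          # g(x) = x^3 + x^2 + 1
for d in ([1, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 1], [1, 0, 0, 0]):
    c = gf2_polymul(d, g)                 # c(x) = d(x) g(x)
    print("".join(map(str, d)), "->", "".join(map(str, c)))
# 1010 -> 1110010, 1111 -> 1001011, 0001 -> 0001101, 1000 -> 1101000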
