
Continuous distributions

• Continuous distributions
• Differential entropy
• Relative entropy
• Conditional differential entropy
• Mutual information
• Information capacity law
• Rate distortion theory

Channel Coding

Differential entropy

• For a continuous random variable 𝑋 with probability density function 𝑓_X(𝑥),

  ℎ(𝑋) = −∫_{−∞}^{∞} 𝑓_X(𝑥) log₂ 𝑓_X(𝑥) d𝑥

  is called the differential entropy of 𝑋

• ℎ(𝑋) is a useful mathematical quantity, but it does not measure the randomness of 𝑋
Why?

First, consider the continuous random variable 𝑋 as the limiting form of a discrete random variable that assumes the values 𝑥_k = 𝑘Δ𝑥 for 𝑘 = 0, ±1, ±2, … with associated probabilities 𝑓_X(𝑥_k)Δ𝑥. Then, in the limit Δ𝑥 → 0,

𝐻(𝑋) = ℎ(𝑋) − lim_{Δ𝑥→0} log₂ Δ𝑥

and since −log₂ Δ𝑥 → ∞ as Δ𝑥 → 0, the absolute entropy of a continuous random variable is infinitely large.

• To avoid the problems associated with −log₂ Δ𝑥, only the differential entropy ℎ(𝑋) is adopted, with log₂ Δ𝑥 taken as a reference.
• This is a convenient choice: when we are interested in the information transmitted over a channel, we take the difference between two entropy terms, so the reference terms cancel each other out.
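For reference, the limiting argument can be written out in full; this is the standard derivation and is consistent with the definition of ℎ(𝑋) above:

  H(X) = \lim_{\Delta x \to 0} \Big[ -\sum_{k=-\infty}^{\infty} f_X(x_k)\,\Delta x \,\log_2\!\big( f_X(x_k)\,\Delta x \big) \Big]
       = -\int_{-\infty}^{\infty} f_X(x)\,\log_2 f_X(x)\,\mathrm{d}x \; - \; \lim_{\Delta x \to 0} \log_2 \Delta x
       = h(X) - \lim_{\Delta x \to 0} \log_2 \Delta x

where the first term follows because the Riemann sum converges to the integral, and the second because \sum_k f_X(x_k)\,\Delta x \to 1.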
Differential entropy – example: uniform distribution
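The worked example itself did not survive extraction; a standard version of it, assuming 𝑋 is uniformly distributed on the interval (0, 𝑎):

  f_X(x) = \frac{1}{a}, \quad 0 < x < a \quad (\text{and } 0 \text{ otherwise}),
  \qquad
  h(X) = -\int_0^a \frac{1}{a}\,\log_2\!\frac{1}{a}\,\mathrm{d}x = \log_2 a .

Note that ℎ(𝑋) < 0 whenever 𝑎 < 1, unlike discrete entropy, which again shows that differential entropy is not an absolute measure of randomness.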

Relative entropy


ℎ(𝑌) ≤ ∫_{−∞}^{∞} 𝑓_Y(𝑥) log₂[1/𝑓_X(𝑥)] d𝑥
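The definition and the intermediate steps are missing from the extracted slide; assuming it follows the usual argument, the inequality comes from the non-negativity of the relative entropy (Kullback–Leibler divergence):

  D(f_Y \,\|\, f_X) = \int_{-\infty}^{\infty} f_Y(x)\,\log_2\!\frac{f_Y(x)}{f_X(x)}\,\mathrm{d}x \;\ge\; 0 .

Expanding the logarithm and rearranging gives

  h(Y) = \int_{-\infty}^{\infty} f_Y(x)\,\log_2\!\frac{1}{f_Y(x)}\,\mathrm{d}x \;\le\; \int_{-\infty}^{\infty} f_Y(x)\,\log_2\!\frac{1}{f_X(x)}\,\mathrm{d}x ,

with equality if and only if f_X = f_Y.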
Entropy of a Gaussian distribution
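The derivation on this slide did not survive extraction; the standard result it leads to (and which the capacity derivation below relies on) is:

  f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\!\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)
  \quad\Longrightarrow\quad
  h(X) = \frac{1}{2}\log_2\!\big( 2\pi e \sigma^2 \big) \ \text{bits} .

Moreover, among all distributions with a given variance 𝜎², the Gaussian has the largest differential entropy.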

Mutual information
Recall, for discrete random variables: 𝐼(𝑋;𝑌) = 𝐻(𝑋) − 𝐻(𝑋|𝑌), and the joint entropy 𝐻(𝑋,𝑌)

Now consider continuous random variables 𝑋 and 𝑌 with conditional distribution 𝑓_X(𝑥|𝑦), marginal distribution 𝑓_X(𝑥) and joint distribution 𝑓_{X,Y}(𝑥, 𝑦)

Conditional differential entropy

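The defining integrals did not survive extraction; the standard definitions they correspond to, in the notation used above, are:

  h(X \mid Y) = -\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f_{X,Y}(x,y)\,\log_2 f_X(x \mid y)\,\mathrm{d}x\,\mathrm{d}y

  I(X;Y) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f_{X,Y}(x,y)\,\log_2\!\frac{f_X(x \mid y)}{f_X(x)}\,\mathrm{d}x\,\mathrm{d}y
         = h(X) - h(X \mid Y) = h(Y) - h(Y \mid X)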

Information capacity law
For a band-limited, power-limited Gaussian channel

Power limited
Additive white Gaussian noise (AWGN): zero mean, band-limited to 𝐵 Hz
Noise power spectral density PSD = 𝑁₀/2 W/Hz
Noise samples 𝑁_k ~ 𝒩(0, 𝜎²), with 𝜎² = 𝑁₀𝐵

A model of a discrete-time Gaussian channel

Consider a zero-mean stationary process 𝑋(𝑡) band-limited to 𝐵 Hz


• If the process is uniformly sampled, a minimum rate of 2𝐵 samples per second is required to avoid losing information
• Let 𝑋𝑘 , 𝑘 = 1, 2, 3, … , 𝐾 be the continuous random variables obtained by the uniform sampling of 𝑋(𝑡)
• If the 𝐾 samples are transmitted in 𝑇 seconds, then 𝐾 = 2𝐵𝑇
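A compact statement of the discrete-time model described above (standard form; the slide's own equation is not reproduced):

  Y_k = X_k + N_k, \qquad k = 1, 2, \ldots, K,
  \qquad N_k \sim \mathcal{N}(0, \sigma^2),\ \ \sigma^2 = N_0 B,\ \ N_k \ \text{independent of}\ X_k,
  \qquad \mathbb{E}[X_k^2] \le P .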
Information capacity law
Information Capacity of a channel
• Maximum of mutual information between the channel input 𝑋𝑘 and channel output 𝑌𝑘 over all the distributions
of the input 𝑋𝑘 that satisfy the power constraint

where 𝐼(𝑋_k; 𝑌_k) = ℎ(𝑌_k) − ℎ(𝑌_k|𝑋_k),

and since 𝑌_k = 𝑋_k + 𝑁_k, with 𝑋_k and 𝑁_k independent, ℎ(𝑌_k|𝑋_k) = ℎ(𝑁_k) (the remaining uncertainty about 𝑌_k after 𝑋_k is observed),

leading to 𝐼(𝑋_k; 𝑌_k) = ℎ(𝑌_k) − ℎ(𝑁_k). The maximum is attained when 𝑋_k is Gaussian, since the Gaussian distribution gives the maximum differential entropy for a given power.
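Putting these pieces together with the Gaussian entropy result gives the standard per-sample capacity (the boxed equation on the slide is not reproduced verbatim):

  C = \max_{f_{X_k}:\ \mathbb{E}[X_k^2] \le P} I(X_k; Y_k)
    = \frac{1}{2}\log_2\!\big( 2\pi e (P + \sigma^2) \big) - \frac{1}{2}\log_2\!\big( 2\pi e \sigma^2 \big)
    = \frac{1}{2}\log_2\!\left( 1 + \frac{P}{\sigma^2} \right) \ \text{bits per channel use} .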

Information capacity law
Information capacity evaluation stages (3)

Recall that the process 𝑋(𝑡) is sampled 𝐾 = 2𝐵𝑇 times (considering the Nyquist rate 2𝐵 for a signal that is
band-limited by 𝐵 and the duration required to transmit the samples, 𝑇).

Now, to transmit all 𝐾 samples, we need 𝐾 channel uses. Replacing the noise variance by 𝜎² = 𝑁₀𝐵, the channel capacity in bits/second is

𝐶 = 2𝐵 × ½ log₂(1 + 𝑃/(𝑁₀𝐵)) = 𝐵 log₂(1 + 𝑃/(𝑁₀𝐵)) bits/s

Information capacity law
Information capacity law (fundamental limit)

The information capacity of a continuous channel of bandwidth 𝐵 Hz, perturbed by AWGN of power
spectral density 𝑁0 /2 (double sided) and limited in bandwidth to 𝐵, is given by the formula

𝐶 = 𝐵 log₂(1 + 𝑃/(𝑁₀𝐵)) bits per second

where 𝑃 is the average transmitted power

Therefore, the channel capacity is a function of the


- channel bandwidth, 𝐵 (linear dependence)
- average transmitted power, 𝑃 (logarithmic dependence)
- channel’s noise power spectral density, 𝑁0 /2 (logarithmic dependence)

Thus, expanding the bandwidth of a continuous communication channel increases the capacity
faster than increasing the average transmission power does.
Information capacity law
Information capacity law (fundamental limit)

From the capacity formula, 𝑃/(𝑁₀𝐵) is also called the signal-to-noise (power) ratio (SNR); therefore, the information capacity law can also be written as

𝐶 = 𝐵 log₂(1 + 𝑆𝑁𝑅) bits per second

Remember: to approach the limit set by the information capacity law, the statistical properties of the transmitted signal must approximate those of white Gaussian noise.
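As an illustration of the law, a short Python sketch (the bandwidth, power and noise-density values are hypothetical, chosen only to show the linear-versus-logarithmic dependence noted above):

import math

def capacity_bps(bandwidth_hz, power_w, n0_w_per_hz):
    """Information capacity C = B*log2(1 + P/(N0*B)) in bits per second."""
    snr = power_w / (n0_w_per_hz * bandwidth_hz)
    return bandwidth_hz * math.log2(1.0 + snr)

B, P, N0 = 1.0e6, 1.0e-3, 1.0e-10      # hypothetical: 1 MHz, 1 mW, N0 = 1e-10 W/Hz (SNR = 10)
print(capacity_bps(B, P, N0))          # baseline: ~3.46 Mbit/s
print(capacity_bps(2 * B, P, N0))      # doubling the bandwidth: ~5.17 Mbit/s
print(capacity_bps(B, 2 * P, N0))      # doubling the power: ~4.39 Mbit/s (smaller gain)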

Information capacity law
Example - Sphere packing
• Codeword length: 𝑛
• Average power per bit: 𝑃
• Total codeword power: 𝑛𝑃
• Noise variance per received bit: 𝜎²
• Total codeword variance: 𝑛𝜎²

• Therefore, it is highly probable that the received codeword vector lies inside a sphere of radius √(𝑛𝜎²) = 𝜎√𝑛, centered on the transmitted codeword.

• Total average power of the received codeword: 𝑛𝑃 + 𝑛𝜎²


• There is a high probability that all received codeword vectors fall within a sphere of radius √(𝑛(𝑃 + 𝜎²))

Information capacity law
Sphere packing

The volume of an 𝑛-dimensional sphere of radius 𝑟 is 𝐴_n 𝑟ⁿ, where 𝐴_n is a scaling factor.

Then, the maximum number of non-intersecting decoding spheres is

𝑀 = 𝐴_n [𝑛(𝑃 + 𝜎²)]^{𝑛/2} / 𝐴_n (𝑛𝜎²)^{𝑛/2} = (1 + 𝑃/𝜎²)^{𝑛/2}

In bits: log₂[(1 + 𝑃/𝜎²)^{𝑛/2}] = 𝑛 × ½ log₂(1 + 𝑃/𝜎²), for transmitting an 𝑛-bit channel code with a low probability of error.
Information capacity law – Implications
Bandwidth efficiency
Transmitted signal energy per bit, 𝐸𝑏
Noise power spectral density (single sided), 𝑁₀ (units: W/Hz, i.e. joules)
Average signal power 𝑃 = 𝐸𝑏 𝑅𝑏
For an ideal system, 𝑅𝑏 = 𝐶 bits/s

Therefore, for an ideal system:

𝐸_b/𝑁₀ = (2^{𝐶/𝐵} − 1) / (𝐶/𝐵)

For a general system:

𝐸_b/𝑁₀ = (2^{𝑅_b/𝐵} − 1) / (𝑅_b/𝐵)

A plot of 𝑅_b/𝐵 versus 𝐸_b/𝑁₀ is called a bandwidth efficiency diagram.
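A short Python sketch of the ideal-system expression, showing how 𝐸_b/𝑁₀ behaves as the spectral efficiency 𝐶/𝐵 shrinks (the chosen values are arbitrary illustrations):

import math

def ebn0_ideal(spectral_efficiency):
    """Eb/N0 (linear) for an ideal system with Rb = C:
    Eb/N0 = (2**(C/B) - 1) / (C/B)."""
    return (2.0 ** spectral_efficiency - 1.0) / spectral_efficiency

for rho in (8.0, 2.0, 1.0, 0.1, 0.001):             # rho = C/B in bit/s/Hz
    print(rho, 10.0 * math.log10(ebn0_ideal(rho)), "dB")

# As rho -> 0 the ratio approaches ln(2) ~ 0.693, i.e. about -1.59 dB
# (the Shannon limit that appears on the bandwidth efficiency diagram).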
Information capacity law – Implications
Bandwidth efficiency diagram
1. Infinite bandwidth
   a. Shannon limit
   b. Capacity
2. Capacity boundary
   The curve given by the critical bit rate, 𝑅_b = 𝐶
3. Potential trade-offs among 𝐸_b/𝑁₀, 𝑅_b/𝐵 and the probability of error, 𝑃_e
   • Increasing 𝑅_b/𝐵 increases 𝑃_e for a fixed 𝐸_b/𝑁₀
   • 𝑃_e can be reduced by increasing 𝐸_b/𝑁₀ for a fixed 𝑅_b/𝐵
Rate Distortion Theory
In many cases

• Lossy data compression is done at the source because the information source cannot be fully represented with the available alphabet, e.g.
   • for a continuous source that has to be quantized, and
   • for a discrete source coded at an average code length lower than its entropy
• Information transmission is done at a rate greater than the channel capacity

The branch of information theory that deals with such cases is called rate distortion
theory.

Rate Distortion Theory
Rate distortion function

• Consider a DMS with source alphabet {𝑥_i} and symbol probabilities 𝑝(𝑥_i)

• The source is coded at an average rate of 𝑅 bits/codeword, with the codewords drawn from a code alphabet {𝑦_j}
• If 𝑅 ≥ 𝐻 (where 𝐻 is the entropy of the source) then the code alphabet does not lead to loss of
information
• However, if 𝑅 < 𝐻, then there is unavoidable distortion and information loss
• Let 𝑝(𝑦_j|𝑥_i) denote the conditional probability that code symbol 𝑦_j is used to represent source symbol 𝑥_i

• Let 𝑑(𝑥_i, 𝑦_j) denote a measure of the cost incurred in representing the source symbol 𝑥_i by the symbol 𝑦_j

• 𝑑(𝑥_i, 𝑦_j) is called a single-letter distortion measure
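Two commonly used single-letter distortion measures, for illustration (the squared-error measure reappears in Assignment 2):

  d_H(x_i, y_j) = \begin{cases} 0, & y_j = x_i \\ 1, & y_j \ne x_i \end{cases} \quad \text{(Hamming distortion)},
  \qquad
  d(x_i, y_j) = (x_i - y_j)^2 \quad \text{(squared-error distortion)} .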

Rate Distortion Theory
Rate distortion function

• For the entire source alphabet, the non-negative average distortion 𝑑̄ (a continuous function of 𝑝(𝑦_j|𝑥_i)) is given by

𝑑̄ = Σ_i Σ_j 𝑝(𝑥_i) 𝑝(𝑦_j|𝑥_i) 𝑑(𝑥_i, 𝑦_j)

• The conditional probability assignment 𝑝(𝑦_j|𝑥_i) is said to be 𝐷-admissible if and only if 𝑑̄ ≤ 𝐷, where 𝐷 is some acceptable value of the average distortion.
• The set of all 𝐷-admissible conditional probability assignments is 𝑃_D = {𝑝(𝑦_j|𝑥_i) : 𝑑̄ ≤ 𝐷}
• The associated mutual information is 𝐼(𝑋; 𝑌) = Σ_i Σ_j 𝑝(𝑥_i) 𝑝(𝑦_j|𝑥_i) log₂[𝑝(𝑦_j|𝑥_i) / 𝑝(𝑦_j)]

Rate Distortion Theory
Rate distortion function

• A rate distortion function, 𝑅(𝐷), is the smallest coding rate possible for which the average
distortion is guaranteed to not exceed 𝐷
• Mathematically: 𝑅(𝐷) = min_{𝑝(𝑦_j|𝑥_i) ∈ 𝑃_D} 𝐼(𝑋; 𝑌), the minimum being taken over all 𝐷-admissible conditional probability assignments

• The units of 𝑅(𝐷) are bits if 𝐼(𝑋; 𝑌) is also measured in bits (by using log₂(·)).
• From the definition of 𝑅(𝐷): if a larger distortion 𝐷 can be tolerated, a smaller rate 𝑅 can be used for coding or transmission.
Rate Distortion Theory
Rate distortion theorem

• The rate-distortion function 𝑅(𝐷) gives the minimum achievable rate at distortion level 𝐷.

Summary diagram

Assignment 2 (Do in groups of 2)
Question 1: Rate distortion of Gaussian source (5 marks)

• Using the squared-error distortion 𝑑(𝑥, 𝑦) = (𝑥 − 𝑦)², show that 𝑅(𝐷) for a source 𝑋 that is normally distributed with zero mean and variance 𝜎² is given by

𝑅(𝐷) = ½ log₂(𝜎²/𝐷) for 0 < 𝐷 ≤ 𝜎², and 𝑅(𝐷) = 0 for 𝐷 > 𝜎²
Notes:
• First find the lower limit of the mutual information for a given average distortion, 𝐷
• Then show that there is a conditional density function that gives that lower bound of mutual information
• Proofs are available in online sources and other books. However, you are required to explain how each of the steps comes about.
Assignment 2 (Do in groups of 2)
Question 2: Rate distortion function and coding (5 marks)

a) Compare and contrast rate distortion theorem and source & channel coding theorems.
b) Describe any four applications of lossy compression.

Channel Coding
• Also called error control coding
• Used to overcome the effects of noise/interference in the channel
• Adds an amount of redundancy (in a known manner) onto the information prior to transmission
• At the receiver, the correct transmitted message is recovered if the errors are within a
correctable limit

Channel Coding
• Simplified digital communication system (source encoding and decoding not shown)

Channel Coding – Errors
Types of errors in communication systems

a) Single bit errors
b) Multi-bit errors

Error detection

a) Vertical redundancy check (VRC)
b) Checksum
c) Longitudinal redundancy check (LRC)
d) Cyclic redundancy check (CRC)

Error correction

a) Automatic repeat request (ARQ)
b) Forward error correction (FEC)
   • repetition codes,
   • linear block codes,
   • convolutional codes,
   • turbo codes, etc.
Channel Coding – Error detection
Vertical redundancy check (VRC)
• Parity bit appended to each character
• Number of parity bits is proportional to the number of characters in the message
• Parity can be odd or even

Checksum
• A character representing the numeric sum of all characters in the message
• Appended at the end of the message
• The receiver recomputes the checksum to determine errors

Longitudinal redundancy check (LRC)
• Parity bits appended to each message
• Also called 2D parity check
• Parity can be odd or even
• Receiver computes parity to detect errors

Cyclic redundancy check (CRC)
• Also called (𝑛, 𝑘) cyclic codes
• 𝑛 bits transmitted, 𝑘 bits for information
• A generator polynomial is used to generate the redundancy bits and to check for errors
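Many checksum variants exist; a minimal Python sketch of one simple scheme (sum of all bytes modulo 256, purely illustrative):

def checksum8(message: bytes) -> int:
    """8-bit checksum: numeric sum of all characters, modulo 256."""
    return sum(message) % 256

def append_checksum(message: bytes) -> bytes:
    # Transmitter: append the checksum character to the end of the message.
    return message + bytes([checksum8(message)])

def verify(frame: bytes) -> bool:
    # Receiver: recompute the checksum over the payload and compare.
    return checksum8(frame[:-1]) == frame[-1]

frame = append_checksum(b"HELLO")
print(verify(frame))                                  # True: no error
print(verify(bytes([frame[0] ^ 0x01]) + frame[1:]))   # False: single-bit error detected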
Channel Coding – Error detection
LRC illustration (transmitter)
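The original illustration is not reproduced here; a minimal Python sketch of the transmitter side, assuming even parity computed column-wise over 8-bit characters (equivalently, the XOR of all bytes):

def lrc_even(block: bytes) -> int:
    """Longitudinal redundancy check: even parity of every bit column,
    i.e. the XOR of all characters in the block."""
    lrc = 0
    for byte in block:
        lrc ^= byte
    return lrc

message = b"DATA"
frame = message + bytes([lrc_even(message)])   # transmitter appends the LRC character
print(frame.hex())                             # 4441544110 -> LRC byte is 0x10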

Channel Coding – Error detection
LRC illustration - receiver
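Correspondingly, a sketch of the receiver-side check under the same even-parity assumption:

def lrc_ok(frame: bytes) -> bool:
    """Receiver: the XOR of all characters including the appended LRC
    is zero exactly when every bit column still has even parity."""
    parity = 0
    for byte in frame:
        parity ^= byte
    return parity == 0

good = bytes.fromhex("4441544110")                 # "DATA" plus its LRC byte from the sketch above
bad = bytes([good[0] ^ 0b00000100]) + good[1:]     # flip one bit in the first character
print(lrc_ok(good))   # True: parity consistent
print(lrc_ok(bad))    # False: error detected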

Channel Coding – Error correction
ARQ
• If the message is received in error, the receiver requests the transmitter to resend the whole message or part of it
• There are three types of ARQ:
   • Stop-and-wait ARQ – send a packet and wait
   • Go-back-N ARQ – the packets in a window of size 𝑁 are retransmitted if some are lost
   • Selective ARQ – only the lost packets in a window are retransmitted

FEC
• Redundancy bits are added before transmission
• After detection of an error, the redundancy bits are used to identify the error location
• Typically used to correct single-bit errors, but can go up to about 3
• Reason: the number of redundancy bits required to detect multiple bit errors is large
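As a concrete FEC example (the simplest code on the list above), a Python sketch of a rate-1/3 repetition code with majority-vote decoding:

def rep3_encode(bits):
    """Repetition code: transmit every information bit three times."""
    return [b for bit in bits for b in (bit, bit, bit)]

def rep3_decode(received):
    """Majority vote over each group of three received bits;
    this corrects any single bit error within a group."""
    return [1 if sum(received[i:i + 3]) >= 2 else 0
            for i in range(0, len(received), 3)]

tx = rep3_encode([1, 0, 1])
rx = tx.copy()
rx[1] ^= 1                 # the channel flips one transmitted bit
print(rep3_decode(rx))     # [1, 0, 1]: the single error has been corrected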