Lecture 3 Channel Capacity
Wireless Channel Capacity
Some Channel Models
Binary Z-Channel
Binary Erasure Channel Binary Errors with Erasure Channel
Channel Capacity
Shannon’s measure of information is the number of bits per symbol to represent the
amount of uncertainty/information/randomness in a data source, and is defined as
entropy.
H(X) = E[−log2 p(x)] = −Σx∈X p(x) log2 p(x)
Example: Calculate the entropy H(X) of a binary source and plot it as a function of p(x = 1) = p0.
With p(x = 1) = p0 and p(x = 0) = 1 − p0:
H(X) = −(1 − p0) log2(1 − p0) − p0 log2(p0)
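The binary entropy curve can be reproduced with a short script (a sketch; the function name is mine, not from the lecture):

```python
import math

def binary_entropy(p0):
    """Entropy H(X) in bits of a binary source with p(x=1) = p0."""
    if p0 in (0.0, 1.0):  # 0*log2(0) is taken as 0
        return 0.0
    return -(1 - p0) * math.log2(1 - p0) - p0 * math.log2(p0)

# H(X) is maximized (1 bit) for a uniform source, p0 = 0.5
for p0 in (0.1, 0.5, 0.9):
    print(f"p0 = {p0}: H(X) = {binary_entropy(p0):.4f} bits")
```

Note the symmetry H(p0) = H(1 − p0), visible in the plot of H(X) versus p0.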
Channel Capacity
Conditional Entropy: Quantifies the amount of information needed to describe the outcome of a random variable given that the value of another random variable is known (i.e., how much entropy of the random variable remains):
H(Y|X) = −Σx,y p(x, y) log2 p(y|x)
Channel Capacity
Channel capacity per symbol (one sample of X): the maximum information conveyed over all possible input probability distributions,
C = max over p(x) of I(X; Y) bits/symbol
Capacity: Example 1
The joint probability of a system is given by
y1 y2
Find:
1. Marginal entropies
2. Joint entropy
3. Conditional entropies
4. Mutual information
5. Channel capacity
6. Draw the channel model.
Capacity: Example 1
Solution:
y1 y2
Capacity: Example 2
Example: Calculate the capacity of the following BSC channel given
p(x = 1) = p0.
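Since the slide's BSC parameters appear only in its figure, the sketch below assumes an illustrative crossover probability ε = 0.1. It finds the BSC capacity by maximizing I(X;Y) = H(Y) − H(Y|X) over p0 numerically, then checks it against the closed form C = 1 − H(ε), achieved at p0 = 0.5:

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_info(p0, eps):
    """I(X;Y) for a BSC with crossover probability eps and input p(x=1) = p0."""
    p_y1 = p0 * (1 - eps) + (1 - p0) * eps  # output distribution p(y=1)
    return h2(p_y1) - h2(eps)               # I(X;Y) = H(Y) - H(Y|X)

eps = 0.1  # assumed crossover probability (not from the slide)
# Capacity is the maximum of I(X;Y) over p0; for the BSC it occurs at p0 = 0.5
capacity = max(bsc_mutual_info(p0 / 100, eps) for p0 in range(1, 100))
print(f"C = {capacity:.4f} bits/symbol, closed form 1 - H(eps) = {1 - h2(eps):.4f}")
```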
For the AWGN channel with average power constraint P, the maximizing input distribution is Gaussian, which results in the channel capacity:
C = B log2(1 + P/(N0B)) = B log2(1 + γ)
P: Transmit power
B: Channel bandwidth
N0: Noise PSD
γ = P/(N0B): Received SNR
AWGN Channel Capacity: Example
Example 4.1: Consider a wireless channel where power falloff with distance
follows the formula Pr(d) = Pt(d0/d)3 for d0 = 10 m. Assume the channel has
bandwidth B = 30 KHz and AWGN with noise power spectral density of N0 = 10−9
W/Hz. For a transmit power of 1W, find the capacity of this channel for a
transmit-receive distance of 100 m and 1 Km.
Solution:
The received power: Pr(d) = Pt(d0/d)³, so Pr = 1 × (10/100)³ = 10⁻³ W at d = 100 m and Pr = 1 × (10/1000)³ = 10⁻⁶ W at d = 1000 m.
The received SNR:
γ = Pr(d)/(N0B) = 10⁻³/(10⁻⁹ × 30 × 10³) = 33.33 = 15 dB for d = 100 m
γ = 10⁻⁶/(10⁻⁹ × 30 × 10³) = 0.033 = −15 dB for d = 1000 m
The capacity C = B log2(1 + γ) is then about 153 kbps at 100 m and about 1.4 kbps at 1 km.
Note the significant decrease in capacity at farther distances, due to the path-loss
exponent of 3, which greatly reduces received power as distance increases.
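The arithmetic of this example can be checked numerically (a sketch; the function name and keyword defaults are mine, with values from the example):

```python
import math

def awgn_capacity(d, pt=1.0, d0=10.0, alpha=3, n0=1e-9, b=30e3):
    """AWGN capacity for power falloff Pr(d) = Pt*(d0/d)**alpha."""
    pr = pt * (d0 / d) ** alpha          # received power, W
    snr = pr / (n0 * b)                  # linear received SNR
    return b * math.log2(1 + snr), snr

for d in (100, 1000):
    c, snr = awgn_capacity(d)
    print(f"d = {d:4d} m: SNR = {10 * math.log10(snr):6.1f} dB, C = {c / 1e3:7.2f} kbps")
```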
Flat Fading Channel Capacity
• An input message w is encoded into the codeword x, which is transmitted over the
time-varying channel as x[i] at time i. The added noise n[i] is AWGN.
• Assume a discrete-time channel with stationary and ergodic time-varying gain
• The channel power gain g[i], also called the channel state information (CSI),
follows a given distribution p(g), e.g., for Rayleigh fading, p(g) is exponential.
• We assume that g[i] is independent of the channel input x[i].
• g[i] can change at each time i, either as an i.i.d. process or with some correlation
over time.
• In a block fading channel, g[i] is constant over some block length T, after which g[i] changes to a new independent value drawn from the distribution p(g).
Flat Fading Channel Capacity
Capacity of this channel depends on what is known about g[i] at the transmitter
and receiver. We consider three different scenarios:
1. Channel Distribution Information (CDI): The distribution of g[i] is known to both
the transmitter and receiver.
2. Receiver CSI: The value of g[i] is known at the receiver at time i, and both the
transmitter and receiver know the distribution of g[i].
3. Transmitter and Receiver CSI: The value of g[i] is known at the transmitter and
receiver at time i, and both the transmitter and receiver know the distribution of g[i].
Transmitter and receiver CSI allow the transmitter to adapt both its power and rate to
the channel gain at time i, and leads to the highest capacity of the three scenarios.
Note that since the instantaneous SNR γ[i] is just g[i] multiplied by the constant P/(N0B), known CSI or CDI about g[i] yields the same information about γ[i].
Case 1: CDI known to both TX and RX
Consider the case where the channel gain distribution p(g) or,
equivalently, the distribution of SNR p(γ) is known to the transmitter
and receiver.
But, solving for the capacity-achieving input distribution p(x), i.e., the
distribution achieving the maximum of I(X;Y), can be quite complicated
depending on the fading distribution.
Case 2A: Shannon (Ergodic) Capacity
For the AWGN channel, Shannon capacity defines the maximum data rate that
can be sent over the channel with asymptotically small error probability.
Since only the receiver knows the instantaneous SNR γ[i], the transmitter cannot
adapt its transmission strategy relative to the CSI.
Shannon capacity of a fading channel with receiver CSI for an average power constraint can be obtained by averaging the AWGN capacity over the SNR distribution:
C = ∫0∞ B log2(1 + γ) p(γ) dγ = E[B log2(1 + γ)]
By Jensen's inequality, E[f(X)] ≤ f(E[X]) if f is concave. Since log2(1 + γ) is concave in γ,
E[B log2(1 + γ)] ≤ B log2(1 + E[γ])
Thus the Shannon capacity of a fading channel with receiver CSI only is less than the Shannon capacity of an AWGN channel with the same average SNR. In other words, fading reduces Shannon capacity when only the receiver has CSI.
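A quick numerical illustration of this Jensen-inequality gap (illustrative parameters of my choosing: Rayleigh fading, so γ is exponential, with an assumed 20 dB average SNR):

```python
import math, random

random.seed(1)
# Draw exponential SNR samples (Rayleigh fading power) with mean 100 (20 dB)
gammas = [random.expovariate(1 / 100) for _ in range(100_000)]

fading = sum(math.log2(1 + g) for g in gammas) / len(gammas)  # E[log2(1+γ)]
awgn = math.log2(1 + sum(gammas) / len(gammas))               # log2(1+E[γ])
print(f"E[log2(1+γ)] = {fading:.3f} <= log2(1+E[γ]) = {awgn:.3f}")
```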
Case 2A: Shannon (Ergodic) Capacity
Example 4.2: Consider a flat-fading channel with i.i.d. channel gain g[i] which can take on
three possible values: g1 = 0.05 with probability p1 = 0.1, g2 = 0.5 with probability p2 = 0.5,
and g3 = 1 with probability p3 = 0.4. The transmit power is 10 mW, the noise spectral density
is N0 = 10−9 W/Hz, and the channel bandwidth is 30 KHz. Assume the receiver has
knowledge of the instantaneous value of g[i], but the transmitter does not. Find the Shannon
capacity of this channel and compare with the capacity of an AWGN channel with the same
average SNR.
Solution: The channel has three possible received SNRs: γ1 = Pt g1²/(N0B) = 0.01 × 0.05²/(30000 × 10⁻⁹) = 0.8333 = −0.79 dB, γ2 = Pt g2²/(N0B) = 0.01 × 0.5²/(30000 × 10⁻⁹) = 83.33 = 19.2 dB, and γ3 = Pt g3²/(N0B) = 0.01 × 1²/(30000 × 10⁻⁹) = 333.33 = 25 dB.
The probabilities associated with these SNR values are p(γ1) = 0.1, p(γ2) = 0.5, and p(γ3) = 0.4. Thus, the Shannon capacity is given by
C = Σi B log2(1 + γi) p(γi) = 30000 [0.1 log2(1.8333) + 0.5 log2(84.33) + 0.4 log2(334.33)] ≈ 199.2 Kbps.
The average SNR for this channel is γ = 0.1(0.8333) + 0.5(83.33) + 0.4(333.33) = 175.08 = 22.43
dB. The capacity of an AWGN channel with this SNR is C = B log2(1 + 175.08) = 223.8 Kbps.
Note that this rate is about 25 Kbps larger than that of the flat-fading channel with receiver
CSI and the same average SNR.
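The example's numbers can be reproduced with a short script (SNRs and probabilities taken from the example; variable names are mine):

```python
import math

B = 30e3                          # bandwidth, Hz
snrs = [0.8333, 83.33, 333.33]    # received SNRs from the example
probs = [0.1, 0.5, 0.4]

# Ergodic capacity with receiver CSI: C = sum_i p(γi) * B*log2(1+γi)
c_fading = sum(p * B * math.log2(1 + g) for p, g in zip(probs, snrs))

# AWGN capacity at the same average SNR: C = B*log2(1+E[γ])
avg_snr = sum(p * g for p, g in zip(probs, snrs))
c_awgn = B * math.log2(1 + avg_snr)

print(f"fading C = {c_fading/1e3:.1f} kbps, AWGN C = {c_awgn/1e3:.1f} kbps")
```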
Case 2B: Capacity with Outage
Capacity with outage is defined as the maximum rate that
can be transmitted over a channel with some outage probability
Outage probability is the probability that the transmission
cannot be decoded with negligible error probability.
The basic premise of capacity with outage is that a high
data rate can be sent over the channel and decoded correctly
except when the channel is in deep fading.
By allowing the system to lose some data in the event of
deep fades, a higher data rate can be maintained than if all data
must be received correctly regardless of the fading state.
With this model, if the channel has received SNR γ during a
burst then data can be sent over the channel at rate Blog2(1 + γ)
with negligible probability of error.
Case 2B: Capacity with Outage
The transmitter fixes a minimum received SNR γmin and
encodes for a data rate C = Blog2(1 + γmin).
The data is correctly received if the instantaneous received
SNR ≥ γmin
If the received SNR < γmin, then the bits received over that
transmission burst cannot be decoded correctly
Probability of outage: pout = p(γ < γmin)
The average rate correctly received over many transmission
bursts is Co = (1 − pout) Blog2(1 + γmin) since data is only correctly
received on (1 − pout) transmissions.
The value of γmin is a design parameter based on the acceptable
outage probability.
Case 2B: Capacity with Outage
Capacity with outage is typically characterized by a plot of capacity versus outage. In this
figure, normalized capacity C/B = log2(1 + γmin) is plotted as a function of pout = p(γ < γmin) for
a Rayleigh fading channel (γ exponential) with average γ = 20 dB.
We see that capacity approaches zero for small outage probability, due to the requirement
to correctly decode bits transmitted under severe fading, and increases dramatically as
outage probability increases. However, these high capacity values for large outage
probabilities have higher probability of incorrect data reception.
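The curve can be reproduced for the Rayleigh case: since γ is exponential with mean γ̄, pout = 1 − exp(−γmin/γ̄), so γmin = −γ̄ ln(1 − pout). A sketch with the figure's 20 dB average SNR:

```python
import math

avg_snr = 100.0  # 20 dB average SNR; Rayleigh fading => γ is exponential

def normalized_capacity(p_out):
    """C/B = log2(1 + γmin), where p_out = P(γ < γmin) for exponential γ."""
    gamma_min = -avg_snr * math.log(1 - p_out)  # inverse of the exponential CDF
    return math.log2(1 + gamma_min)

for p_out in (0.01, 0.1, 0.5):
    print(f"p_out = {p_out:4}: C/B = {normalized_capacity(p_out):.2f} bits/s/Hz")
```

The output shows the behavior described above: C/B is near zero for small pout and grows quickly as more outage is tolerated.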
Case 2B: Capacity with Outage
Example 4.3: Consider a flat-fading channel with i.i.d. channel gain g[i] which can take on
three possible values: g1 = 0.05 with probability p1 = 0.1, g2 = 0.5 with probability p2 = 0.5,
and g3 = 1 with probability p3 = 0.4. The transmit power is 10 mW, the noise spectral density
is N0 = 10−9 W/Hz, and the channel bandwidth is 30 KHz. Assume the receiver has
knowledge of the instantaneous value of g[i], but the transmitter does not. Find the capacity
versus outage for this channel, and find the average rate correctly received for outage
probabilities pout < 0.1, pout = 0.1 and pout = 0.6.
Capacity versus outage: pout = p(γ < γmin). (Figure: the SNR pmf p(γ), with p(γ1 = 0.8333) = 0.1, p(γ2 = 83.33) = 0.5, p(γ3 = 333.33) = 0.4, and the corresponding CDF F(γ0) = p(γ ≤ γ0).)
Case 2B: Capacity with Outage
Capacity versus outage:
For time-varying channels with discrete SNR values, the capacity versus outage is a staircase
function.
For pout < 0.1, minimum received SNR for pout in this range of values is the weakest channel: γmin
= γ1, and the corresponding capacity is C = Blog2(1 + γmin) = 30000 log2(1.833) = 26.23 kbps.
For 0.1 ≤ pout < 0.6, γmin = γ2 and C = Blog2(1 + γmin) = 30000 log2(84.33) = 191.94 kbps.
For 0.6 ≤ pout < 1, γmin = γ3 and C = B log2(1 + γmin) = 30000 log2(334.33) = 251.55 kbps.
(Staircase plot of C versus pout: 26.23 kbps for pout < 0.1, rising to 191.94 kbps at pout = 0.1 and to 251.55 kbps at pout = 0.6.)
Case 2B: Capacity with Outage
Average rate correctly received:
For pout <0.1, data transmitted at rates close to capacity C0 = C = 26.23 kbps are always
correctly received since the channel can always support this data rate.
For pout = 0.1, we transmit at rates close to C = 191.94 kbps, but we can only correctly decode
these data when the channel SNR is γ2 or γ3, so the rate correctly received is C0 = (1 − 0.1)
*191940 = 172.75 kbps.
For pout = 0.6, we transmit at rates close to C = 251.55 kbps, but we can only correctly decode
these data when the channel SNR is γ3, so the rate correctly received is C0 = (1−0.6)*251550 =
100.62 kbps.
It is likely that a good engineering design for this channel would send data at a rate close to
191.94 Kbps, since the data would be received incorrectly at most 10% of the time, and the
data rate would be almost an order of magnitude higher than a rate commensurate with the
worst-case channel capacity.
However, 10% retransmission probability is too high for some applications, in which case
the system would be designed for the 26.23 Kbps data rate with no retransmissions.
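The capacity-versus-outage and correctly-received-rate numbers of Example 4.3 can be reproduced with a short script (values from the example):

```python
import math

B = 30e3
snrs = [0.8333, 83.33, 333.33]   # possible received SNRs, sorted ascending
probs = [0.1, 0.5, 0.4]

# Capacity with outage: fix γmin, transmit at C = B*log2(1+γmin);
# data is decoded correctly only when γ >= γmin, i.e. with probability 1 - p_out.
results = []
for i, gamma_min in enumerate(snrs):
    p_out = sum(probs[:i])                 # P(γ < γmin)
    c = B * math.log2(1 + gamma_min)       # transmitted rate
    c0 = (1 - p_out) * c                   # average rate correctly received
    results.append((p_out, c, c0))
    print(f"γmin = {gamma_min:8.4f}: p_out = {p_out:.1f}, "
          f"C = {c/1e3:6.2f} kbps, C0 = {c0/1e3:6.2f} kbps")
```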
Case 3: CDI & CSI both known to both TX & RX
When both the transmitter and receiver have CSI, the transmitter can
adapt its transmission strategy relative to this CSI.
In this case, there is no notion of capacity versus outage where the
transmitter sends bits that cannot be decoded, since the transmitter knows the
channel and thus will not send bits unless they can be decoded correctly.
We will derive Shannon capacity assuming optimal power and rate
adaptation relative to the CSI.
There are alternate capacity definitions, each with its own power and rate adaptation strategy.
Case 3: Shannon Capacity (CDI and CSI both known to both TX and RX)
Consider the Shannon capacity when the channel power gain g[i] is known to both the
transmitter and receiver at time i.
Let s[i] be a stationary and ergodic stochastic process representing the channel state,
which takes values on a finite set S of discrete memoryless channels.
Let Cs denote the capacity of a particular channel s ∈ S and p(s) denote the probability,
or fraction of time, that the channel is in state s.
The capacity of this time-varying channel is then given by
C = Σs∈S Cs p(s)
Let p(γ) = p(γ[i] = γ) denote the pdf of the received SNR γ. Then the capacity of the
fading channel with transmitter and receiver CSI but no power adaptation is
C = ∫0∞ B log2(1 + γ) p(γ) dγ    (A)
Case 3: Shannon Capacity (CDI and CSI both known to both TX and RX)
With this additional constraint (average transmit power ∫ P(γ) p(γ) dγ ≤ P̄), we cannot apply equation (A) directly to obtain
the capacity. However, we expect that the capacity with this power constraint will be
the average capacity given by (A) with the power optimally distributed over time. Thus,
we can define the fading channel capacity with the above average power constraint as:
C = max over P(γ) with ∫ P(γ) p(γ) dγ = P̄ of ∫0∞ B log2(1 + γP(γ)/P̄) p(γ) dγ    (B)
It is proved that this capacity can be achieved, and any rate larger than this capacity
has probability of error bounded away from zero.
Case 3: Shannon Capacity (CDI and CSI both known to both TX and RX)
How to achieve the above capacity in (B)?
The main idea is a “time diversity” system with multiplexed input and demultiplexed output.
We first quantize the range of fading (SNR) values to a finite set {γj : 1 ≤ j ≤ N}. For each γj, we
design an encoder/decoder pair for an AWGN channel with SNR γj .
The input xj for encoder γj has average power P(γj) and data rate Rj = Cj, where Cj is the capacity
of a time-invariant AWGN channel with received SNR γjP(γj)/P̄.
These encoder/decoder pairs correspond to a set of input and output ports associated with each γj .
When γ[i] ≈ γj , the corresponding pair of ports are connected through the channel.
Case 3: Shannon Capacity (CDI and CSI both known to both TX and RX)
The codewords associated with each γj are thus multiplexed together for transmission, and
demultiplexed at the channel output.
The average rate on the channel given by (B) is just the sum of rates associated with each of the
γj channels weighted by p(γj), the percentage of time that the channel SNR equals γj.
Case 3: Shannon Capacity (CDI and CSI both known to both TX and RX)
With the constraint P(γ) ≥ 0, the optimal power adaptation that maximizes capacity (B) is the water-filling policy:
P(γ)/P̄ = 1/γ0 − 1/γ for γ ≥ γ0, and P(γ) = 0 otherwise    (C)
The optimal power allocation policy depends on the fading distribution p(γ) only through
the cutoff value γ0.
If γ[i] is below this cutoff, then no data is transmitted over the ith time interval, so the
channel is only used at time i if γ0 ≤ γ[i] < ∞.
Substituting (C) into (B) yields the capacity formula:
C = ∫γ0∞ B log2(γ/γ0) p(γ) dγ    (D)
The multiplexing nature of the capacity-achieving coding strategy indicates that capacity
(D) is achieved with a time-varying data rate, where the rate corresponding to
instantaneous SNR γ is Blog2(γ/γ0). Since γ0 is constant, this means that as the instantaneous
SNR increases, the data rate sent over the channel for that instantaneous SNR also increases.
Note that this multiplexing strategy is not the only way to achieve capacity (D): it can also
be achieved by adapting the transmit power and sending at a fixed rate.
Case 3: Shannon Capacity (CDI and CSI both known to both TX and RX)
Since γ is time-varying, the maximizing power adaptation policy of (C) is a “water-
filling” formula in time. The curve below shows how much power is allocated to the
channel for instantaneous SNR γ(t) = γ. The water-filling terminology refers to the fact
that the line 1/γ sketches out the bottom of a bowl, and power is poured into the bowl to
a constant water level of 1/γ0. The amount of power allocated for a given γ equals (1/γ0
− 1/γ), the amount of water between the bottom of the bowl (1/γ) and the constant water
line (1/γ0).
Case 3: Shannon Capacity (CDI and CSI both known to both TX and RX)
Example 4.4: Assume the same channel as in the previous example, with a bandwidth of 30
KHz and three possible received SNRs: γ1 = 0.8333 with p(γ1) =0.1, γ2 = 83.33 with p(γ2) =
0.5, and γ3 = 333.33 with p(γ3) = 0.4. Find the ergodic capacity of this channel assuming both
transmitter and receiver have instantaneous CSI.
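A sketch of the solution by discrete water-filling (SNR values and probabilities from the example; the iterative cutoff search is the standard procedure of dropping the weakest channel until the cutoff γ0 is consistent with the set of channels actually used):

```python
import math

B = 30e3
snrs = [0.8333, 83.33, 333.33]
probs = [0.1, 0.5, 0.4]

# Water-filling over discrete SNRs: find cutoff γ0 such that the power
# constraint holds over the active channels (those with γi >= γ0):
#   sum_i p(γi) * (1/γ0 - 1/γi) = 1
pairs = sorted(zip(snrs, probs))   # ascending SNR
active = list(pairs)
gamma0 = 0.0
while active:
    p_sum = sum(p for _, p in active)
    inv_sum = sum(p / g for g, p in active)
    gamma0 = p_sum / (1 + inv_sum)       # solve the constraint for γ0
    if active[0][0] >= gamma0:           # cutoff consistent with active set?
        break
    active.pop(0)                        # weakest channel unused; retry

# Capacity (D): C = sum over active channels of p(γi) * B*log2(γi/γ0)
c = sum(p * B * math.log2(g / gamma0) for g, p in active)
print(f"γ0 = {gamma0:.4f}, C = {c/1e3:.2f} kbps")
```

Here γ1 falls below the cutoff and is dropped, giving γ0 ≈ 0.894 and C ≈ 200.7 kbps, slightly above the receiver-CSI-only capacity of the same channel, as discussed on the next slide.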
Case 3: Shannon Capacity (CDI and CSI both known to both TX and RX)
Comparing with the results of the previous example, we see that this rate is only slightly
higher than for the case of receiver CSI only, and is still significantly below that of an
AWGN channel with the same average SNR.
That is because the average SNR for this channel is relatively high; for low-SNR
channels, capacity in flat fading can exceed that of an AWGN channel with the same
average SNR by taking advantage of the rare times when the channel is in a very good state.
The END
Stationary and Ergodic Process
A random process is strict-sense stationary (or simply stationary) if its statistical
properties (pdf, cdf, mean, variance, correlation function, etc.) do not change with time.
A random process is called weak-sense stationary or wide-sense stationary (WSS) if its
mean function and its correlation function do not change under shifts in time.
A stochastic process Xr[k] is ergodic if the statistics taken along the time index, k, are the same as
the statistics taken along the realization axis, indexed by r.