
Capacity of the Binary Erasure Channel

Electrical Engineering 126 (UC Berkeley)


Spring 2018

1 Communication
This note is about the breakthrough work of Claude Shannon in the 1940s. We begin with
Shannon’s famous block diagram, Figure 1.

Figure 1: Shannon’s block diagram of a communication system.

Suppose that you want to send a message over a noisy channel. The basic steps of a
digital communication system are:

1. The source message is compressed.

2. Redundancy is added to deal with the noise in the channel.

3. The coded message is sent through the communication channel.

On the other end of the channel, the receiver reverses all of the above steps. Shannon showed
that we can design the source coding in 1 and the channel coding in 2 separately and still
transfer information over the communication channel at the optimal rate. For the rest of the
note, we will focus on the channel encoding and decoding for the special case of the binary
erasure channel.

2 Capacity of the Binary Erasure Channel


The two simplest channel models studied are the binary symmetric channel (BSC) and the
binary erasure channel (BEC). The two channels are shown in Figure 2. We will focus on the
BEC, which erases each input bit independently with probability p ∈ (0, 1).
We are interested in the maximum number of bits that the transmitter can send over the
channel per transmission without error.

Figure 2: The BEC and the BSC.

Suppose that we have a message of length L and we encode it to a message of length n,
where L and n are positive integers. Intuitively, n must be larger than L in order to account
for the erasures in the BEC. In our model of communication, we must send our encoded
message over the BEC without feedback, which means that the receiver is not allowed to
contact us in the middle of the transmission to let us know which bits were erased. Later we
will consider what happens when we allow the receiver to provide feedback.
Assuming that the message goes through the channel without errors, the receiver receives
a string of n bits which provides L bits of information. The ratio R := L/n is therefore called
the rate of the code: the number of bits of information about the source message per symbol
that the receiver receives.
Let X be the input alphabet and Y be the output alphabet. Accompanying any code
is an encoding function f_n : X^L → X^n and a decoding function g_n : Y^n → X^L, which
tells us how to decode the output of the channel. For the BEC, the input alphabet is binary,
X := {0, 1}, and the output alphabet is Y := {0, 1, e}, where the symbol e indicates that the
bit was erased by the channel.
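
To fix ideas, here is a minimal sketch in Python of a single pass through the BEC (added for illustration; the function name `bec` and the use of the character 'e' as the erasure symbol are our own conventions, not part of the original note).

```python
import random

def bec(bits, p, rng=random.Random(0)):
    """Binary erasure channel: each input bit is erased (mapped to 'e')
    independently with probability p, and delivered unchanged otherwise."""
    return ['e' if rng.random() < p else b for b in bits]

# Example: send 10 bits through a BEC with erasure probability p = 0.3.
print(bec([0, 1, 1, 0, 1, 0, 0, 1, 1, 0], p=0.3))
```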
Now, we have to account for the noise in the channel. Let X^(n) := (X_1, . . . , X_n) be the n
bits which are fed into the channel and let Y^(n) := (Y_1, . . . , Y_n) be the n bits which are the
output of the channel. The maximum probability of error of the code is
\[
P_e^{(n)} := \max_{x \in \mathcal{X}^L} \mathbb{P}\{ g_n(Y^{(n)}) \neq x \mid X^{(n)} = f_n(x) \}.
\]

Take time to parse the above definition. We are considering the maximum probability that
the decoding function, when applied to the output of the channel, differs from the original
intended message, where the maximum is taken over all choices of the input message.
We say that the rate R is achievable for the channel if for each positive integer n there
exist encoding and decoding functions (f_n, g_n) which encode messages of length L(n) := ⌈nR⌉
to messages of length n, such that P_e^(n) → 0 as n → ∞ (asymptotically error-free). The
largest achievable rate of the channel is called the capacity of the channel.
The main goal is to show the following:
Theorem 1. The capacity of the BEC with erasure probability p is 1 − p.
Proof. First, we will show that we can do no better than rate 1 − p. Indeed, even with
feedback (the receiver notifies the transmitter about exactly which bits were erased by the
channel), the best that the transmitter can do is to resend the bits which were erased. Since
the channel erases a fraction p of the input bits, the reliable rate of communication is 1 − p
bits per channel use. (This is known as an oracle argument: it proves a fundamental limit
by showing that you can do no better even if you had an oracle supplying you with extra
knowledge.)
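
To see why the resend-on-feedback strategy attains exactly 1 − p, here is a short supporting calculation (added for intuition; it is not part of the original note). If each message bit is retransmitted until it arrives, the number of channel uses spent on one bit is geometric with success probability 1 − p, so

\[
\mathbb{E}[\text{channel uses per bit}] = \sum_{k=1}^{\infty} k \, p^{\,k-1}(1-p) = \frac{1}{1-p},
\]

and by the law of large numbers the long-run throughput is 1 − p bits per channel use.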
Next, we will show that we can achieve a rate of R := 1 − p − ε for any ε > 0. Shannon’s
insight was to leverage the strong law of large numbers (SLLN) to achieve capacity. How do we
generate a good codebook? Flip n · 2^L(n) fair coins independently, and fill in an n × 2^L(n)
codebook accordingly (thus each of the 2^L(n) possible messages is associated with a codeword
of length n).

Figure 3: We fill in an n × 2^L(n) table. The columns represent the codewords c_1, . . . , c_{2^L(n)} (one
codeword per possible message) and the rows represent the individual bits of the codewords.

Since the channel is a BEC, roughly a fraction p of the transmitted bits will be erased (by the
SLLN). Suppose that the first codeword is sent. The receiver then gets the first codeword with
a fraction p of its bits erased. Assume WLOG that the first ⌊n(1 − p)⌋ symbols came through
(this is fine because the encoder does not know which bits were erased, so it does not affect the
coding). The receiver now looks at the codebook truncated to the first ⌊n(1 − p)⌋ rows and
sees if there is a unique codeword matching the bits that were received. The decoding rule is
that the decoder looks for a unique match in the codebook, and if a unique match does not
exist, an error is declared. Thus, the probability of error is the probability that there exist ≥ 2
entries in the truncated codebook which match the received bits. If the truncated codewords
are denoted c_1, . . . , c_{2^L(n)}, then consider codeword c_2; we have P(c_1 = c_2) = 2^{−⌊n(1−p)⌋}. Hence,
\[
\mathbb{P}(\text{error}) = \mathbb{P}\Bigl( \bigcup_{i=2}^{2^{L(n)}} \{ c_1 = c_i \} \Bigr)
\le \sum_{i=2}^{2^{L(n)}} 2^{-\lfloor n(1-p) \rfloor}
\le 2^{L(n)} \cdot 2^{-\lfloor n(1-p) \rfloor}
\sim 2^{-n(1-p-R)}.
\]

Examining the exponent, we see that since R < 1 − p, the probability of error goes to 0
exponentially fast as n → ∞.
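
For instance (a numerical illustration added here for concreteness), with erasure probability p = 0.5, rate R = 0.4, and block length n = 1000, the bound is 2^{−n(1−p−R)} = 2^{−100} ≈ 10^{−30}, so the probability of a decoding error is negligible.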
We now have a scheme that achieves capacity, but what is the drawback? Decoding in this
manner requires an exhaustive search over a massive codebook, so it is practically useless.
Thus, one needs implementable and fast codes to achieve capacity in practice.
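
To make the scheme concrete, here is a small Python sketch of random coding over the BEC (an illustration we add; the function names, the random seed, and the tiny parameters L = 4, n = 12 are our own choices, so the SLLN approximations are loose at this scale).

```python
import random

rng = random.Random(0)

def random_codebook(L, n):
    """Generate 2^L random codewords of length n by independent fair coin flips."""
    return [[rng.randint(0, 1) for _ in range(n)] for _ in range(2 ** L)]

def bec(bits, p):
    """Binary erasure channel (same model as the earlier sketch):
    erase each bit independently with probability p."""
    return ['e' if rng.random() < p else b for b in bits]

def decode(received, codebook):
    """Exhaustive-search decoder: keep the codewords consistent with every
    non-erased position; succeed only if exactly one codeword remains."""
    matches = [m for m, c in enumerate(codebook)
               if all(r == 'e' or r == b for r, b in zip(received, c))]
    return matches[0] if len(matches) == 1 else None  # None = declare an error

# Tiny demonstration: L = 4 message bits, n = 12 channel uses (rate 1/3 < 1 - p).
L, n, p = 4, 12, 0.3
codebook = random_codebook(L, n)
message = 5                      # an arbitrary message index in {0, ..., 2^L - 1}
received = bec(codebook[message], p)
print(received, '->', decode(received, codebook))
```

Note that `decode` scans all 2^L codewords; at rates near capacity, L grows linearly in n, so this exhaustive search is exponential in the block length, which is exactly the impracticality mentioned above.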

3 The General Channel Coding Theorem


We will not cover the general result in this course, but we include it here for those who are
interested.
The general result is stated in terms of the mutual information of random variables,
which is defined as I(X; Y ) := H(X) + H(Y ) − H(X, Y ). Let X denote the source alphabet
of a channel, and let Y denote the corresponding output alphabet. Let X be the input to the
channel (one transmission), and let Y be the output of the channel. Finally, let P denote the
set of probability distributions on the input alphabet X . The channel capacity is

\[
C := \max_{p \in \mathcal{P},\; X \sim p} I(X; Y). \tag{1}
\]

In words, we are looking for the largest possible mutual information between the input and
output random variables, where the maximization is taken over all possible input distributions.
This new definition does not conflict with our earlier definition of the capacity of the channel,
because of the following famous result:

Theorem 2 (Channel Coding Theorem). Any rate below the channel capacity C (as defined
in (1)) is achievable. Conversely, any sequence of codes with P_e^(n) → 0 as n → ∞ has a rate
R ≤ C. Thus, the two definitions of the channel capacity which we have given agree.

The general result is more difficult to prove than the special case of the BEC, but the
BEC example already carries most of the intuition.
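
As a consistency check (a worked example we add here, using the standard computation), one can evaluate (1) for the BEC and recover Theorem 1. Since H(X, Y) = H(Y) + H(X | Y), the definition above gives I(X; Y) = H(X) − H(X | Y). For the BEC, observing Y ≠ e reveals X exactly, while Y = e (which happens with probability p, independently of X) leaves X distributed according to its prior, so

\[
H(X \mid Y) = p \, H(X)
\quad\Longrightarrow\quad
I(X; Y) = (1 - p) \, H(X) \le 1 - p,
\]

with equality when X is uniform on {0, 1}. Maximizing over input distributions therefore gives C = 1 − p, in agreement with Theorem 1.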
