Notes - BSC
Lecture 6
Instructor: Madhu Sudan Scribe: Xingchi Yan
1 Overview
1.1 Outline for today
Channel Coding (Or Error Correction)
• Definitions: Rate, Capacity
• Coding Theorem for Binary Symmetric Channel (BSC)
• Coding Theorem for general channels
• Converse
2 Binary Symmetric Channel
The binary symmetric channel BSC(p) takes an input bit X and outputs
Y = X       w.p. 1 − p
Y = 1 − X   w.p. p
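As an illustration (not part of the lecture), here is a minimal Python sketch of passing bits through BSC(p); the function name and parameters are just for this example:

    import random

    def bsc(x_bits, p):
        # Flip each transmitted bit independently with probability p.
        return [b ^ (1 if random.random() < p else 0) for b in x_bits]

    # Send 20 zero bits through BSC(0.1) and count how many got flipped.
    y = bsc([0] * 20, p=0.1)
    print(y, "flips:", sum(y))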
In order to use such a channel, Shannon’s idea was to encode the information before you send it and decode it afterwards. Suppose you have a message m that you want to send: you encode it to get some sequence of symbols X^n, the channel produces Y^n, and then you want to decode.
m → [Encode] → X^n → [Channel] → Y^n → [Decode] → m̂
What we would really want to understand is the capacity of the channel. In this case the capacity of the
channel would be
Definition 2.
Capacity(BSC(p)) = sup_R lim_{ε→0} lim_{n→∞} { Rate R of (E_n, D_n) }

such that

Pr_{m, y | X = E(m)} [ D(y^n) ≠ m ] ≤ ε
So what is the best rate you can get? You are allowed to take n as large as you like, but you have to make sure that the error goes to zero as n goes to infinity. This is the general quantity we want to understand for any channel.
Remark The capacity of the binary symmetric channel is Capacity(BSC(p)) = 1 − h(p), where h(p) = p log(1/p) + (1 − p) log(1/(1 − p)) is the entropy of a Bernoulli(p) random variable.
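For concreteness (my own addition, not in the notes), a few values of h(p) and the claimed capacity 1 − h(p) can be computed directly:

    from math import log2

    def h(p):
        # Binary entropy: h(p) = p log(1/p) + (1 - p) log(1/(1 - p)), with h(0) = h(1) = 0.
        if p in (0.0, 1.0):
            return 0.0
        return p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))

    for p in (0.0, 0.11, 0.25, 0.5):
        print(f"p = {p:.2f}   h(p) = {h(p):.3f}   1 - h(p) = {1 - h(p):.3f}")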
This is a somewhat striking theorem. Why do we get 1 − h(p)? Roughly, the idea is the following. The channel adds to X a sequence η of Bernoulli random variables distributed according to Bern(p)^n. If, after decoding, you are able to reproduce the original message m (and hence X), then you can easily use this to determine η as well. So the channel is delivering, for free, n Bernoulli(p) bits, which on their own would require nh(p) uses of the channel. What remains is 1 − h(p) per use, and that is what we get to use to convey the message.
X → X ⊕ η → m,   where η ∼ Bern(p)^n
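The observation that a correct decoding also reveals η can be made concrete; a minimal sketch (illustrative only), assuming the decoder recovers m and hence X:

    import random

    n, p = 16, 0.2
    x = [random.randint(0, 1) for _ in range(n)]               # codeword bits X
    eta = [1 if random.random() < p else 0 for _ in range(n)]  # noise, eta ~ Bern(p)^n
    y = [xi ^ ei for xi, ei in zip(x, eta)]                    # received word Y = X xor eta

    # If decoding recovers m, and hence X, the noise comes for free: eta = X xor Y.
    eta_recovered = [xi ^ yi for xi, yi in zip(x, y)]
    assert eta_recovered == eta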
If you draw the capacity of the channel, 1 − h(p), as a function of p: when p = 0 you get capacity 1, which means that for every use of the channel you can send 1 bit through, which makes sense. When p = 1/2 you get capacity 0, which also makes sense: whether you send 0 or 1, the receiver receives an unbiased random bit, so there is no correlation between what is sent and what is received.
3 General Channel
It turns out that when you talk about general channels, you get an even nicer connection. We would like to find the rate and capacity for a general channel. First we would like to specify the encoder for a general channel. For today, a general channel means a memoryless one: we take some arbitrary channel which acts independently on every single symbol being transmitted.
3.1 Introduction
The input X is an element of some universe Ω_x. The output Y is an element of some universe Ω_y; the two universes need not be related. We want to think about stochastic channels, which makes a lot of sense: the channel is given by a collection of conditional distributions, one for each input.
Example 3. One very simple example of this is called the erasure channel. The following describes the Binary Erasure Channel (BEC), which produces an output of 0, 1, or ?:

0 → 0 w.p. 1 − p,   0 → ? w.p. p
1 → 1 w.p. 1 − p,   1 → ? w.p. p
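A small Python sketch of the BEC (the function name and the use of the character '?' as the erasure symbol are just illustrative):

    import random

    def bec(x_bits, p):
        # Each bit is erased (replaced by '?') independently with probability p
        # and delivered unchanged otherwise.
        return ['?' if random.random() < p else b for b in x_bits]

    print(bec([0, 1, 1, 0, 1, 0, 0, 1], p=0.3))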
Example 4. People in information theory may also think about something called the noisy typewriter. The channel input is either unchanged with probability 1/2 or transformed into the next letter with probability 1/2 in the output.
Exercise 5. Take a binary erasure channel with parameter p and a binary erasure channel with parameter q and try to find a reasonable relationship between p and q.
Remark The matrix P_{y|x}(α, β), with rows indexed by Ω_x and columns indexed by Ω_y, specifies the channel. For BSC(p) this matrix is

1 − p    p
  p    1 − p
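As a sanity check (not in the notes), this matrix can be written down and verified to have rows that are conditional distributions; a minimal Python sketch:

    import numpy as np

    p = 0.1
    # Rows are indexed by the input (0 or 1), columns by the output (0 or 1).
    P_y_given_x = np.array([[1 - p, p],
                            [p, 1 - p]])
    assert np.allclose(P_y_given_x.sum(axis=1), 1.0)  # each row is a conditional distribution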
Here we will get a remarkable theorem once again. It turns out that the capacity of the channel is given in terms of the joint distribution on (X, Y) induced by these conditional distributions.
Theorem 8.

Capacity(P_{y|x}) = sup_{P_x} I(X; Y)
Remark Once the distribution of X is specified, we get a joint distribution on (X, Y). The mutual information between X and Y, maximized over the input distribution, is the capacity of this channel. This completely characterizes every memoryless channel of communication: given Y, we want to figure out X, and mutual information is the right characterization.
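As a numerical illustration of Theorem 8 (my own check, not part of the lecture), one can maximize I(X; Y) over input distributions P_x for BSC(p) on a grid and compare with 1 − h(p):

    import numpy as np

    def mutual_information(px, P):
        # P[a, b] = P_{y|x}(b | a);  I(X;Y) = sum_{a,b} px[a] P[a,b] log2(P[a,b] / py[b]).
        py = px @ P
        total = 0.0
        for a in range(len(px)):
            for b in range(P.shape[1]):
                if px[a] > 0 and P[a, b] > 0:
                    total += px[a] * P[a, b] * np.log2(P[a, b] / py[b])
        return total

    p = 0.1
    P = np.array([[1 - p, p], [p, 1 - p]])
    best = max(mutual_information(np.array([q, 1 - q]), P) for q in np.linspace(0, 1, 1001))
    print(best)   # close to 1 - h(0.1), attained near the uniform input distribution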
Let's prove half of the statement first, for the BSC; then the corresponding half for general channels; and then the other half (the converse). Here is what we're going to do with the encoder: pick n large enough, let k = (1 − h(p) − ε)n, and let E_n : {0, 1}^k → {0, 1}^n be completely random. The decoding function is maximum likelihood: given some received sequence, we look at the m which maximizes the probability of y^n conditioned on x^n = E_n(m), that is,

D_n(β^n) = argmax_m { P_{y^n|x^n}(β^n, E_n(m)) }

where β^n ∈ Ω_y^n.
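Here is a small, purely illustrative Python sketch of this scheme (random encoder plus maximum-likelihood decoding over the BSC); it is only feasible for tiny k and n, and the names are mine:

    import random

    def hamming(a, b):
        return sum(u != v for u, v in zip(a, b))

    k, n, p = 3, 12, 0.1
    # Random encoder: each of the 2^k messages gets an independent uniformly random codeword.
    code = {m: [random.randint(0, 1) for _ in range(n)] for m in range(2 ** k)}

    m = random.randrange(2 ** k)
    y = [b ^ (1 if random.random() < p else 0) for b in code[m]]   # send E(m) through BSC(p)

    # For the BSC with p < 1/2, maximum-likelihood decoding is minimum Hamming distance.
    m_hat = min(code, key=lambda mm: hamming(code[mm], y))
    print("sent", m, "decoded", m_hat)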
This decoder can be used for any channel, for instance Markov channels. The following theorem is not hard to prove. We have
Theorem 9.

Pr_{E_n, m, y|E_n(m)} [D_n(y) ≠ m] ≤ ε

Remark

Pr_{E_n, m, BSC_p} [D_n(BSC_p(E_n(m))) ≠ m] ≤ ε
The decoder can fail in one of two ways: E1, the event that the received word y differs from the transmitted codeword E_n(m) in more than (p + ε)n coordinates, and E2, the event that some other codeword E_n(m'), m' ≠ m, lands within distance (p + ε)n of y. Pr[E1] is exponentially small, so we are left with bounding the probability of E2, which is simple to calculate.
Lemma 10.

∀ m' ≠ m,   Pr[E(m') ∈ Ball of radius (p + ε)n around y] = Volume(Ball) / 2^n ≈ 2^{(h(p)+ε)n} / 2^n
Remark

Ball_r(y) = {z ∈ {0,1}^n : z and y differ in ≤ r coordinates}

The volume of this ball, or the size of this set, is

|Ball_r(y)| = Σ_{i=0}^{r} (n choose i) ≈ 2^{h(r/n)n}
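A quick numerical check (mine, not from the notes) of the approximation |Ball_r(y)| ≈ 2^{h(r/n)n}:

    from math import comb, log2

    def h(q):
        return 0.0 if q in (0.0, 1.0) else q * log2(1 / q) + (1 - q) * log2(1 / (1 - q))

    n, r = 200, 30
    vol = sum(comb(n, i) for i in range(r + 1))
    # Compare the exponent of the exact volume with h(r/n); they agree up to o(1) terms.
    print(log2(vol) / n, h(r / n))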
That was just for one single message; taking a union bound over all messages,

Pr[∃ m', s.t. E2] ≤ 2^k · 2^{(h(p)+ε)n} / 2^n
This is where we see the quantity that we want: the bound goes to 0 as long as k/n + h(p) + ε < 1, which is why we need one minus the entropy.
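As a quick arithmetic sanity check (with made-up illustrative numbers), the exponent in the union bound is negative whenever k/n + h(p) + ε < 1, so the bound vanishes as n grows:

    from math import log2

    def h(q):
        return q * log2(1 / q) + (1 - q) * log2(1 / (1 - q))

    p, eps, rate = 0.1, 0.01, 0.5          # rate plays the role of k/n
    exponent = rate + h(p) + eps - 1       # the union bound is roughly 2^(exponent * n)
    print(exponent)                        # negative, so the bound goes to 0 as n grows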
Exercise 11. Show Pr[E1 ] ≤ exp(−n)
That proves part of what we want to prove today. It says that the capacity of the binary symmetric channel satisfies

Capacity(BSC(p)) ≥ lim_{ε→0} {1 − h(p) − ε} = 1 − h(p)
Now let's see if we can show that the capacity of a general channel is at least lim_{ε→0} { sup_{P_x} I(X; Y) − ε }.
We are going to pick n large enough and k large enough. We fix some distribution P_x. Now we choose the encoding by letting E_n(m)_i ∼ P_x, i.i.d. over all (m, i), m ∈ {0,1}^k and i ∈ [n]. The decoding function is still the same maximum likelihood. Now the question is: what do the analogues of the errors of type 1 and type 2 look like? There is no notion of the number of errors anymore.
So what we're going to do instead is to start talking about typical sequences. Let's recall the asymptotic equipartition principle (AEP):
Lemma 12. If Z_1, ..., Z_n with Z_i ∼ P_z i.i.d., then there exists S ⊆ Ω_z^n such that, for every (r_1, ..., r_n) ∈ S,

1/|S|^{1+ε} ≤ Pr[Z_1 ... Z_n = r_1 ... r_n] ≤ 1/|S|^{1−ε}

and

Pr[(Z_1, ..., Z_n) ∉ S] ≤ ε

Here S = “typical set for P_z”.
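A small simulation (my own illustration) of the AEP for Z_i ∼ Bern(p): the per-symbol log-probability of a sample concentrates around h(p), so the typical set has size roughly 2^{h(p)n}.

    import random
    from math import log2

    p, n = 0.3, 1000
    hp = p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))

    for _ in range(5):
        z = [1 if random.random() < p else 0 for _ in range(n)]
        logprob = sum(log2(p) if zi else log2(1 - p) for zi in z)
        # The per-symbol log-probability concentrates around -h(p), so typical
        # strings each have probability about 2^(-h(p) n).
        print(-logprob / n, "vs h(p) =", hp)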
We are going to look at E(m') for m' ≠ m and analyze the probability that E(m') is typical and that (E(m'), y) is jointly typical. Let's fix m' and ask the question. The event that E(m') is typical happens with very high probability. We are more interested in the probability that (E(m'), y) is jointly typical. This is the crucial question: for m' ≠ m the pair (E(m'), y) is actually distributed according to P_x × P_y, since the codeword E(m') is independent of y.
Here is a lemma, stated without proof, from which the claim follows immediately.

Lemma 13. Let Z^n ∼ P^n and let S_Q be the typical set for another distribution Q on the same universe. Then Pr[Z^n ∈ S_Q] ≈ 2^{−D(Q||P)n}.
This is another fundamental reason to understand the divergence between two distributions, and it can be applied to very simple things. In the current case we are looking at D(P_{xy} || P_x × P_y) = I(X; Y). When you combine these two facts and apply them here, it turns out that the probability that any particular wrong message is jointly typical with y is approximately 2^{−I(X;Y)n}. Taking a union bound over the 2^k messages, as before, this tells us that the achievable rate is at least the mutual information for this input distribution. Next lecture we will try to prove the upper bound (the converse).
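To make the identity D(P_{xy} || P_x × P_y) = I(X; Y) concrete, here is a small numerical check (illustrative; the input distribution and channel are arbitrary choices of mine):

    import numpy as np

    px = np.array([0.3, 0.7])                    # an arbitrary input distribution
    P = np.array([[0.8, 0.2], [0.2, 0.8]])       # BSC(0.2) as the conditional P_{y|x}
    pxy = px[:, None] * P                        # joint distribution P_{xy}
    py = pxy.sum(axis=0)                         # marginal P_y

    # D(P_xy || P_x x P_y), straight from the definition of divergence.
    div = sum(pxy[a, b] * np.log2(pxy[a, b] / (px[a] * py[b]))
              for a in range(2) for b in range(2))

    # I(X;Y) = H(Y) - H(Y|X), computed from entropies.
    def H(q):
        return -sum(qi * np.log2(qi) for qi in q if qi > 0)

    mi = H(py) - sum(px[a] * H(P[a]) for a in range(2))
    print(div, mi)   # the two quantities coincide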