Information Coding
Adrish Banerjee
Department of Electrical Engineering
Indian Institute of Technology Kanpur
Kanpur, Uttar Pradesh
India
Textbooks:
• James L. Massey, Lecture notes on “Applied Digital Information Theory I”. (http://www.isiweb.ee.ethz.ch/archive/massey_scr/)
• Thomas M. Cover, Joy A. Thomas, “Elements of Information Theory”,
2nd Edition, John Wiley & Sons, 2006.
The lecture notes for the first textbook are available at the given link; they are a very nice set of lecture notes on information theory.
References:
• Robert G. Gallager, “Information Theory and Reliable Communication”, John Wiley & Sons, 1968.
• Raymond W. Yeung, “Information Theory and Network Coding”, Springer,
2008.
So let us now talk about the course contents of this particular course. We will first start off with quantifying information: what is information, and how do we quantify it? We will talk about the Hartley measure of information and the Shannon measure of information, and then about entropy, relative entropy, mutual information, and their properties. After that we will talk about some information inequalities, Jensen's inequality, the log-sum inequality, and some other results such as Fano's lemma, which we will use later on to prove further results.
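As a small preview of these quantities, here is a minimal Python sketch computing entropy, relative entropy, and mutual information for discrete distributions; the example distributions are made up for illustration and are not from the lecture.

import numpy as np

def entropy(p):
    """Shannon entropy H(p) in bits of a discrete distribution p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                              # convention: 0 * log 0 = 0
    return -np.sum(p * np.log2(p))

def relative_entropy(p, q):
    """Relative entropy (KL divergence) D(p || q) in bits."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

def mutual_information(pxy):
    """Mutual information I(X;Y) = D(p(x,y) || p(x)p(y)) in bits."""
    pxy = np.asarray(pxy, dtype=float)
    px = pxy.sum(axis=1, keepdims=True)       # marginal of X
    py = pxy.sum(axis=0, keepdims=True)       # marginal of Y
    return relative_entropy(pxy.ravel(), (px * py).ravel())

p = [0.5, 0.25, 0.125, 0.125]                 # example source distribution
print(entropy(p))                             # 1.75 bits
pxy = np.array([[0.4, 0.1],                   # example joint distribution
                [0.1, 0.4]])
print(mutual_information(pxy))                # about 0.28 bits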
Then we will move to the problem of source compression, or source coding. Source coding can be block to variable-length coding, variable to block-length coding, block to block-length coding, or variable to variable-length coding.
That depends on whether the input sequence is of fixed length, which we call a block, or of variable length, and likewise whether the output is of fixed or variable length. We can get compression if we take blocks of data and represent blocks which occur more frequently by a smaller number of bits and blocks which occur less frequently by a larger number of bits. In that way we can reduce the number of bits required to represent the source.
So we will talk about optimal block to variable-length coding, which is Huffman coding; we will prove the conditions for optimality and what the minimum number of bits required to represent the source is. We will also talk about variable to block-length coding, and for a specific instance, where the message can be parsed in a particular fashion, we will present an optimal variable to block-length code for that instance, which is known as Tunstall coding. We will also talk about variable to variable-length coding, which is very prominently used in a lot of source compression algorithms. In particular we will talk about arithmetic coding and the Lempel-Ziv algorithm, which falls under the class of universal source compression algorithms: it does not require any prior knowledge of the distribution of the source you are trying to compress. We will also talk about block to block-length coding; if we do block to block-length coding to get compression, we obviously end up with lossy compression.
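To make the block to variable-length case concrete, here is a minimal Huffman-coding sketch in Python; the symbol probabilities are a made-up example, and the code table is built by repeatedly merging the two least probable nodes.

import heapq

def huffman_code(probs):
    """Build a binary Huffman code: returns a dict symbol -> codeword string."""
    # Each heap entry is (probability, tie-breaking counter, partial code table).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)    # least probable node
        p1, _, code1 = heapq.heappop(heap)    # second least probable node
        merged = {s: "0" + c for s, c in code0.items()}
        merged.update({s: "1" + c for s, c in code1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"black": 0.5, "red": 0.25, "blue": 0.125, "green": 0.125}
code = huffman_code(probs)
print(code)   # {'black': '0', 'red': '10', 'blue': '110', 'green': '111'}
print(sum(probs[s] * len(c) for s, c in code.items()))   # 1.75 bits/symbol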
Now we will show that, given a source distribution, there are some sequences which are more likely to be produced by that source and some sequences which are less likely. So when we do block to block-length coding, we will assign individual codewords to all those sequences which are more likely to occur, which we call typical sequences, and for all the other non-typical sequences we will just assign one particular codeword. So when we transmit a typical sequence, the decoder will be able to decode that sequence, whereas for a non-typical sequence we will not be able to distinguish which sequence was transmitted; hence this is an example of lossy source compression. We will also talk about what is known as the Asymptotic Equipartition Property, which is the information-theoretic analog of the law of large numbers; we will discuss these properties and their consequences.
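A small numerical sketch of the idea behind typicality, using a made-up biased binary source: as the block length grows, the per-symbol log-probability of a randomly drawn sequence concentrates around the entropy, which is what the AEP states.

import numpy as np

rng = np.random.default_rng(0)
p = 0.2                                            # hypothetical P(X = 1)
H = -p * np.log2(p) - (1 - p) * np.log2(1 - p)     # entropy, about 0.722 bits

for n in (10, 100, 1000, 10000):
    x = rng.random(n) < p                          # one random length-n sequence
    k = x.sum()                                    # number of ones
    per_symbol = -(k * np.log2(p) + (n - k) * np.log2(1 - p)) / n   # -(1/n) log2 P(x^n)
    print(n, round(per_symbol, 3), "vs entropy", round(H, 3))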
Then we will move to source compression for sources which have memory. We will exploit the temporal correlation between the bits coming out of the source to design our source encoder. This will be our coding for sources with memory, and we will take a very simple example to illustrate it. After these few lectures on source compression, we will move to channel capacity computation. The channel is the medium over which we are communicating, and how many bits we can transmit over the communication link is basically the channel capacity. We will talk about some very simple channel models and then about how to compute the channel capacity. Then we will move from discrete random variables to continuous random variables and define entropy for a continuous random variable, which is known as differential entropy. We will consider an example of a very commonly used channel, the additive white Gaussian noise channel, and compute its capacity.
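For orientation, here is a small sketch of two standard capacity formulas: the binary symmetric channel, C = 1 - H_b(p), and the real AWGN channel, C = (1/2) log2(1 + SNR) bits per channel use; the crossover probability and SNR values below are made-up examples.

import numpy as np

def binary_entropy(p):
    """Binary entropy function H_b(p) in bits, with H_b(0) = H_b(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

def awgn_capacity(snr):
    """Capacity of a real AWGN channel at linear SNR, in bits per channel use."""
    return 0.5 * np.log2(1.0 + snr)

print(bsc_capacity(0.11))              # about 0.5 bits per channel use
print(awgn_capacity(10 ** (10 / 10)))  # 10 dB SNR: about 1.73 bits per use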
The next topic deals with what is known as rate distortion theory. If you have a real number and you want to represent it exactly, you require an infinite number of bits; but if you represent that real number with a finite number of bits, then essentially you are introducing some distortion. So how do you determine whether your representation using a fixed number of bits is a good representation of the real number? You need a goodness measure between the original real number and its representation; that is what is known as a distortion measure. Rate distortion theory deals with the following: given a source and a distortion measure, you are interested in knowing, for example, what is the minimum average distortion that can be achieved for a given rate.
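As one concrete instance, here is a sketch of the rate-distortion function of a Bernoulli(p) source under Hamming distortion, R(D) = H_b(p) - H_b(D) for 0 <= D <= min(p, 1 - p); the source parameter and distortion levels below are made-up values.

import numpy as np

def binary_entropy(p):
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def rate_distortion_bernoulli(p, D):
    """R(D) for a Bernoulli(p) source with Hamming distortion."""
    if D >= min(p, 1 - p):
        return 0.0       # allowed distortion so large that no bits are needed
    return binary_entropy(p) - binary_entropy(D)

p = 0.5                  # hypothetical fair binary source
for D in (0.0, 0.05, 0.1, 0.25, 0.5):
    print(D, round(rate_distortion_bernoulli(p, D), 3))
# D = 0 requires 1 bit/symbol; allowing more distortion requires fewer bits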
Alternatively, given a maximum allowed average distortion, you want to find out the minimum rate that you can achieve. And finally, if time permits, we are going to talk about network information theory. Earlier, when I talked about channel capacity, I was talking about a point-to-point channel: there is one transmitter and one receiver, and we characterize the capacity of such channels. Now think of scenarios where you have multiple senders and multiple receivers; how do you find the capacity of such channels? Or say you have multiple senders and multiple receivers and you want to do distributed source compression. All these problems come under network information theory. So this is roughly the syllabus that we plan to cover in these 20 hours of lectures, spread over 8 weeks.
• Variable to variable length coding: Arithmetic codes, Lempel-Ziv codes
• Block to block length coding: Typical sequences
• Asymptotic Equipartition Property
• Coding for sources with memory
• Channel capacity
• Differential Entropy
• Gaussian Channel
• Rate Distortion Theory
• Network Information Theory
We know that all communication systems include these basic steps.
For the first step of a digital communication system, we will take an example to illustrate how we do the encoding of information. A bag contains 50% black balls, 25% red balls, 12.5% blue balls, and 12.5% green balls. You randomly pick a ball from the bag and want to convey the information about the color of the ball.
Simple encoding (Dumb way!): black=00, red=01, blue=10, green=11. An average of 2.0 bits/color.
Smart way? We use the statistical structure of a source to represent its output efficiently: black=0, red=10, blue=110, green=111. An average of 1.75 bits/color. If we are interested in source compression, we will try to represent symbols which occur more frequently with fewer bits and symbols which occur less frequently with more bits.
Can you figure out the color of the balls from the sequence 0110100111? Black, blue, red, black, green. As it is a prefix-free code, we are able to make out when a particular codeword ends.
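Here is a small sketch that decodes the bit string above with this prefix-free code and checks the 1.75 bits/color average; the helper names are just for illustration.

code = {"black": "0", "red": "10", "blue": "110", "green": "111"}
probs = {"black": 0.5, "red": 0.25, "blue": 0.125, "green": 0.125}

def decode(bits, code):
    """Decode a prefix-free code by reading one codeword at a time."""
    inverse = {cw: sym for sym, cw in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:             # a complete codeword has been read
            out.append(inverse[buf])
            buf = ""
    return out

print(decode("0110100111", code))
# ['black', 'blue', 'red', 'black', 'green']
print(sum(probs[s] * len(cw) for s, cw in code.items()))   # 1.75 bits/color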
Main principle of data compression: “Only infrmatn esentil to understnd mst
b tranmitd.”
Let us look at the block diagram of a communication system.
[Block diagram: Transmitter → Channel → Receiver. The visible receiver chain consists of digital demodulation, channel decoder, decryption, and source decoder, delivering the decoded output û to the sink from the received word r.]
• Channel: The physical transmission medium; it can be wireless or wireline. It corrupts the transmitted waveforms due to various effects such as noise, interference, fading, and multipath propagation. Examples: binary erasure channel (BEC), additive white Gaussian noise (AWGN) channel. A small BEC simulation sketch follows after this list.
• Decryption: To recover the plaintext from the ciphertext with the help of a key. It is in the key that the security of a modern cipher lies, not in the details of the cipher.
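As mentioned in the channel bullet above, here is a minimal simulation sketch of a binary erasure channel; the erasure probability is a made-up value. Each transmitted bit is independently erased with probability epsilon, and the BEC capacity is 1 - epsilon bits per channel use.

import numpy as np

rng = np.random.default_rng(0)

def binary_erasure_channel(bits, epsilon):
    """Pass bits through a BEC: each bit is erased (None) with probability epsilon."""
    return [None if rng.random() < epsilon else b for b in bits]

epsilon = 0.3                              # hypothetical erasure probability
x = rng.integers(0, 2, size=20).tolist()   # 20 random input bits
y = binary_erasure_channel(x, epsilon)
print(x)
print(y)                                   # erased positions appear as None
print("capacity:", 1 - epsilon, "bits per channel use")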