Channel Coding Theorem
Faculty of Engineering
Spring 2016
MP 219 Mathematics 9
(Probability and Random Processes)
Dr. Sherif Rabia
Eng. Sara Kamel
Team Members
----------------------
Seat Number    Name
72
73
149
178
214
236
Index
----------
Entropy definition
Source coding
Mutual information
Channel capacity
Channel coding theorem
Matlab
Sources
Introduction
Overview
Entropy definition
------------------------
Shannon Information Content
The Shannon Information Content of an outcome $x$ with probability $p(x)$ is
$I(x) = \log_2 \frac{1}{p(x)}$
Entropy
Definition: The entropy is a measure of the average uncertainty in the random variable.
It is the average number of bits of Shannon Information Content required to describe the
random variable.
The entropy $H(X)$ of a discrete random variable $X$ is defined by
$H(X) = \sum_x p(x) \log_2 \frac{1}{p(x)}$
For example, for a binary random variable $X$ with

X    P_X(x)
0    1-p
1    p

the entropy is $H(X) = p \log_2 \frac{1}{p} + (1-p) \log_2 \frac{1}{1-p} \triangleq H(p)$.
$H(X)$ depends only on the probability mass function $p_X$, not on the random variable $X$ itself, so we can write $H(p)$.
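To make the definition concrete, here is a minimal MATLAB sketch (the probability values are assumed for illustration, not taken from the report) that evaluates the entropy formula above, including the binary case H(p):

% Minimal sketch (assumed probabilities): entropy of a discrete random variable
px = [0.5 0.25 0.25];                 % assumed pmf of X
H  = sum(px .* log2(1 ./ px));        % H(X) = 1.5 bits

% Binary entropy H(p) for the 0/1 variable in the table above
p  = 0.3;                             % assumed value of p
Hp = p * log2(1/p) + (1 - p) * log2(1/(1 - p));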
Joint entropy
We now extend the definition to a pair of random variables X & Y.
Definition: The joint entropy H(X, Y) of a pair of discrete random variables (X, Y) with a
joint distribution p(x, y) is defined as
$H(X,Y) = \sum_{x,y} p(x,y) \log_2 \frac{1}{p(x,y)}$
Conditional entropy
Similarly, the conditional entropy $H(Y|X)$ is defined as
$H(Y|X) = \sum_{x,y} p(x,y) \log_2 \frac{1}{p(y|x)}$
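A minimal MATLAB sketch of these two definitions, using an assumed joint pmf; it also uses the chain rule H(X,Y) = H(X) + H(Y|X), which follows directly from the definitions above:

% Minimal sketch (assumed joint pmf): joint and conditional entropy
Pxy = [1/4 1/4;                       % rows: values of X, columns: values of Y
       1/2 0  ];
Hxy = 0;                              % joint entropy H(X,Y)
for i = 1:2
    for j = 1:2
        if Pxy(i, j) > 0
            Hxy = Hxy + Pxy(i, j) * log2(1 / Pxy(i, j));
        end
    end
end
Px   = sum(Pxy, 2);                   % marginal pmf of X
Hx   = sum(Px .* log2(1 ./ Px));      % H(X)
HyGx = Hxy - Hx;                      % chain rule: H(Y|X) = H(X,Y) - H(X)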
Source coding
----------------------
Introduction
The information coming from the source can be characters, if the information
source is a text, or pixels, if the information source is an image. So if I
want to transmit pixels or characters, how could I do that? Well, this is done
using source coding.
Source coding is a mapping from (a sequence of) symbols from an
information source to a sequence of bits.
This is the concept behind data compression.
Source coding tries to achieve a minimal code length and helps to get rid of
undesired or unimportant extra information.
The channel that will receive the code may not have the capacity to
communicate at the source information rate. So, we use source coding to
represent the source at a lower rate, with some loss of information.
Code length
There are two types of codes: fixed-length codes and variable-length codes.
1. A fixed-length code, as its name clarifies, assigns the same number of bits
to every symbol.
2. A variable-length code assigns a different number of bits to each symbol,
depending on the probability of that symbol.
Variable-length code is the better solution as it allows the minimal code
length. The ideal code length of a symbol is
$l = \log_r \frac{1}{p}$
where $r$ is the base of the code (2 in the case of a binary code) and $p$ is the
probability of the symbol.
Let's take an example to make things clearer.
There's a source that generates three symbols: S1, S2 and S3. The
probabilities of S1, S2 and S3 are 0.3, 0.5 and 0.2 respectively.
By applying the formula, we'll obtain the following results:
$I_1 = 1.7$, $I_2 = 1$ and $I_3 = 2.3$ bits.
But, as you may notice, these results are theoretical as there's no 1.7 bit. So
to make them practical we shall round them up. Thus, the results will be as
follows:
$I_1 = 2$, $I_2 = 1$ and $I_3 = 3$ bits.
As we've mentioned before, notice that the symbol with the largest
probability (S2) has the shortest length (only one bit).
So if the source generates the following sequence of symbols:
S1 S2 S1 S3 S2 S2 S1 S2 S3 S2
The source coding will generate 17 bits (2 bits + 1 bit + 2 bits + 3 bits + 1 bit
+ 1 bit + 2 bits + 1 bit + 3 bits + 1 bit).
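This count can be reproduced with a short MATLAB sketch; the symbol indices and the ceil rounding below are just one way of expressing the example above:

% Minimal sketch of the worked example above
p   = [0.3 0.5 0.2];                  % probabilities of S1, S2, S3
len = ceil(log2(1 ./ p));             % practical code lengths: [2 1 3] bits
seq = [1 2 1 3 2 2 1 2 3 2];          % S1 S2 S1 S3 S2 S2 S1 S2 S3 S2
totalBits = sum(len(seq));            % = 17 bits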
Mutual information
----------------------
Definition
Mutual information is one of many quantities that measures how much one
random variable tells us about another. It can be thought of as the reduction
in uncertainty about one random variable given knowledge of another.
Intuitively, mutual information measures the information that X and Y share:
it measures how much knowing one of these variables reduces uncertainty
about the other. For example, if X and Y are independent, then
knowing X does not give any information about Y and vice versa, so their
mutual information is zero. At the other extreme, if X is a deterministic
function of Y and Y is a deterministic function of X then all information
conveyed by X is shared with Y: knowing X determines the value of Y and vice
versa. High mutual information indicates a large reduction in uncertainty; low
mutual information indicates a small reduction; and zero mutual information
between two random variables means the variables are independent. An
important theorem from information theory says that the mutual information
between two variables is 0 if and only if the two variables are statistically
independent.
For example, suppose X represents the roll of a fair 6-sided die, and Y
represents whether the roll is even (0 if even, 1 if odd). Clearly, the value of Y
tells us something about the value of X and vice versa. That is, these
variables share mutual information.
On the other hand, if X represents the roll of one fair die, and Z represents
the roll of another fair die, then X and Z share no mutual information. The roll
of one die does not contain any information about the outcome of the other
die.
Mathematical representation
For two discrete random variables $X$ and $Y$ with joint distribution $P_{XY}(x,y)$,
$I(X;Y) = \sum_{x,y} P_{XY}(x,y) \log_2 \frac{P_{XY}(x,y)}{P_X(x) P_Y(y)}$
where the marginal distributions are
$P_X(x) = \sum_y P_{XY}(x,y)$
and
$P_Y(y) = \sum_x P_{XY}(x,y)$
To understand what $I(X;Y)$ actually means, let's modify the equation first:
$I(X;Y) = H(X) - H(X|Y)$, where
$H(X) = -\sum_x P_X(x) \log_2 P_X(x)$
and
$H(X|Y) = -\sum_y P_Y(y) \sum_x P_{X|Y}(x|y) \log_2 P_{X|Y}(x|y)$
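As a sanity check, here is a minimal MATLAB sketch (the setup is assumed, following the die example from the Definition section: X is a fair die roll and Y indicates whether the roll is odd) that evaluates I(X;Y) directly from the formula:

% Minimal sketch: mutual information for the die example (X = fair die roll,
% Y = 0 if the roll is even, 1 if odd)
Pxy = zeros(6, 2);                    % joint pmf, rows = x, columns = y (1: Y=0, 2: Y=1)
for x = 1:6
    Pxy(x, mod(x, 2) + 1) = 1/6;      % each roll has probability 1/6 and fixes Y
end
Px = sum(Pxy, 2);                     % marginal P_X(x)
Py = sum(Pxy, 1);                     % marginal P_Y(y)
I  = 0;
for x = 1:6
    for y = 1:2
        if Pxy(x, y) > 0
            I = I + Pxy(x, y) * log2(Pxy(x, y) / (Px(x) * Py(y)));
        end
    end
end
I                                     % = 1 bit: knowing Y removes one bit of uncertainty about X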
Channel capacity
----------------------
The channel capacity of a power-limited, band-limited channel is given by
$C = B \log_2 \left(1 + \frac{S}{N}\right)$ bits/s
where
C: channel capacity
B: channel bandwidth
S: signal power
N: noise power
S/N: signal-to-noise ratio
Hence for a given average transmitted power [S] and channel bandwidth [B]
we can transmit information at a rate of [C bits/s] without any error.
It's not possible to transmit information at any rate higher than [C bits/s]
without having a definite probability of error. Hence the channel capacity
theorem defines the fundamental limit on the rate of error-free transmission
for a power-limited, band-limited channel.
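A minimal MATLAB sketch of the formula; the bandwidth and signal-to-noise ratio below are assumed example values, roughly those of a telephone-line channel:

% Minimal sketch (assumed numbers): Shannon capacity C = B*log2(1 + S/N)
B   = 3000;                           % assumed channel bandwidth in Hz
SNR = 1000;                           % assumed signal-to-noise ratio S/N (linear scale, i.e. 30 dB)
C   = B * log2(1 + SNR);              % about 29.9 kbit/s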
Channel coding theorem
----------------------
A simple example of channel coding is a repetition code: we take a block of data
bits (representing the sound) and send it three times. At the receiver we will examine the three
repetitions bit by bit and take a majority vote. The twist on this is that we
don't merely send the bits in order. We interleave them. The block of data bits
is first divided into 4 smaller blocks. Then we cycle through the block and
send one bit from the first, then the second, etc. This is done three times to
spread the data out over the surface of the disk. In the context of the simple
repeat code, this may not appear effective. However, there are more powerful
codes known which are very effective at correcting the "burst" error of a
scratch or a dust spot when this interleaving technique is used.
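The majority-vote idea can be sketched in a few lines of MATLAB; this is only an illustration of the simple repeat code with assumed data bits, not of the more powerful codes used on actual discs:

% Minimal sketch: 3x repetition code with bitwise majority vote
data = [1 0 1 1 0 1 0 0];             % a block of data bits (assumed)
tx   = repmat(data, 3, 1);            % send the block three times
rx   = tx;
rx(2, 4) = ~rx(2, 4);                 % one repetition suffers a bit error
decoded = sum(rx, 1) >= 2;            % majority vote on each bit position
isequal(decoded, logical(data))       % true: the single error is corrected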
A number of algorithms are used for channel coding; we will discuss some of
them, which are linear. First, let's explain some definitions.
Block codes: In coding theory, a block code is any member of the large and
important family of error-correcting codes that encode data in blocks. There is
a vast number of examples for block codes, many of which have a wide range
of practical applications. Block codes are conceptually useful because they
allow coding theorists, mathematicians, and computer scientists to study the
limitations of all block codes in a unified way. Such limitations often take the
form of bounds that relate different parameters of the block code to each
other, such as its rate and its ability to detect and correct errors.
Cyclic codes
"If 00010111 is a valid code word, applying a right circular shift gives the
string 10001011. If the code is cyclic, then 10001011 is again a valid code
word. In general, applying a right circular shift moves the least significant bit
(LSB) to the leftmost position, so that it becomes the most significant bit
(MSB); the other positions are shifted by 1 to the right"
General definition:
Let C be a linear code over a finite field GF(q) of block length n. C is called
a cyclic code if, for every code word c = (c1, ..., cn) from C, the word
(cn, c1, ..., cn-1) in GF(q)^n is again a code word. Because one cyclic right
shift is equal to n - 1 cyclic left shifts, a cyclic code may also be defined via
cyclic left shifts. Therefore the linear code C is cyclic precisely when it is
invariant under all cyclic shifts.
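A minimal MATLAB sketch that reproduces the shift in the quoted example using the built-in circshift function:

% Minimal sketch: right circular shift of the code word 00010111
c       = [0 0 0 1 0 1 1 1];
c_shift = circshift(c, 1);            % -> [1 0 0 0 1 0 1 1], i.e. 10001011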
Parity
Definition:
A parity bit, or check bit, is a bit added to the end of a string of binary code
that indicates whether the number of bits in the string with the value one is
even or odd. Parity bits are used as the simplest form of error-detecting code.
Parity types:
In the case of even parity, for a given set of bits, the occurrence of bits whose
value is 1 is counted. If that count is odd, the parity bit value is set to 1,
making the total count of occurrences of 1's in the whole set (including the
parity bit) an even number. If the count of 1's in a given set of bits is already
even, the parity bit's value remains 0.
In the case of odd parity, the situation is reversed. For a given set of bits, if
the count of bits with a value of 1 is even, the parity bit value is set to 1,
making the total count of 1's in the whole set (including the parity bit) an odd
number. If the count of bits with a value of 1 is odd, the count is already odd,
so the parity bit's value remains 0.
If the parity bit is present but not used, it may be referred to as mark
parity (when the parity bit is always 1) or space parity (the bit is always 0).
Parity in Mathematics:
In mathematics, parity refers to the evenness or oddness of an integer, which
for a binary number is determined only by the least significant bit. In
telecommunications and computing, parity refers to the evenness or oddness
of the number of bits with value one within a given set of bits, and is thus
determined by the value of all the bits. It can be calculated via an XOR sum of
the bits, yielding 0 for even parity and 1 for odd parity. This property of being
dependent upon all the bits, and of changing value if any one bit changes,
allows for its use in error detection schemes.
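A minimal MATLAB sketch of even parity computed as a mod-2 (XOR) sum, together with the detection check described in the next section; the bit values are assumed for illustration:

% Minimal sketch: even parity as a mod-2 (XOR) sum, plus error detection
bits   = [1 0 1 1 0 1 0];             % assumed 7 data bits (e.g. an ASCII character)
parity = mod(sum(bits), 2);           % 0 here, since the count of 1's is already even
frame  = [bits parity];               % 8-bit frame with even parity

rxFrame = frame;
rxFrame(3) = ~rxFrame(3);             % a single bit is flipped in transit
errorDetected = mod(sum(rxFrame), 2) ~= 0;   % true: the overall parity is now odd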
Error detection:
If an odd number of bits (including the parity bit) are transmitted incorrectly,
the parity bit will be incorrect, thus indicating that a parity error occurred in
the transmission. The parity bit is only suitable for detecting errors; it cannot
correct any errors, as there is no way to determine which particular bit is
corrupted. The data must be discarded entirely, and re-transmitted from
scratch. On a noisy transmission medium, successful transmission can
therefore take a long time, or even never occur. However, parity has the
advantage that it uses only a single bit and requires only a number of XOR
gates to generate. Hamming code is an example of an error-correcting code.
Parity bit checking is used occasionally for transmitting ASCII characters,
which have 7 bits, leaving the 8th bit as a parity bit.
Hamming code
The Hamming(7,4) code is a linear error-correcting code that encodes four bits of
data into seven bits by adding three parity bits. It is a member of a larger family
of Hamming codes.
They can detect up to two-bit errors or correct one-bit errors without detection of
uncorrected errors. By contrast, the simple parity code cannot correct errors, and
can detect only an odd number of bits in error. Hamming codes are perfect
codes, that is, they achieve the highest possible rate for codes with their block
length and minimum distance of three.
This table describes which parity bits cover which transmitted bits in the
encoded word. For example, p2 provides an even parity for bits 2, 3, 6, and
7. It also details which transmitted bits are covered by each parity bit, by
reading down the columns. For example, d1 is covered by p1 and p2 but not p3.
This table will have a striking resemblance to the parity-check matrix (H) in
the next section.
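To make the covering pattern concrete, here is a minimal MATLAB sketch of Hamming(7,4) encoding and single-error correction. It assumes the standard bit ordering [p1 p2 d1 p3 d2 d3 d4] described above; the data bits are arbitrary example values:

% Minimal sketch: Hamming(7,4) encoding and single-error correction
d = [1 0 1 1];                        % data bits d1..d4 (assumed example values)

% Each parity bit is the mod-2 sum of the data bits it covers
p1 = mod(d(1) + d(2) + d(4), 2);      % p1 covers positions 1,3,5,7
p2 = mod(d(1) + d(3) + d(4), 2);      % p2 covers positions 2,3,6,7
p3 = mod(d(2) + d(3) + d(4), 2);      % p3 covers positions 4,5,6,7
c  = [p1 p2 d(1) p3 d(2) d(3) d(4)];  % 7-bit code word

% Flip one bit to simulate a channel error
r = c;
r(5) = mod(r(5) + 1, 2);

% Syndrome: recompute each parity over the received word
s1 = mod(r(1) + r(3) + r(5) + r(7), 2);
s2 = mod(r(2) + r(3) + r(6) + r(7), 2);
s3 = mod(r(4) + r(5) + r(6) + r(7), 2);
errPos = s1 + 2*s2 + 4*s3;            % binary position of the error (0 means no error)
if errPos > 0
    r(errPos) = mod(r(errPos) + 1, 2);    % correct the flipped bit
end
isequal(r, c)                         % true: the code word is recovered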
Matlab implementation
------------
Hamming Code
Simulation
Huffman Code
Simulation
Sources
-----------
http://coltech.vnu.edu.vn/~thainp/books/Wiley_-_2006__Elements_of_Information_Theory_2nd_Ed.pdf
http://mailhes.perso.enseeiht.fr/documents/SourceCoding_Mailhes.pdf
https://en.wikipedia.org/wiki/Shannon%27s_source_coding_theorem
https://en.wikipedia.org/wiki/Coding_theory#Source_coding
http://www.scholarpedia.org/article/Mutual_information
http://www.ee.ic.ac.uk/hp/staff/dmb/courses/infotheory/info_1.pdf