CE Notes
The function of any communication system is to convey information from a source to a destination.
Discrete message
During one time interval, one message from the set is transmitted; during the next time interval, the next message from the set is transmitted.
Memory source
A source with memory is one for which each emitted symbol depends on the previous symbols.
Memoryless source
A source is memoryless in the sense that the symbol it emits at any time is independent of previous choices.
The source output is modeled as a discrete random variable that takes on symbols from a fixed finite alphabet φ = {s0, s1, · · · , sK-1} with probabilities P(sk) = pk, k = 0, 1, · · · , K-1. We assume that the symbols emitted by the source during successive signaling intervals are statistically independent. A source having the properties described is called a discrete memoryless source, memoryless in the sense that the symbol emitted at any time is independent of previous choices.
Source alphabet
The set of symbols (letters) from which the source output is drawn is called the source alphabet.
Uncertainty
The amount of information contained in each symbol is closely related to its uncertainty, or surprise. The information gained by observing the symbol sk, which occurs with probability pk, is
I(sk) = log2(1/pk) = -log2(pk)
Here we generally use log2, since in digital communications we will be talking about bits. This expression also tells us that the more uncertain a symbol is (the lower its probability of occurrence), the more information it conveys.
Units of information: if the base of the logarithm is 2, the unit is the bit (binit); if the base is e, the unit is the natural unit (nat).
The amount of information I(sk) produced by the source during an arbitrary signaling interval depends on the symbol sk emitted by the source at that time. Indeed, I(sk) is a discrete random variable that takes on the values I(s0), I(s1), · · · , I(sK-1) with probabilities p0, p1, · · · , pK-1 respectively. Its mean over the source alphabet φ,
H(φ) = Σk pk I(sk) = Σk pk log2(1/pk),
is called the entropy of the discrete memoryless source; it measures the average information content per source symbol.
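As a quick illustration of these definitions, the following is a minimal sketch (in Python) that computes the self-information of each symbol and the source entropy; the probability values are arbitrary, chosen only for the example.

```python
import math

def self_information(p):
    """Self-information I(s) = log2(1/p) of a symbol of probability p, in bits."""
    return math.log2(1.0 / p)

def entropy(probs):
    """Entropy H = sum of p*log2(1/p) over the alphabet, in bits/symbol."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Assumed example distribution: rarer symbols carry more information.
probs = [0.5, 0.25, 0.125, 0.125]
for k, p in enumerate(probs):
    print(f"I(s{k}) = {self_information(p):.3f} bits  (p = {p})")
print(f"H = {entropy(probs):.3f} bits/symbol")
```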
Problem
1. A DMS has four symbols S1, S2, S3, S4 with probabilities 0.40, 0.30, 0.20, 0.10.
a. Calculate H(φ).
b. Find the amount of information contained in the messages S1S2S3S4 and S4S3S3S2, and compare with H(φ).
Solution
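Following directly from the definitions above, the numbers work out as follows:
a. H(φ) = 0.40 log2(1/0.40) + 0.30 log2(1/0.30) + 0.20 log2(1/0.20) + 0.10 log2(1/0.10) ≈ 0.529 + 0.521 + 0.464 + 0.332 ≈ 1.85 bits/symbol.
b. Since successive symbols are statistically independent, the information contained in a message is the sum of the informations of its symbols:
I(S1S2S3S4) = 1.32 + 1.74 + 2.32 + 3.32 ≈ 8.70 bits
I(S4S3S3S2) = 3.32 + 2.32 + 2.32 + 1.74 ≈ 9.70 bits
Both exceed 4·H(φ) ≈ 7.39 bits; H(φ) is the average information per symbol over a long run of source output, and an individual short message can carry more (or less) than this average.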
Properties of entropy
The entropy H(φ) of a discrete memoryless source with an alphabet of K symbols is bounded as
0 ≤ H(φ) ≤ log2(K)
1. H(φ) = 0, if and only if the probability pk = 1 for some k, and the remaining
probabilities in the set are all zero; this lower bound on entropy corresponds to no uncertainty.
2. H(φ) = log2(K), if and only if pk = 1/ K for all k; this upper bound on entropy
corresponds to maximum uncertainty.
Proof:
H(φ) ≥ 0:
Since each probability pk is less than or equal to unity, each term pk log2(1/pk) is nonnegative, and so H(φ) ≥ 0. The term pk log2(1/pk) is zero if, and only if, pk = 0 or pk = 1; that is, H(φ) = 0 if, and only if, pk = 1 for some k and all the remaining probabilities are zero.
H(φ) ≤ log2(K):
To prove this upper bound, we make use of a property of the natural logarithm:
ln x ≤ x - 1, for x ≥ 0, with equality if and only if x = 1.
To proceed with the proof, consider any two probability distributions {p0, p1, · · · , pK-1} and {q0, q1, · · · , qK-1} on the alphabet φ = {s0, s1, · · · , sK-1} of a discrete memoryless source. Then, changing to the natural logarithm, we may write
Σk pk log2(qk/pk) = (1/ln 2) Σk pk ln(qk/pk).
Hence, using the inequality ln x ≤ x - 1, we get
Σk pk log2(qk/pk) ≤ (1/ln 2) Σk pk (qk/pk - 1) = (1/ln 2) (Σk qk - Σk pk) = 0,
where the equality holds only if pk = qk for all k. Suppose we next put qk = 1/K for all k. Then
Σk pk log2(1/(K pk)) ≤ 0, that is, H(φ) - log2(K) ≤ 0.
Thus H(φ) is always less than or equal to log2(K), and the equality holds only if the symbols are equiprobable.
Consider a discrete memoryless binary source defined on the alphabet φ = {0, 1}. Let the probabilities of symbols 0 and 1 be p0 and 1 - p0, respectively.
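For this binary source the entropy reduces to the binary entropy function
H(φ) = -p0 log2(p0) - (1 - p0) log2(1 - p0),
which equals zero when p0 = 0 or p0 = 1 (no uncertainty) and attains its maximum value of 1 bit/symbol when p0 = 1/2 (maximum uncertainty), in agreement with the bounds established above.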
Information rate
If the source generates messages at the rate of r messages per second, then the information rate is defined to be
R = r H(φ) bits per second,
where H(φ) is the average information (entropy) per message.
Example problem
An analog signal is band-limited to B Hz, sampled at the Nyquist rate, and the samples are quantized into four levels. The quantization levels Q1, Q2, Q3, Q4 (messages) are assumed independent and occur with probabilities P1 = P4 = 1/8 and P2 = P3 = 3/8. Find the information rate of the source.
Solution
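Working through this example with the definitions above: sampling at the Nyquist rate gives r = 2B messages per second, and the entropy of the quantizer output is
H = 2 · (1/8) log2(8) + 2 · (3/8) log2(8/3) ≈ 0.75 + 1.06 ≈ 1.8 bits/message,
so the information rate is
R = r H ≈ 2B · 1.8 ≈ 3.6B bits per second.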
Source coding
An efficient representation of the data generated by a discrete source is accomplished by source encoding. The device that performs this representation is called a source encoder.
If some source symbols are known to be more probable than others, an efficient source code is generated by assigning short code words to frequent source symbols and long code words to rare source symbols.
Example: the Morse code, in which the letters of the alphabet are encoded into streams of marks and spaces, denoted as dots "." and dashes "-".
Our primary interest is in the development of an efficient source encoder that satisfies two functional requirements:
1. The code words produced by the encoder are in binary form.
2. The source code is uniquely decodable, so that the original source sequence can be reconstructed perfectly from the encoded binary sequence.
Let the binary code word assigned to symbol sk have length lk, measured in bits. The average code-word length of the source encoder is
L = Σk pk lk (sum over k = 0, 1, · · · , K-1)
In physical terms, the parameter L represents the average number of bits per source symbol used in the source encoding process. Let Lmin denote the minimum possible value of L. We then define the coding efficiency of the source encoder as
η = Lmin / L
According to the source-coding theorem, the entropy H(φ) is the fundamental limit on L, that is,
L ≥ H(φ)
so that Lmin = H(φ) and the efficiency may also be written as η = H(φ) / L.
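A minimal sketch of how these quantities can be computed; the symbol probabilities and code-word lengths below are illustrative assumptions, not taken from a particular example in these notes.

```python
import math

def average_length(probs, lengths):
    """Average code-word length L = sum of p_k * l_k, in bits/symbol."""
    return sum(p * l for p, l in zip(probs, lengths))

def entropy(probs):
    """Source entropy H = sum of p_k * log2(1/p_k), in bits/symbol."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Assumed example: symbol probabilities and the lengths of a binary code for them.
probs = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]

L = average_length(probs, lengths)
H = entropy(probs)
print(f"L = {L:.3f} bits/symbol, H = {H:.3f} bits/symbol")
print(f"efficiency = {H / L:.3f}")  # eta = H/L, since Lmin = H(phi) by the source-coding theorem
```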
Data Compaction:
Data compaction (lossless data compression) removes redundant information from the source output so that the data can be represented with fewer bits and still be reconstructed without loss. A source code which represents the output of a discrete memoryless source should therefore be uniquely decodable.
Prefix Coding
For each finite sequence of symbols emitted by the source, the corresponding sequence of code words is different from the sequence of code words corresponding to any other source sequence. For a source symbol sk, let the code word assigned to it be denoted by (mk0, mk1, · · · , mk,n-1), where the individual elements are 0s and 1s and n is the code-word length.
Prefix condition
The initial part of the code word is represented by the elements mk0, mk1, · · · , mki for some i ≤ n - 1. Any sequence made up of the initial part of the code word is called a prefix of the code word.
Prefix code
1. A prefix code is a variable-length source coding scheme in which no code word is the prefix of any other code word.
2. Every prefix code is uniquely decodable.
3. But the converse is not true, i.e., all uniquely decodable codes need not be prefix codes.
From (1) we see that Code I is not a prefix code, Code II is a prefix code, and Code III is also uniquely decodable but not a prefix code. Prefix codes also satisfy the Kraft-McMillan inequality, which is given by
Σk 2^(-lk) ≤ 1,
where the sum runs over all K symbols and lk is the length of the code word assigned to symbol sk. Both Codes II and III satisfy the Kraft-McMillan inequality, but only Code II is a prefix code.
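The sketch below checks both conditions for a set of code words; the two example codes are assumptions for illustration (not necessarily the Codes I-III referred to above).

```python
def kraft_sum(codewords):
    """Kraft-McMillan sum of 2^(-l_k) over all code words; it must be <= 1 for a uniquely decodable code."""
    return sum(2 ** -len(c) for c in codewords)

def is_prefix_free(codewords):
    """True if no code word is a prefix of another code word."""
    return not any(a != b and b.startswith(a) for a in codewords for b in codewords)

# Assumed example codes for illustration.
prefix_code = ["0", "10", "110", "111"]   # prefix-free, Kraft sum = 1.0
non_prefix  = ["0", "01", "011", "0111"]  # satisfies the Kraft inequality, but "0" prefixes the others

for code in (prefix_code, non_prefix):
    print(code, "Kraft sum =", kraft_sum(code), "prefix-free:", is_prefix_free(code))
```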
Decoding procedure
1. The source decoder simply starts at the beginning of the sequence and decodes one code word at a time.
2. The decoder always starts at the initial state of the decision tree.
3. The received bit moves the decoder to a terminal state of the tree if it is 0, or else to the next decision point if it is 1; on reaching a terminal state, the decoder emits the corresponding symbol and returns to the initial state (a decoder sketch follows below).
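A minimal decoder sketch following this procedure, using a lookup of the accumulated bits in place of an explicit decision tree; the code and the bit stream are assumed for illustration.

```python
def decode_prefix_code(bits, code_map):
    """Decode a bit string one code word at a time; valid because no code word
    is a prefix of another, so the first match is always the right one."""
    inverse = {codeword: symbol for symbol, codeword in code_map.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:        # reached a terminal state of the decision tree
            symbols.append(inverse[current])
            current = ""              # return to the initial state
    return symbols

# Assumed example code (illustrative only).
code_map = {"s0": "0", "s1": "10", "s2": "110", "s3": "111"}
print(decode_prefix_code("0101100111", code_map))  # ['s0', 's1', 's2', 's0', 's3']
```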
Given a discrete memoryless source of entropy H(φ), a prefix code can be constructed with an average code-word length L, which is bounded as follows:
H(φ) ≤ L < H(φ) + 1
On the left-hand side of this bound, the equality is satisfied provided that every symbol sk is emitted with probability
pk = 2^(-lk),
where lk is the length of the code word assigned to sk.
Shannon-Fano coding
Procedure
The table below shows the construction of a Shannon-Fano code for a six-symbol DMS; at each step the symbols are partitioned into two groups of nearly equal total probability, with a 0 assigned to one group and a 1 to the other.
Sk    pk     Step 1   Step 2   Step 3   Step 4   Code
S1    0.30   0        0                          00
S2    0.25   0        1                          01
S3    0.20   1        0                          10
S4    0.12   1        1        0                 110
S5    0.08   1        1        1        0        1110
S6    0.05   1        1        1        1        1111
2. A DMS has four symbols S1, S2, S3, S4 with corresponding probabilities 1/2, 1/4, 1/8, 1/8. Construct a Shannon-Fano code for S.
Sk    pk     Step 1   Step 2   Step 3   Code
S1    1/2    0                          0
S2    1/4    1        0                 10
S3    1/8    1        1        0        110
S4    1/8    1        1        1        111
3. A DMS has five equally likely symbols S1, S2, S3, S4, S5. Construct a Shannon-Fano code for S.
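A sketch of the Shannon-Fano procedure that can be applied to problem 3. The splitting rule used here chooses, at each stage, the partition whose two group probabilities are as nearly equal as possible; ties can be broken differently, so other equally valid codes exist.

```python
def shannon_fano(symbols):
    """symbols: list of (name, probability) pairs. Returns {name: codeword}."""
    symbols = sorted(symbols, key=lambda sp: sp[1], reverse=True)
    codes = {name: "" for name, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        running, cut, best = 0.0, 1, float("inf")
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(2 * running - total)  # |upper-group sum - lower-group sum|
            if diff < best:
                best, cut = diff, i
        upper, lower = group[:cut], group[cut:]
        for name, _ in upper:
            codes[name] += "0"
        for name, _ in lower:
            codes[name] += "1"
        split(upper)
        split(lower)

    split(symbols)
    return codes

# Problem 3: five equally likely symbols.
print(shannon_fano([("S1", 0.2), ("S2", 0.2), ("S3", 0.2), ("S4", 0.2), ("S5", 0.2)]))
```

For the five equally likely symbols this yields a code such as 00, 01, 10, 110, 111, with an average code-word length of 2.4 bits/symbol against an entropy of log2(5) ≈ 2.32 bits/symbol.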
Huffman coding
Algorithm
1. The source symbols are listed in order of decreasing probability. The two source symbols of lowest probability are assigned a 0 and a 1. This part of the step is referred to as a splitting stage.
2. These two source symbols are regarded as being combined into a new source symbol
with probability equal to the sum of the original probabilities. The probability of the new symbol
is placed in the list in accordance with its value.
3. The procedure is repeated until we are left with a final list of source statistics of only
two for which a 0 and a 1 are assigned.
The code for each source symbol is found by working backward and tracing the
sequence of 0s and 1s assigned to that symbol as well as its successors.
Symbol   Probability   Code word
S0       0.4           00
S1       0.2           10
S2       0.2           11
S3       0.1           010
S4       0.1           011
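A compact sketch of the algorithm using a priority queue. Applied to the probabilities in the table above it produces code words of the same lengths; the exact bit patterns can differ, because the 0/1 assignment at each merge is arbitrary.

```python
import heapq
from itertools import count

def huffman(symbols):
    """symbols: dict {name: probability}. Returns {name: codeword}."""
    tiebreak = count()  # unique counter so the heap never compares the dict payloads
    heap = [(p, next(tiebreak), {name: ""}) for name, p in symbols.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)  # the two entries of lowest probability
        p1, _, code1 = heapq.heappop(heap)
        merged = {name: "0" + c for name, c in code0.items()}
        merged.update({name: "1" + c for name, c in code1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

codes = huffman({"S0": 0.4, "S1": 0.2, "S2": 0.2, "S3": 0.1, "S4": 0.1})
for name in sorted(codes):
    print(name, codes[name])
```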
Drawbacks:
Huffman coding requires knowledge of the probability model of the source, which in practice is not always available in advance.
Lempel-Ziv coding
When applied to English text, Lempel-Ziv coding achieves a compaction of approximately 55%, in contrast to Huffman coding, which achieves only about 43%. It encodes patterns in the text: the algorithm parses the source data stream into segments that are the shortest subsequences not encountered previously.
Problems
Parse the following binary data stream using the Lempel-Ziv algorithm:
000101110010100101 · · ·
It is assumed that the binary symbols 0 and 1 are already stored:
subsequences stored: 0, 1
The shortest subsequence of the data stream encountered for the first time and not seen before is 00:
subsequences stored: 0, 1, 00
The second shortest subsequence not seen before is 01; accordingly, we go on to write
subsequences stored: 0, 1, 00, 01
We continue in the manner described here until the given data stream has been completely parsed. The result is summarized below; each numerical representation gives the position of the prefix subsequence followed by the position of the innovation symbol, and each binary encoded block consists of the 3-bit pointer to the prefix followed by the innovation bit itself.
Numerical positions:         1     2     3      4      5      6      7      8      9
Subsequences:                0     1     00     01     011    10     010    100    101
Numerical representations:               11     12     42     21     41     61     62
Binary encoded blocks:                   0010   0011   1001   0100   1000   1100   1101
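A sketch of the parsing step described above; the dictionary is seeded with the single symbols 0 and 1, as in the worked example.

```python
def lz_parse(bits):
    """Parse a bit string into the shortest subsequences not encountered previously."""
    book = ["0", "1"]  # the binary symbols 0 and 1 are assumed to be already stored
    i = 0
    while i < len(bits):
        j = i + 1
        while j <= len(bits) and bits[i:j] in book:
            j += 1
        if j > len(bits):
            break          # the remaining bits form a subsequence already seen
        book.append(bits[i:j])
        i = j
    return book

print(lz_parse("000101110010100101"))
# ['0', '1', '00', '01', '011', '10', '010', '100', '101']
```

Each entry beyond the seed is then encoded as a pointer to its prefix subsequence plus one innovation bit, which is how the numerical representations and binary encoded blocks in the table above are obtained.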
Discrete memoryless channels
Let X and Y be the random variables representing the symbols at the channel input (source) and channel output (destination), respectively. The channel is described by an input alphabet
{x0, x1, · · · , xJ-1},
an output alphabet
{y0, y1, · · · , yK-1},
and a set of transition probabilities
p(yk | xj) = P(Y = yk | X = xj) for all j and k,
which can be arranged in a J-by-K channel matrix
P = [p(yk | xj)].
The input probability distribution p(xj) = P(X = xj), j = 0, 1, · · · , J-1, gives the probability of the event that the channel input is X = xj; together with the transition probabilities it determines the joint distribution p(xj, yk) = p(yk | xj) p(xj). The marginal probability distribution of the output random variable Y is obtained by averaging out the dependence of p(xj, yk) on xj:
p(yk) = Σj p(yk | xj) p(xj), k = 0, 1, · · · , K-1.
– The binary symmetric channel has two input symbols (x0 = 0, x1 = 1) and two output symbols (y0 = 0, y1 = 1); a transmitted symbol is received in error with transition probability p and correctly with probability 1 - p.
If we view the channel output Y as a noisy version of the channel input X, and H(X) is the uncertainty associated with X, then the uncertainty about X after observing Y is given by the conditional entropy
H(X|Y) = Σj Σk p(xj, yk) log2(1/p(xj | yk)).
The quantity H(X|Y) is the amount of uncertainty remaining about the channel input after the channel output has been observed. Since H(X) is the uncertainty in the channel input before observing the output, the difference H(X) - H(X|Y) represents the uncertainty in the channel input that is resolved by observing the channel output. This quantity is called the mutual information of the channel and is denoted by I(X; Y):
I(X; Y) = H(X) - H(X|Y).
Similarly, we may write
I(X; Y) = H(Y) - H(Y|X),
where H(Y) is the entropy of the channel output and H(Y|X) is the conditional entropy of the channel output given the channel input.
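As a numerical illustration of these definitions, the sketch below evaluates I(X; Y) = H(Y) - H(Y|X) for a discrete memoryless channel. The transition matrix and input distribution used at the bottom (a binary symmetric channel with an assumed transition probability of 0.1 and equiprobable inputs) are illustrative choices only.

```python
import math

def mutual_information(p_x, p_y_given_x):
    """I(X;Y) = H(Y) - H(Y|X) for a discrete memoryless channel.
    p_x: input distribution; p_y_given_x[j][k] = P(Y = y_k | X = x_j)."""
    K = len(p_y_given_x[0])
    # Marginal output distribution p(y_k) = sum over j of p(x_j) p(y_k | x_j)
    p_y = [sum(p_x[j] * p_y_given_x[j][k] for j in range(len(p_x))) for k in range(K)]

    def H(dist):
        return -sum(p * math.log2(p) for p in dist if p > 0)

    h_y_given_x = sum(p_x[j] * H(p_y_given_x[j]) for j in range(len(p_x)))
    return H(p_y) - h_y_given_x

# Binary symmetric channel with assumed transition probability 0.1, equiprobable inputs.
p = 0.1
print(mutual_information([0.5, 0.5], [[1 - p, p], [p, 1 - p]]))  # about 0.531 bits/channel use
```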
Property 1:
I(X; Y) = I(Y; X)
where the mutual information I(X; Y) is a measure of the uncertainty about the channel input that is resolved by observing the channel output, and the mutual information I(Y; X) is a measure of the uncertainty about the channel output that is resolved by observing the channel input.
Proof:
The symmetry follows by expressing the mutual information in terms of the joint probability p(xj, yk):
I(X; Y) = Σj Σk p(xj, yk) log2[ p(xj, yk) / (p(xj) p(yk)) ],
an expression that is unchanged when the roles of X and Y are interchanged, so that I(X; Y) = I(Y; X).
Property 2:
The mutual information is always nonnegative; that is, I(X; Y) ≥ 0.
Proof:
We know
I(X; Y) = Σj Σk p(xj, yk) log2[ p(xj, yk) / (p(xj) p(yk)) ].
Using the following fundamental inequality, which we derived while discussing the properties of entropy,
Σk pk log2(pk/qk) ≥ 0,
and comparing its left-hand side with the expression for I(X; Y) above, we can conclude that
I(X; Y) ≥ 0,
with equality if, and only if, p(xj, yk) = p(xj) p(yk) for all j and k.
Property 2 states that we cannot lose information, on the average, by observing the
output of a channel. Moreover, the mutual information is zero if, and only if, the input and output
symbols of the channel are statistically independent.
Property 3:
The mutual information of a channel is related to the joint entropy of the channel input and channel output by
I(X; Y) = H(X) + H(Y) - H(X, Y),
where the joint entropy is H(X, Y) = Σj Σk p(xj, yk) log2(1/p(xj, yk)).
Channel Capacity
The channel capacity C is defined as "the maximum mutual information I(X; Y) in any single use of the channel (i.e., signaling interval), where the maximization is over all possible input probability distributions {p(xj)} on X":
C = max I(X; Y), with the maximum taken over all input distributions {p(xj)}.
C is measured in bits/channel-use, or bits/transmission.
Example:
For the binary symmetric channel discussed previously, I(X; Y) is maximum when the input symbols are equiprobable, that is, when p(x0) = p(x1) = 1/2. Since we know the transition probabilities of the channel, using these probability values in the expression I(X; Y) = H(Y) - H(Y|X), we get the channel capacity of the binary symmetric channel.
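Carrying out this evaluation gives the well-known result
C = 1 - H(p) = 1 + p log2(p) + (1 - p) log2(1 - p),
where p is the transition (crossover) probability of the channel and H(p) is the binary entropy function. For an assumed transition probability p = 0.1, for example, C ≈ 1 - 0.469 = 0.531 bits per channel use; C equals 1 bit when p = 0 (noiseless channel) and falls to zero at p = 1/2, where the channel output is statistically independent of the input.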
Channel coding
Mapping of the incoming data sequence into a channel input sequence; this is performed in the transmitter by a channel encoder.
Mapping of the channel output sequence back into an output data sequence; this is performed in the receiver by a channel decoder.
Channel coding theorem:
1. Let a discrete memoryless source with an alphabet φ have entropy H(φ) and produce symbols once every Ts seconds.
2. Let a discrete memoryless channel have capacity C and be used once every Tc seconds.
3. Then if
H(φ) / Ts ≤ C / Tc,
there exists a coding scheme for which the source output can be transmitted over the channel and be reconstructed with an arbitrarily small probability of error. The parameter C / Tc is called the critical rate.
4. Conversely, if
H(φ) / Ts > C / Tc,
it is not possible to transmit information over the channel and reconstruct it with an arbitrarily small probability of error.
Example:
Considering the case of a binary symmetric channel with an equiprobable binary source, the source entropy H(φ) is 1 bit per symbol. Hence, from the above condition, we have
Tc / Ts ≤ C.
But the ratio Tc / Ts equals the code rate r of the channel encoder, so that
r ≤ C.
Hence, for a binary symmetric channel, if r ≤ C, then there exists a code capable of achieving an
arbitrarily low probability of error.
Capacity of a band-limited Gaussian channel
Assumptions:
1. The transmitted signal X(t) is a zero-mean stationary process that is band-limited to B hertz.
2. The signal is sampled uniformly at the Nyquist rate of 2B samples per second, giving K = 2BT samples Xk, k = 1, 2, · · · , K.
3. These samples are transmitted in T seconds over a noisy channel, also band-limited to B hertz.
We refer to Xk as a sample of the transmitted signal. The channel output is mixed with additive white Gaussian noise (AWGN) of zero mean and power spectral density N0/2. The noise is band-limited to B hertz. Let the continuous random variables Yk, k = 1, 2, · · · , K denote samples of the received signal, as shown by
Yk = Xk + Nk,   k = 1, 2, · · · , K.
The noise sample Nk is Gaussian with zero mean and variance given by
σ² = N0 B.
When a sample is transmitted from the source, noise is added to it; so, with P denoting the average transmitted signal power, the total received power is P + σ².
1. The variance of the sample Yk of the received signal equals P + σ². Hence, the differential entropy of Yk (which is largest when Yk is Gaussian) is
h(Yk) = (1/2) log2[2πe(P + σ²)].
2. The variance of the noise sample Nk equals σ². Hence, the differential entropy of Nk is given by
h(Nk) = (1/2) log2(2πeσ²).
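The capacity calculation is then completed by subtracting these two differential entropies (a standard step, sketched here to close the derivation). The information capacity per sample is
I(Xk; Yk) = h(Yk) - h(Nk) = (1/2) log2[2πe(P + σ²)] - (1/2) log2(2πeσ²) = (1/2) log2(1 + P/σ²) bits per channel use,
and, with σ² = N0 B and 2B independent samples transmitted per second, the information capacity of the band-limited AWGN channel becomes
C = B log2(1 + P/(N0 B)) bits per second.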