Module-3 Information Theory: Entropy Source-Coding Theorem
Information theory
Entropy
REFER NOTES
Source-coding Theorem
An important problem in communication is the efficient representation of the data generated by a discrete source. The process by which this representation is accomplished is called source encoding.
The device that performs the representation is called a source encoder. For reasons to be
described, it may be desirable to know the statistics of the source.
In particular, if some source symbols are known to be more probable than others, then we may
exploit this feature in the generation of a source code by assigning short codewords to frequent
source symbols and long codewords to rare source symbols.
We refer to such a source code as a variable-length code. The Morse code, used in telegraphy in
the past, is an example of a variable-length code.
Our primary interest is in the formulation of a source encoder that satisfies two requirements:
1. The codewords produced by the encoder are in binary form.
2. The source code is uniquely decodable, so that the original source sequence can be
reconstructed perfectly from the encoded binary sequence. The second requirement is
particularly important: it constitutes the basis for a perfect source code.
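As a quick illustration of why the second requirement matters, here is a small Python sketch (the four-symbol alphabet and codewords are hypothetical, not taken from the text): code A is prefix-free, which is the standard way of guaranteeing unique decodability, so a left-to-right scan recovers the symbols unambiguously; in code B one codeword is a prefix of another, and the same bit pattern can then be parsed in more than one way.

```python
# Code A is prefix-free: no codeword is a prefix of another.
code_a = {"s0": "0", "s1": "10", "s2": "110", "s3": "111"}

# Code B is not uniquely decodable: "0" is a prefix of "01",
# so the stream "010" could be read as s0 s1 or as s2 s0.
code_b = {"s0": "0", "s1": "10", "s2": "01", "s3": "11"}

def prefix_decode(bits, code):
    """Decode a bit string produced by a prefix-free code via greedy matching."""
    inverse = {word: sym for sym, word in code.items()}
    symbols, buffer = [], ""
    for b in bits:
        buffer += b
        if buffer in inverse:          # a complete codeword has been read
            symbols.append(inverse[buffer])
            buffer = ""
    return symbols

encoded = "".join(code_a[s] for s in ["s2", "s0", "s1", "s3"])
print(prefix_decode(encoded, code_a))   # ['s2', 's0', 's1', 's3']
```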
Consider then the scheme shown in Figure 5.3 that depicts a discrete memoryless source
whose output sk is converted by the source encoder into a sequence of 0s and 1s, denoted by
bk.
We assume that the source has an alphabet with K different symbols, and that the kth symbol sk
occurs with probability pk, k = 0, 1, …, K – 1.
Let the binary codeword assigned to symbol sk by the encoder have length lk, measured in
bits. We define the average codeword length $\bar{L}$ of the source encoder as

$\bar{L} = \sum_{k=0}^{K-1} p_k l_k$

Shannon's first theorem, the source-coding theorem, states that for a discrete memoryless source of entropy H(S), the average codeword length of any uniquely decodable source code is bounded as

$\bar{L} \ge H(S)$   (5.19)

According to this theorem, the entropy H(S) represents a fundamental limit on the average number of
bits per source symbol necessary to represent a discrete memoryless source, in that $\bar{L}$ can be made as
small as, but no smaller than, the entropy H(S).
Denoting by $\bar{L}_{\min}$ the minimum possible value of $\bar{L}$, the efficiency of the source encoder is defined as $\eta = \bar{L}_{\min}/\bar{L}$. With $\bar{L}_{\min} = H(S)$ from (5.19), we may rewrite the efficiency of the source encoder in terms of the entropy H(S) as

$\eta = \dfrac{H(S)}{\bar{L}}$
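As a worked illustration of these definitions, the Python sketch below computes H(S), the average codeword length, and the resulting efficiency for a hypothetical four-symbol source; the probabilities and codeword lengths are assumed values chosen only for the example.

```python
import math

# Hypothetical discrete memoryless source with K = 4 symbols.
p = [0.5, 0.25, 0.125, 0.125]   # symbol probabilities pk
l = [1, 2, 3, 3]                # assumed codeword lengths lk, in bits

# Entropy H(S) = -sum pk*log2(pk), in bits per source symbol.
H = -sum(pk * math.log2(pk) for pk in p)

# Average codeword length L_bar = sum pk*lk.
L_bar = sum(pk * lk for pk, lk in zip(p, l))

# Efficiency eta = H(S)/L_bar; the source-coding theorem guarantees eta <= 1.
eta = H / L_bar
print(f"H(S) = {H:.3f} bits, L_bar = {L_bar:.3f} bits, eta = {eta:.3f}")
```

For this particular (dyadic) choice of probabilities the code is perfectly efficient, with L̄ = H(S) = 1.75 bits and η = 1.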
Prefix Coding
REFER NOTES
Huffman Coding
The basic idea behind Huffman coding is the construction of a simple algorithm that computes
an optimal prefix code for a given distribution, optimal in the sense that the code has the
shortest expected length.
The end result is a source code whose average codeword length approaches the fundamental
limit set by the entropy of a discrete memoryless source, namely H(S).
The essence of the algorithm used to synthesize the Huffman code is to replace the
prescribed set of source statistics of a discrete memoryless source with a simpler one. This
reduction process is continued in a step-by-step manner until we are left with a final set of
only two source statistics (symbols), for which (0, 1) is an optimal code.
To be specific, the Huffman encoding algorithm proceeds as follows (a Python sketch follows these steps):
1. The source symbols are listed in order of decreasing probability. The two source symbols of lowest probability are assigned the bits 0 and 1. This part of the step is referred to as the splitting stage.
2. These two source symbols are then combined into a new source symbol with probability equal to the sum of the two original probabilities. The probability of the new symbol is placed in the list in accordance with its value.
3. The procedure is repeated until we are left with a final list of source statistics (symbols) of only two, for which the symbols 0 and 1 are assigned.
4. The code for each (original) source symbol is found by working backward and tracing the sequence of 0s and 1s assigned to that symbol as well as its successors.
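The sketch below is one possible Python implementation of these steps, using a binary heap to combine the two least probable entries at each stage; the five-symbol source and its probabilities are made up for illustration, and different tie-breaking choices may produce different codewords while still achieving the same (optimal) average codeword length.

```python
import heapq

def huffman_code(probabilities):
    """Build a binary Huffman code for a {symbol: probability} mapping."""
    # Heap entries are (probability, tie-breaker, {symbol: partial codeword}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Combine the two least probable entries, prefixing 0 and 1 (splitting stage).
        p0, _, code0 = heapq.heappop(heap)
        p1, _, code1 = heapq.heappop(heap)
        merged = {sym: "0" + word for sym, word in code0.items()}
        merged.update({sym: "1" + word for sym, word in code1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical five-symbol source (probabilities are illustrative only).
probs = {"s0": 0.4, "s1": 0.2, "s2": 0.2, "s3": 0.1, "s4": 0.1}
code = huffman_code(probs)
avg_len = sum(probs[s] * len(w) for s, w in code.items())
print(code, "average codeword length =", avg_len)
```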
Lempel–Ziv Coding
A drawback of the Huffman code is that it requires knowledge of a probabilistic model of the
source; unfortunately, in practice, source statistics are not always known a priori.
Moreover, in the modeling of text we find that storage requirements prevent the Huffman code
from capturing the higher-order relationships between words and phrases because the
codebook grows exponentially fast in the size of each super-symbol of letters (i.e., grouping of
letters); the efficiency of the code is therefore compromised.
To overcome these practical limitations of Huffman codes, we may use the Lempel–Ziv
algorithm, which is intrinsically adaptive and simpler to implement than Huffman coding.
Basically, the idea behind encoding in the Lempel–Ziv algorithm is described as follows: The
source data stream is parsed into segments that are the shortest subsequences not
encountered previously.
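A minimal Python sketch of this parsing rule (in the LZ78 style) is given below; the binary stream is an arbitrary example chosen here, and each new phrase is represented by the index of its longest previously seen prefix together with its final (innovation) bit.

```python
def lz_parse(bits):
    """Parse a binary string into the shortest subsequences not seen before."""
    dictionary = {"": 0}      # phrase -> index; the empty phrase has index 0
    phrases, current = [], ""
    for b in bits:
        current += b
        if current not in dictionary:
            # New phrase: record (index of its longest known prefix, innovation bit).
            dictionary[current] = len(dictionary)
            phrases.append((dictionary[current[:-1]], current[-1]))
            current = ""
    # Trailing bits that repeat an already-known phrase are ignored in this sketch.
    return phrases

# Arbitrary example stream (illustrative only).
stream = "000101110010100101"
print(lz_parse(stream))
# [(0, '0'), (1, '0'), (0, '1'), (1, '1'), (3, '1'), (2, '1'), (4, '0'), (7, '1')]
```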
Discrete Memoryless Channels
A discrete memoryless channel is a statistical model with an input X and an output Y that is a noisy
version of X; both X and Y are random variables.
Every unit of time, the channel accepts an input symbol X selected from an alphabet 𝒳 and, in response,
it emits an output symbol Y selected from an alphabet 𝒴.
The channel is said to be “discrete” when both of the alphabets 𝒳 and 𝒴 have finite sizes.
It is said to be “memoryless” when the current output symbol depends only on the current input symbol
and not on any previous or future symbols.
Figure 5.7a shows a view of a discrete memoryless channel. The channel is described in terms of an input
alphabet 𝒳, an output alphabet 𝒴, and a set of transition probabilities p(y | x), namely the conditional
probability of receiving output symbol y given that input symbol x was sent.
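As a concrete illustration (the numbers are assumed, not taken from a figure in the text), the sketch below writes down the transition probability matrix of a binary symmetric channel, a simple discrete memoryless channel, and checks that each row sums to one, as a set of conditional probabilities must.

```python
import numpy as np

# Binary symmetric channel: X and Y both take values in {0, 1},
# with an assumed crossover (bit-flip) probability eps.
eps = 0.1
P = np.array([[1 - eps, eps],        # row j holds p(y = 0 | x = j), p(y = 1 | x = j)
              [eps, 1 - eps]])

# Each row of the transition matrix must sum to 1.
assert np.allclose(P.sum(axis=1), 1.0)

# Output distribution p(y) for an assumed input distribution p(x).
p_x = np.array([0.5, 0.5])
p_y = p_x @ P
print("p(y) =", p_y)                  # [0.5, 0.5] for this symmetric case
```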
The channel encoder and channel decoder in Figure 5.11 are both under the designer’s control
and should be designed to optimize the overall reliability of the communication system.
The approach taken is to introduce redundancy in the channel encoder in a controlled manner,
so as to reconstruct the original source sequence as accurately as possible.
In a rather loose sense, we may thus view channel coding as the dual of source coding, in that
the former introduces controlled redundancy to improve reliability whereas the latter reduces
redundancy to improve efficiency.
We now state Shannon’s second theorem, the channel-coding theorem, in two parts as follows:
1. Let a discrete memoryless source with an alphabet 𝒮 have entropy H(S) for random variable
S and produce symbols once every Ts seconds. Let a discrete memoryless channel have
capacity C and be used once every Tc seconds. Then, if

$\dfrac{H(S)}{T_s} \le \dfrac{C}{T_c}$   (1.1)
there exists a coding scheme for which the source output can be transmitted over the
channel and be reconstructed with an arbitrarily small probability of error.
2. Conversely, if

$\dfrac{H(S)}{T_s} > \dfrac{C}{T_c}$

it is not possible to transmit information over the channel and reconstruct it with an
arbitrarily small probability of error.
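As a worked numerical check of condition (1.1), the sketch below compares the source information rate H(S)/Ts with the maximum rate C/Tc supported by the channel; all four numbers are assumed values chosen only for illustration.

```python
# Illustrative numbers only: check the condition H(S)/Ts <= C/Tc of (1.1).
H_S = 1.75        # source entropy, bits per source symbol (assumed)
T_s = 1e-3        # one source symbol every 1 ms (assumed)
C = 0.5           # channel capacity, bits per channel use (assumed)
T_c = 0.25e-3     # one channel use every 0.25 ms (assumed)

source_rate = H_S / T_s     # information rate of the source, bits/s -> 1750
channel_rate = C / T_c      # maximum reliable transmission rate, bits/s -> 2000

if source_rate <= channel_rate:
    print("Condition (1.1) holds: arbitrarily reliable transmission is possible.")
else:
    print("Condition (1.1) fails: reliable transmission is not possible.")
```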
The channel-coding theorem is the single most important result of information theory. The
theorem specifies the channel capacity C as a fundamental limit on the rate at which the
transmission of reliable error-free messages can take place over a discrete memoryless channel.
However, it is important to note two limitations of the theorem:
1. The channel-coding theorem does not show us how to construct a good code. Rather, the
theorem should be viewed as an existence proof in the sense that it tells us that if the
condition of (1.1) is satisfied, then good codes do exist.
2. The theorem does not have a precise result for the probability of symbol error after
decoding the channel output. Rather, it tells us that the probability of symbol error tends to
zero as the length of the code increases, again provided that the condition of (1.1) is
satisfied.