
5CS3-01

5CS3
Information Theory & Coding
Unit – 2
Source Coding For Data Compaction
Contents
Prefix code
Huffman code
Shannon-Fano code
Lempel-Ziv coding
Channel capacity
Channel coding theorem
Shannon limit
Prefix Codes
A prefix code is a variable length code in which no codeword is a prefix
of another one

e.g. a = 0, b = 100, c = 101, d = 11

Such a code can be viewed as a binary trie: each codeword is the path of 0s and 1s
from the root to a leaf (a = 0, d = 11, b = 100, c = 101 in the example above), so no
symbol lies on the path to another.
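As a quick illustration, the prefix condition can be checked mechanically. The following Python sketch (the function name is_prefix_free is ours, not from the slides) tests the example code above:

def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(codewords)       # after sorting, any prefix is immediately followed by an extension of it
    return all(not words[i + 1].startswith(words[i]) for i in range(len(words) - 1))

print(is_prefix_free(["0", "100", "101", "11"]))   # True  -> the example code above is a prefix code
print(is_prefix_free(["0", "01", "11"]))           # False -> "0" is a prefix of "01"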
Huffman code
The Huffman coding procedure finds the optimum (minimum average codeword length),
uniquely decodable, variable-length entropy code associated with a set of events,
given their probabilities of occurrence.

Xi X1 X2 X3 X4 X5 X6

P(Xi) 0.30 0.25 0.20 0.12 0.08 0.05


Huffman coding
Xi   P(Xi)
X1   0.30 (00)    0.30 (00)    0.30 (00)    0.45 (1)     0.55 (0)
X2   0.25 (01)    0.25 (01)    0.25 (01)    0.30 (00)    0.45 (1)
X3   0.20 (11)    0.20 (11)    0.25 (10)    0.25 (01)
X4   0.12 (101)   0.13 (100)   0.20 (11)
X5   0.08 (1000)  0.12 (101)
X6   0.05 (1001)

(Each column shows the probabilities after a merging step; the codeword assigned at that stage is given in brackets.)
Xi   Probability   Code    Codeword length
X1   0.30          00      2
X2   0.25          01      2
X3   0.20          11      2
X4   0.12          101     3
X5   0.08          1000    4
X6   0.05          1001    4

Length of Information (average codeword length):
L = 0.30*2 + 0.25*2 + 0.20*2 + 0.12*3 + 0.08*4 + 0.05*4
  = 0.6 + 0.5 + 0.4 + 0.36 + 0.32 + 0.2 = 2.38 bits/symbol

H(X) = 2.36 bits/symbol

Code Efficiency = H(X) / L = 2.36 / 2.38 = 0.991 = 99.1%

Redundancy = 1 – Code Efficiency = 1 – 0.991 = 0.009 = 0.9%
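These numbers can be reproduced with a short priority-queue implementation of the merging procedure. The sketch below is illustrative (the function name huffman_lengths is ours); it returns optimal codeword lengths, which is all that is needed for L, H(X) and the efficiency:

import heapq
from math import log2

def huffman_lengths(probs):
    """Return {symbol: optimal codeword length} for a dict of symbol probabilities."""
    # Heap entries: (probability, tie-breaker, symbols in this subtree)
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)        # the two least probable subtrees ...
        p2, i, s2 = heapq.heappop(heap)
        for s in s1 + s2:                      # ... are merged, pushing their symbols one level deeper
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, i, s1 + s2))
    return lengths

probs = {"X1": 0.30, "X2": 0.25, "X3": 0.20, "X4": 0.12, "X5": 0.08, "X6": 0.05}
lengths = huffman_lengths(probs)
L = sum(p * lengths[s] for s, p in probs.items())
H = -sum(p * log2(p) for p in probs.values())
print(lengths)                                     # {'X1': 2, 'X2': 2, 'X3': 2, 'X4': 3, 'X5': 4, 'X6': 4}
print(round(L, 2), round(H, 2), round(H / L, 3))   # 2.38  2.36  0.992 (≈ 99%)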


If the probabilities of five variables are given as below, calculate the Length of
Information (average codeword length) using Huffman coding.

P(Xi): 0.2, 0.4, 0.2, 0.1, 0.1


X(i)  P(Xi)
X1    0.4 (1)      0.4 (1)      0.4 (1)     0.6 (0)
X2    0.2 (01)     0.2 (01)     0.4 (00)    0.4 (1)
X3    0.2 (000)    0.2 (000)    0.2 (01)
X4    0.1 (0010)   0.2 (001)
X5    0.1 (0011)
Xi   Probability   Code    Code length
X1   0.4           1       1
X2   0.2           01      2
X3   0.2           000     3
X4   0.1           0010    4
X5   0.1           0011    4
Length of Information L = 0.4*1 + 0.2*2 + 0.2*3 + 0.1*4 + 0.1*4
                        = 0.4 + 0.4 + 0.6 + 0.4 + 0.4 = 2.2 bits/symbol
H(X) = 2.12 bits/symbol
Code Efficiency = H(X) / L = 2.12 / 2.2 = 0.964 = 96.4%
Redundancy = 1 – Code Efficiency = 1 – 0.964 = 0.036 = 3.6%
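Running the same illustrative huffman_lengths sketch from the previous example on this five-symbol source (assuming that snippet has already been run) gives the same average length; the individual codeword lengths may differ, since an optimal Huffman code is not unique:

probs = {"X1": 0.4, "X2": 0.2, "X3": 0.2, "X4": 0.1, "X5": 0.1}
lengths = huffman_lengths(probs)                  # reuses the sketch defined above
L = sum(p * lengths[s] for s, p in probs.items())
H = -sum(p * log2(p) for p in probs.values())
print(round(L, 2), round(H, 2), round(H / L, 2))  # 2.2  2.12  0.96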
Example: p(a) = 0.1, p(b) = 0.2, p(c) = 0.2, p(d) = 0.5

Variable  Probability
d         0.5 (0)     0.5 (0)     0.5 (0)
b         0.2 (11)    0.3 (10)    0.5 (1)
c         0.2 (100)   0.2 (11)
a         0.1 (101)

Variable  Probability   Code   Code Length
d         0.5           0      1
b         0.2           11     2
c         0.2           100    3
a         0.1           101    3

Length of Information L = 0.5*1 + 0.2*2 + 0.2*3 + 0.1*3
                        = 0.5 + 0.4 + 0.6 + 0.3 = 1.8 bits/symbol
Shannon-Fano code
The Shannon-Fano algorithm is an entropy encoding technique for lossless data
compression of multimedia.

Named after Claude Shannon and Robert Fano, it assigns a code to each symbol based
on its probability of occurrence.

It is a variable-length encoding scheme, that is, the codes assigned to the symbols
will be of varying length.
HOW DOES IT WORK?
The steps of the algorithm are as follows:

1. Create a list of probabilities or frequency counts for the given set of symbols so that the
relative frequency of occurrence of each symbol is known.

2. Sort the list of symbols in decreasing order of probability, the most probable ones to the
left and the least probable to the right.

3. Split the list into two parts, with the total probability of both parts being as close to
each other as possible.

4. Assign the value 0 to the left part and 1 to the right part.

5. Repeat steps 3 and 4 for each part, until all the symbols are split into individual
subgroups.
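A recursive Python sketch of these five steps is given below (the function name shannon_fano is ours, not part of the algorithm's definition); it reproduces the codes derived in the worked examples that follow:

def shannon_fano(symbols):
    """symbols: list of (symbol, probability) pairs.  Returns {symbol: codeword}."""
    codes = {s: "" for s, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        running, best_i, best_diff = 0.0, 1, float("inf")
        for i in range(1, len(group)):             # step 3: find the most balanced split point
            running += group[i - 1][1]
            diff = abs(2 * running - total)        # |P(left part) - P(right part)|
            if diff < best_diff:
                best_i, best_diff = i, diff
        left, right = group[:best_i], group[best_i:]
        for s, _ in left:
            codes[s] += "0"                        # step 4: 0 for the left part ...
        for s, _ in right:
            codes[s] += "1"                        # ... and 1 for the right part
        split(left)                                # step 5: repeat on each part
        split(right)

    split(sorted(symbols, key=lambda sp: sp[1], reverse=True))   # step 2: sort by decreasing probability
    return codes

probs = [("X1", 0.30), ("X2", 0.25), ("X3", 0.20), ("X4", 0.12), ("X5", 0.08), ("X6", 0.05)]
print(shannon_fano(probs))
# {'X1': '00', 'X2': '01', 'X3': '10', 'X4': '110', 'X5': '1110', 'X6': '1111'}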
Example:
Let P(x) be the probability of occurrence of symbol x.

On arranging the symbols in decreasing order of probability:

P(D) + P(B) = 0.30 + 0.28 = 0.58

P(A) + P(C) + P(E) = 0.22 + 0.15 + 0.05 = 0.42

In the {D, B} group,

P(D) = 0.30 and P(B) = 0.28

This means that P(D) ≈ P(B), so divide {D, B} into {D} and {B} and assign 0 to D and 1 to B.

In the {A, C, E} group,
P(A) = 0.22 and P(C) + P(E) = 0.20
So the group is divided into {A} and {C, E},
and they are assigned the values 0 and 1 respectively.

In the {C, E} group,
P(C) = 0.15 and P(E) = 0.05
So divide them into {C} and {E} and assign 0 to {C} and 1 to {E}.

Note: The splitting is now stopped, as each symbol is separated into its own subgroup.
Variable      A      B      C      D      E
Probability   0.22   0.28   0.15   0.30   0.05
Code          10     01     110    00     111
Code length   2      2      3      2      3

L = 0.22*2 + 0.28*2 + 0.15*3 + 0.30*2 + 0.05*3 = 2.2 bits/symbol
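Applying the same illustrative shannon_fano sketch to this five-symbol source (assuming that snippet has already been run) reproduces the codes and the average length above:

probs = [("A", 0.22), ("B", 0.28), ("C", 0.15), ("D", 0.30), ("E", 0.05)]
codes = shannon_fano(probs)                 # reuses the sketch defined above
print(codes)                                # {'A': '10', 'B': '01', 'C': '110', 'D': '00', 'E': '111'}
L = sum(p * len(codes[s]) for s, p in probs)
print(round(L, 2))                          # 2.2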


Shannon-Fano Encoding
Xi   P(Xi)   Split 1   Split 2   Split 3   Split 4   Code
X1   0.30    0         0                             00
X2   0.25    0         1                             01
X3   0.20    1         0                             10
X4   0.12    1         1         0                   110
X5   0.08    1         1         1         0         1110
X6   0.05    1         1         1         1         1111
Shannon-Fano Encoding
H(X) = 2.36 bit/symbol
L = 2.38 bit/symbol
Efficiency = 0.99
Channel Capacity
Channel capacity, in electrical engineering, computer science and information theory,
is the tight upper bound on the rate at which information can be reliably transmitted
over a communication channel.

The basic mathematical model for a communication system is the familiar block diagram:
information source → transmitter (encoder) → channel (with noise) → receiver (decoder) → destination.
Channel Capacity
According to the channel capacity equation,
C = B log2(1 + S/N),
where
C is the capacity,
B is the bandwidth of the channel,
S is the signal power,
N is the noise power.
When B → ∞ (read: B tends to infinity), the capacity does not grow without bound;
it saturates to C∞ ≈ 1.44 S/η, where η is the noise power spectral density (so that N = ηB).
Channel Capacity
S/N is the signal-to-noise power ratio (SNR). SNR is generally measured in dB using
the formula:

(SNR) dB = 10 log10(Signal Power / Noise Power)
Example: Consider a voice-grade line for which B = 3100 Hz and SNR = 30 dB.
Calculate the channel capacity.

Given:
SNR = 30 dB
B = 3100 Hz

(SNR) dB = 10 log10(S/N)
30 = 10 log10(S/N)
(S/N) = 1000
C = B log2(1 + S/N)
= 3100 * log2(1 + 1000)
= 3100 * log2(1001)
C = 30,894 bps = 30.894 kbps
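The calculation is easy to verify in a few lines of Python (the helper name shannon_capacity is ours):

from math import log2

def shannon_capacity(bandwidth_hz, snr_db):
    """Channel capacity C = B log2(1 + S/N), with the SNR given in dB."""
    snr_linear = 10 ** (snr_db / 10)        # undo (SNR) dB = 10 log10(S/N)
    return bandwidth_hz * log2(1 + snr_linear)

print(shannon_capacity(3100, 30))           # ≈ 30.9e3 bit/s ≈ 30.9 kbps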
Shannon-Hartley Theorem
The channel capacity C, the tightest upper bound on the information rate (excluding
error-correcting codes) of arbitrarily low bit-error-rate data that can be sent with a
given average signal power S through an additive white Gaussian noise channel of
power N, is:

C = B log2(1 + S/N)

where

C is the channel capacity in bits per second


B is the bandwidth of the channel in hertz
S is the total received signal power over bandwidth, in watts
N is the total noise or interference power over bandwidth, in watts
S/N is the signal-to-noise ratio (SNR) expressed as a linear power ratio (not
as logarithmic decibels).
Shannon-Hartley Theorem
Consider the operation of a modem on an ordinary telephone line. The SNR is usually
about 1000. The bandwidth is 3.4 kHz.

Therefore:
C = 3400 * log2(1 + 1000)
= (3400)(9.97)
≈34 kbps
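The same illustrative shannon_capacity helper from the earlier sketch reproduces this figure:

print(shannon_capacity(3400, 30) / 1000)    # ≈ 33.9 kbps, i.e. roughly the 34 kbps quoted above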
Shannon-Hartley Theorem
We cannot prove the theorem here, but we can partially justify it as follows:
suppose the received signal is accompanied by noise with an RMS voltage of σ, and
that the signal has been quantised with levels separated by a = λσ.
If λ is chosen sufficiently large, we may expect to be able to recognize the signal
level with an acceptable probability of error.
Suppose further that each message is to be represented by one voltage level.
If there are to be M possible messages, then there must be M levels.
The average signal power is then
S = (M² − 1)a²/12 = (M² − 1)λ²σ²/12

so the number of distinguishable levels is

M = √(1 + 12S/(λ²N))

where N = σ² is the noise power. If each message is equally likely, then each carries
an equal amount of information, log2 M = ½ log2(1 + 12S/(λ²N)) bits.
Shannon-Hartley Theorem
To find the information rate, we need to estimate how many
messages can be carried per unit time by a signal on the channel.
 Since the discussion is heuristic, we note that the response of an
ideal LPF of bandwidth B to a unit step has a 10–90 percent rise
time of τ = 0.44/B.
We estimate therefore that with T = 0.5/B ≈ τ we should be able
to reliably estimate the level.
The message rate is then
r = 1/T = 2B messages per second,

so the information rate is

R = 2B log2 M = B log2(1 + 12S/(λ²N)) bits per second.

This is equivalent to the Shannon-Hartley theorem with λ ≈ 3.5.

 Note that this discussion has only estimated the rate at which information can be
transmitted with reasonably small error; the Shannon-Hartley theorem indicates that,
with sufficiently advanced coding techniques, transmission at channel capacity can
occur with arbitrarily small error.
Shannon-Hartley Theorem
The expression for the channel capacity of the Gaussian channel makes intuitive sense:
 As the bandwidth of the channel increases, it is possible to make faster changes in
the information signal, thereby increasing the information rate.
 As S/N increases, one can increase the information rate while still preventing errors
due to noise.
 For no noise, S/N → ∞ and an infinite information rate is possible irrespective of
bandwidth.
Shannon-Hartley Theorem
Thus we may trade off bandwidth for SNR. For example, if S/N = 7 and B = 4 kHz,
then the channel capacity is C = 12 × 10³ bits/s.
If the SNR increases to S/N = 15 and B is decreased to 3 kHz, the channel capacity
remains the same.
However, as B → ∞, the channel capacity does not become infinite since, with an
increase in bandwidth, the noise power also increases.
If the noise power spectral density is η/2, then the total noise power is N = ηB, so
the Shannon-Hartley law becomes
C = B log2(1 + S/(ηB))

As B → ∞, this approaches the limiting value

C∞ = (S/η) log2 e ≈ 1.44 S/η

Shannon-Hartley Theorem
This gives the maximum information transmission rate possible for a system of given
power but no bandwidth limitations.

The power spectral density can be specified in terms of an equivalent noise
temperature by η = kTeq.
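A quick numerical check of this limit, using arbitrary example values for S and η, shows the capacity flattening out near 1.44 S/η as B grows:

from math import log2

S = 1e-3      # received signal power in watts (arbitrary example value)
eta = 1e-9    # noise power spectral density in W/Hz (arbitrary example value)

for B in (1e3, 1e5, 1e7, 1e9):
    C = B * log2(1 + S / (eta * B))         # capacity with total noise power N = eta * B
    print(f"B = {B:>10.0f} Hz  ->  C = {C:,.0f} bit/s")

print(f"1.44 * S / eta = {1.44 * S / eta:,.0f} bit/s")   # the limiting value as B -> infinity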

There are literally dozens of coding techniques; entire textbooks are devoted to the
subject, and it is an active research area.

Obviously, all of them obey the Shannon-Hartley theorem.
Some general characteristics of the Gaussian channel can be demonstrated. Suppose we
are sending binary digits at a transmission rate equal to the channel capacity: R = C.
If the average signal power is S, then the average energy per bit is Eb = S/C, since
the bit duration is 1/C seconds.
This relationship is as follows:

C/B = log2(1 + Eb C/(η B)),   i.e.   Eb/η = (2^(C/B) − 1)/(C/B)
Shannon limit
As C/B → 0, Eb/η → ln 2 ≈ 0.693, so the asymptote is at Eb/η = −1.59 dB; below this
value there is no error-free communication at any information rate. This is called
the Shannon limit.
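A short numerical check of that limit (the helper name eb_over_eta_db is ours):

from math import log, log10

def eb_over_eta_db(r):
    """Required Eb/eta in dB when transmitting at R = C with spectral efficiency r = C/B."""
    return 10 * log10((2 ** r - 1) / r)

for r in (2.0, 1.0, 0.1, 0.001):
    print(r, round(eb_over_eta_db(r), 2))     # 1.76, 0.0, -1.44, -1.59 dB
print(round(10 * log10(log(2)), 2))           # ln 2 = 0.693 -> -1.59 dB, the Shannon limit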
Lempel-Ziv Coding
The Lempel-Ziv algorithm is a variable-to-fixed length code.
Basically, there are two versions of the algorithm: LZ77 and LZ78 are the two lossless
data compression algorithms published by Abraham Lempel and Jacob Ziv in 1977 and 1978
respectively. They are also known as LZ1 and LZ2.
These two algorithms form the basis for many variations
including LZW, LZSS, LZMA and others.
Besides their academic influence, these algorithms formed the
basis of several ubiquitous compression schemes, including GIF
and the DEFLATE algorithm used in PNG.
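A minimal Python sketch of LZ78-style phrase parsing, as used in the examples below (the function name lz78_parse is ours; index 0 plays the role of the empty phrase Φ):

def lz78_parse(data):
    """Split data into phrases; each phrase = (position of longest previously seen phrase, new symbol)."""
    dictionary = {"": 0}            # phrase -> dictionary position (0 is the empty phrase)
    pairs, phrase = [], ""
    for symbol in data:
        if phrase + symbol in dictionary:
            phrase += symbol        # keep extending a phrase that has been seen before
        else:
            pairs.append((dictionary[phrase], symbol))
            dictionary[phrase + symbol] = len(dictionary)    # new phrase gets the next position
            phrase = ""
    if phrase:                      # data ended inside an already-known phrase
        pairs.append((dictionary[phrase[:-1]], phrase[-1]))
    return pairs

print(lz78_parse("AABABBBABAABABBBABBABBA"))
# [(0, 'A'), (1, 'B'), (2, 'B'), (0, 'B'), (2, 'A'), (5, 'B'), (4, 'B'), (3, 'A'), (7, 'A')]
# i.e. ΦA, 1B, 2B, ΦB, 2A, 5B, 4B, 3A and finally 7A for the ninth phrase BBA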
Lempel-Ziv Coding
Encode the sequence: AABABBBABAABABBBABBABBA

Parsed phrases:  A   AB   ABB   B   ABA   ABAB   BB   ABBA   BBA
Position:        1   2    3     4   5     6      7    8      9

Position                   1      2      3      4      5      6      7      8
Subsequence                A      AB     ABB    B      ABA    ABAB   BB     ABBA
Numerical representation   ΦA     1B     2B     ΦB     2A     5B     4B     3A
Binary code                0000   0011   0101   0001   0100   1011   1001   0110

(The pointer to the previous phrase is encoded in 3 bits and the innovation symbol in 1 bit,
with A = 0 and B = 1; Φ denotes the empty phrase, pointer 000.)

Encoded stream: 0000 0011 0101 0001 0100 1011 1001 0110
Lempel-Ziv Coding
Parsed phrases: 1, 0, 10, 11, 01, 101, 010, 1010

Dictionary location   001     010     011     100     101     110     111     1000
Content               1       0       10      11      01      101     010     1010
Code                  000 1   000 0   001 0   001 1   010 1   011 1   101 0   110 0
