NUST School of Electrical Engineering & Computer Sciences (SEECS)
Department of Communication Systems Engineering
CSE-434: Systems Engineering
Source Coding and Data Compression
Information Theory
• The mathematical theory of communication
involving the quantification of information
• Deals with the transmission of information over
a noisy channel
– Source coding theorem
– Noisy channel coding theorem
• Is not concerned with the meaning or importance of a
message
Shannon’s Generic Communication System
Generic Communication System, from Chapter 2 of K.V. Prasad.
Components of a Communication System
• Information source produces the symbols
• Source encoder converts the symbols into a data
stream
– Source encoding reduces the redundancy
– Can be divided into lossless encoding techniques and
lossy encoding techniques
• Channel encoder introduces redundancy for error
detection or error correction at the receiver
• Modulator transforms the signal so that it can be
transmitted through the medium
Entropy
• Shannon’s formula for measuring the information
content of a source, known as entropy
H = log2(N) bits/symbol
for N equally likely symbols
H = -Σ P(i) log2 P(i) bits/symbol
if the i-th symbol has probability P(i)
Example 1
• Entropy of a source producing English alphabet
with each symbol being equally likely:
H = log2(26) ≈ 4.7 bits/symbol
• Entropy of a source producing 4 symbols with
probabilities {0.5, 0.25, 0.125, 0.125} respectively:
H = -Σ P(i) log2 P(i)
H = 1.75 bits/symbol
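A small Python sketch (function names are my own) of the two entropy formulas above; it reproduces the log2(26) ≈ 4.7 and 1.75 bits/symbol values from Example 1.

import math

def entropy_equiprobable(n):
    """Entropy of a source with n equally likely symbols: H = log2(n)."""
    return math.log2(n)

def entropy(probabilities):
    """Entropy of a source with the given symbol probabilities:
    H = -sum(P(i) * log2(P(i)))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy_equiprobable(26))            # ~4.70 bits/symbol (English alphabet)
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits/symbol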
Exercise 1
• Calculate the entropy of a source that produces
4 symbols with probability 1/8 and 2 symbols
with probability 1/4
• Answer: 2.5 bits/symbol
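• Worked check: H = 4 × (1/8) × log2(8) + 2 × (1/4) × log2(4) = 1.5 + 1.0 = 2.5 bits/symbol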
Data Compression
• The process of encoding information using
fewer bits than the original message
• Lossless data compression
– e.g. RLE, Huffman, Shannon-Fano, LZW, LZ77
• Lossy data compression
– e.g. JPEG, MPEG, AMR, AC3
Run Length Encoding
• Sequences of repeating characters are replaced
by a count and the character
• Useful when input text has long repeating
sequences
• A special character is inserted to mark each
compressed run
Example 2:
Input stream: WHOOOOOODUNNNNNIT!!!
Special char: \
Output stream: WH\6ODU\5NIT\3!
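A minimal Python sketch of this run-length scheme; the function name and the choice to compress only runs of three or more characters are my own assumptions, chosen so the output matches Example 2.

def rle_encode(text, escape="\\"):
    """Run-length encode `text`: runs of 3 or more identical characters
    are replaced by <escape><count><char>; shorter runs are copied as-is."""
    out = []
    i = 0
    while i < len(text):
        j = i
        while j < len(text) and text[j] == text[i]:
            j += 1                       # extend the current run
        run = j - i
        if run >= 3:
            out.append(f"{escape}{run}{text[i]}")
        else:
            out.append(text[i] * run)
        i = j
    return "".join(out)

print(rle_encode("WHOOOOOODUNNNNNIT!!!"))  # WH\6ODU\5NIT\3!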
Huffman Compression
• Variable length lossless data compression
technique
• More frequently occurring characters are given
shorter codes
Example 3:
• Input stream: WHOOOOOODUNNNNNIT!!!
Character W H O D U N I T !
Frequency 1 1 6 1 1 5 1 1 3
Sorted
Character W H D U I T ! N O
Frequency 1 1 1 1 1 1 3 5 6
Huffman Compression
Character D U I T (W+H) ! N O
Frequency 1 1 1 1 2 3 5 6
Character I T (D+U) (W+H) ! N O
Frequency 1 1 2 2 3 5 6
Character (I+T) (D+U) (W+H) ! N O
Frequency 2 2 2 3 5 6
Character (W+H) ! (I+T)+(D+U) N O
Frequency 2 3 4 5 6
Character ((I+T)+(D+U)) ((W+H)+!) N O
Frequency 4 5 5 6
Huffman Compression
Character (((I+T)+(D+U))+((W+H)+!)) N+O
Frequency 9 11
Huffman Codes
[Figure: Huffman tree]
Char Code
O 11
N 10
! 011
H 0101
W 0100
U 0011
D 0010
T 0001
I 0000
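A compact Python sketch of Huffman code construction using a priority queue (a standard formulation, not necessarily the exact procedure used for the table above); because ties can be broken differently, the bit patterns may differ from the table, but the average code length is the same.

import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code for the characters in `text`.
    Returns a dict mapping each character to its bit string."""
    freq = Counter(text)
    # Heap entries: (frequency, tie-breaker, subtree); a subtree is either
    # a single character or a (left, right) pair of subtrees.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (left, right)))
        count += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"      # single-symbol source edge case
        return codes
    _, _, root = heap[0]
    return walk(root, "")

print(huffman_codes("WHOOOOOODUNNNNNIT!!!"))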
Exercise 2
• Find the entropy of the source in the previous
example
• Find the average number of bits/symbol for the
Huffman code derived in the previous example
– Hint: use L = Σ L(i) P(i),
where L(i) is the length of the code assigned to symbol i
(a small code sketch follows this exercise)
• Discuss why the Huffman code will be better than a
fixed-length code
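A short Python sketch of the hint formula, L = Σ L(i)·P(i), applied to the Huffman code table and the frequencies from Example 3 (the helper name is my own).

def average_code_length(codes, freq):
    """L = sum over symbols of L(i) * P(i), where L(i) = len(codes[i])
    and P(i) = freq[i] / total."""
    total = sum(freq.values())
    return sum(len(codes[s]) * freq[s] / total for s in freq)

# Code table and frequencies from the Huffman example above.
codes = {"O": "11", "N": "10", "!": "011", "H": "0101", "W": "0100",
         "U": "0011", "D": "0010", "T": "0001", "I": "0000"}
freq = {"W": 1, "H": 1, "O": 6, "D": 1, "U": 1, "N": 5, "I": 1, "T": 1, "!": 3}

print(average_code_length(codes, freq))  # average bits/symbol for this code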
Shannon-Fano
Algorithm:
1. Determine the probability of each symbol in the source
text
2. Sort the symbols in decreasing probability order
3. Divide the set of symbols into two parts such that each
part has an approximately equal probability
4. The symbols in the first part are coded with the bit zero
and the symbols in the second part with the bit one
5. Repeat steps 3 and 4 until each sub-division contains
exactly one symbol
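A recursive Python sketch of the five steps above; the split rule in step 3 is interpreted here as choosing the cut point that minimizes the difference between the two parts' total weights. Raw frequencies are used in place of probabilities, since only the ratios matter; with this tie-breaking it should reproduce the code table that follows.

def shannon_fano(symbols):
    """`symbols` is a list of (symbol, weight) pairs, where a weight may be
    a probability or a raw count. Returns a dict of symbol -> bit string."""
    symbols = sorted(symbols, key=lambda sw: sw[1], reverse=True)  # step 2
    codes = {s: "" for s, _ in symbols}

    def split(group):
        if len(group) <= 1:                  # step 5: stop at one symbol
            return
        total = sum(w for _, w in group)
        # Step 3: find the cut making the two parts' weights most nearly equal.
        best_cut, best_diff, running = 1, float("inf"), 0
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(2 * running - total)
            if diff < best_diff:
                best_cut, best_diff = i, diff
        # Step 4: first part gets a 0, second part gets a 1.
        for s, _ in group[:best_cut]:
            codes[s] += "0"
        for s, _ in group[best_cut:]:
            codes[s] += "1"
        split(group[:best_cut])
        split(group[best_cut:])

    split(symbols)
    return codes

counts = {"O": 6, "N": 5, "!": 3, "W": 1, "H": 1, "D": 1, "U": 1, "I": 1, "T": 1}
print(shannon_fano(list(counts.items())))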
Shannon-Fano Example
Example 4:
• Input stream: WHOOOOOODUNNNNNIT!!!
Character W H O D U N I T !
Probability 1/20 1/20 6/20 1/20 1/20 5/20 1/20 1/20 3/20
Shannon-Fano Codes
[Figure: Shannon-Fano tree]
Char Code
O 00
N 01
! 100
W 101
H 1100
D 1101
U 1110
I 11110
T 11111
Exercise 3: Find the average number of bits/symbol for this code
LZW Coding Example
• Lossless data compression algorithm
• Created by Abraham Lempel, Jacob Ziv and Terry
Welch
• Does not require knowing the probabilities of symbol
occurrence
Example 5:
• Input Stream: TOBEORNOTTOBEORTOBEORNOT#
• Symbols: A-Z, #
• 5 bits per symbol required for a fixed-length code
• Length of message: 25 x 5 = 125 bits
LZW Coding Example
Compressed Message = 97 bits
Ref: http://en.wikipedia.org/wiki/LZW
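A minimal Python sketch of LZW encoding with the A–Z, # alphabet from Example 5; it emits only the sequence of dictionary indices, so the 97-bit figure (which depends on the variable-width code packing used in the reference) is not computed here. The index assignment is my own choice.

def lzw_encode(text, alphabet="ABCDEFGHIJKLMNOPQRSTUVWXYZ#"):
    """LZW-encode `text`, starting from a dictionary containing `alphabet`.
    Returns the list of dictionary indices emitted by the encoder."""
    dictionary = {ch: i for i, ch in enumerate(alphabet)}
    w, output = "", []
    for ch in text:
        if w + ch in dictionary:
            w += ch                               # extend the current match
        else:
            output.append(dictionary[w])          # emit code for longest match
            dictionary[w + ch] = len(dictionary)  # add new string to dictionary
            w = ch
    if w:
        output.append(dictionary[w])
    return output

codes = lzw_encode("TOBEORNOTTOBEORTOBEORNOT#")
print(len(codes), codes)  # fewer output codes than the 25 input symbols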
JPEG
• "Joint Photographic Expert Group" – an
international standard in 1992
• JPEG is a commonly used method of
compression for photographic images
• Works with both color and grey-scale images
• A JPEG file can be encoded in several formats, e.g.
JFIF (JPEG File Interchange Format)
JPEG
[Figure: loss of information under JPEG compression]
Coding Techniques
• Text
– ASCII, Extended ASCII, Morse, RLE, Huffman,
Adaptive Huffman, Shannon-Fano, LZ77, LZ78, LZW,
CTW, BWT, DMC
• Audio
– A-law, μ-law, G.7xx (ITU-T suite of standards)
Error Detection and Correction
• Ability to detect transmission errors in the
received data and to reconstruct the original
data
• Error detection techniques
– e.g. Parity, Checksum, CRC, Hamming codes, Hash
functions
• Error correction techniques
– ARQ (Stop-and-Wait, Go-back-N, Selective Repeat)
– FEC (Hamming, Reed-Solomon, Golay)
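As a concrete illustration of the simplest detection technique in the list, a Python sketch of a single even-parity bit (function names and data are my own minimal example, not a full ARQ or FEC scheme).

def add_even_parity(bits):
    """Append one parity bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]

def parity_ok(codeword):
    """A single-bit error flips the overall parity and is detected."""
    return sum(codeword) % 2 == 0

word = add_even_parity([1, 0, 1, 1, 0, 1, 0])    # data word + parity bit
corrupted = word.copy()
corrupted[3] ^= 1                                # flip one bit in transit
print(parity_ok(word), parity_ok(corrupted))     # True False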
Cryptography and Steganography
• Cryptography is the study of hiding the content of
information
– Substitution ciphers
– Transposition ciphers
– One-time pads
– Symmetric and public key algorithms
• Steganography is the study of hiding the
existence of information
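A toy Python sketch of a substitution cipher (a Caesar shift over A–Z), illustrating the first item above; this is my own minimal example and is not a secure algorithm.

def caesar(text, shift):
    """Substitute each letter with the letter `shift` positions later (mod 26)."""
    result = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(ch)  # leave non-letters unchanged
    return "".join(result)

cipher = caesar("ATTACK AT DAWN", 3)     # "DWWDFN DW GDZQ"
print(cipher, caesar(cipher, -3))        # decrypt by shifting back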
Modulation
• The addition of information to a signal carrier
– Digital data, digital signal (data encoding)
– Digital data, analog signal (ASK, FSK, PSK)
– Analog data, analog signal (AM, FM, PM)
– Analog data, digital signal (PCM, DM)
• Reasons
– Compatibility of signal with transmission medium
– Frequency division multiplexing