Lecture 1
Basic Data Compression
Concepts
[Block diagram: original data x → Encoder → compressed data y → Decoder → decompressed data x̂]
• Lossless compression: x̂ = x
– Also called entropy coding, reversible coding.
• Lossy compression: x̂ ≠ x
– Also called irreversible coding.
• Compression ratio = |x| / |y|
– |x| is the number of bits in x.
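A minimal sketch of these concepts in Python, using the standard-library zlib codec (my choice of example, not from the lecture): the round trip is lossless, and the compression ratio falls out of the byte counts.

    import zlib

    x = b"abracadabra " * 100     # original data (arbitrary sample text)
    y = zlib.compress(x)          # compressed representation
    x_hat = zlib.decompress(y)    # decompressed data

    assert x == x_hat             # lossless: x is recovered exactly
    print("|x| =", 8 * len(x), "bits, |y| =", 8 * len(y), "bits")
    print("compression ratio =", round(len(x) / len(y), 2))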
Why Compress
• Conserve storage space
• Reduce time for transmission
– Faster to encode, send, then decode than to send the
original
• Progressive transmission
– Some compression techniques allow us to send the
most important bits first, so we can get a low-resolution
version of some data before getting the high-fidelity
version
• Reduce computation
– Use less data to achieve an approximate answer
Measures of Performance
• Compression measures
– Compression ratio = (bits in original image) / (bits in compressed image)
[Figure: Braille cells for the letters a, b, c, …, z]
Braille Example
Clear text:
Call me Ishmael. Some years ago -- never mind how long
precisely -- having little or no money in my purse,
and nothing particular to interest me on shore, I thought I
would sail about a little and see the watery part of the
world. (238 characters)
Grade 2 Braille in ASCII:
,call me ,i%mael4 ,``s ye>s ago -- n``e m9d h[ l;g
precisely -- hav+ ll or no m``oy 9 my purse1 & no?+
``picul> 6 9t]e/ me on %ore1 ,i ?``| ,i wd sail ab
a ll & see ! wat]y ``p ( ! _w4 (203 characters)
Why is Data Compression Possible
• Most data from nature has redundancy
– There is more data than the actual information
contained in the data.
– Squeezing out the excess data amounts to
compression.
– However, unsqueezing is necessary to be able to
figure out what the data means.
• Information theory is needed to understand
the limits of compression and give clues on
how to compress well.
What is Information
• Analog data
– Also called continuous data
– Represented by real numbers (or complex
numbers)
• Digital data
– Finite set of symbols {a1, a2, ... , am}
– All data represented as sequences (strings) in the
symbol set.
– Example: {a,b,c,d,r} abracadabra
– Digital data can be an approximation to analog
data
Symbols
• Roman alphabet plus punctuation
• ASCII - 128 symbols (256 with 8-bit extensions)
• Binary - {0,1}
– 0 and 1 are called bits
– All digital information can be represented
efficiently in binary
– {a,b,c,d} fixed-length representation:
    symbol:  a   b   c   d
    binary:  00  01  10  11
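As a concrete illustration of the table (my own sketch, not the lecture's), a tiny encoder for this fixed-length code:

    # fixed-length 2-bit code for the alphabet {a, b, c, d}
    code = {"a": "00", "b": "01", "c": "10", "d": "11"}

    def encode(s):
        # every symbol costs exactly 2 bits
        return "".join(code[ch] for ch in s)

    print(encode("badcab"))   # 010011100001 (6 symbols -> 12 bits)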
Discussion: Non-Powers of Two
• Can we do better than a fixed-length
representation for non-powers of two? (One
possible approach is sketched below.)
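One possible answer, offered here as my own sketch rather than the lecture's: pack several symbols into one block, so the cost per symbol can be fractional. For an alphabet of 5 symbols, a fixed-length code needs ceil(log2 5) = 3 bits per symbol, but coding 3 symbols at a time needs only ceil(log2 5^3) = 7 bits, about 2.33 bits per symbol.

    import math

    m = 5                                # alphabet size, not a power of two
    print(math.ceil(math.log2(m)))       # 3 bits/symbol, coded one at a time
    k = 3                                # symbols per block
    bits = math.ceil(math.log2(m ** k))  # ceil(log2 125) = 7 bits per block
    print(bits / k)                      # ~2.33 bits/symbol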
Information Theory
• Developed by Shannon in the 1940s and 50s
• Attempts to explain the limits of communication
using probability theory.
• Example: Suppose English text is being sent
– It is much more likely to receive an “e” than a “z”.
– In some sense “z” has more information than “e”.
First-Order Information
• Suppose we are given symbols {a1, a2, ... , am}.
• P(ai) = probability of symbol ai occurring in the
absence of any other information.
P(a1) + P(a2) + ... + P(am) = 1
• inf(ai) = log2(1/P(ai)) is the information of ai,
measured in bits.
[Plot: y = -log(x) for 0 < x < 1; the information of a symbol grows without bound as its probability approaches 0]
Example
• {a, b, c} with P(a) = 1/8, P(b) = 1/4, P(c) = 5/8
– inf(a) = log2(8) = 3
– inf(b) = log2(4) = 2
– inf(c) = log2(8/5) ≈ .678
• Receiving an “a” has more information than
receiving a “b” or “c”.
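A quick check of these values against the definition inf(ai) = log2(1/P(ai)):

    import math

    P = {"a": 1/8, "b": 1/4, "c": 5/8}
    for s, p in P.items():
        # information of the symbol, in bits
        print(s, round(math.log2(1 / p), 3))
    # prints: a 3.0, b 2.0, c 0.678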
First-Order Entropy
• The first-order entropy is defined for a probability
distribution over symbols {a1, a2, ... , am}:

    H = ∑_{i=1..m} P(ai) · log2(1/P(ai))
• H is the average number of bits required to code up a
symbol, given all we know is the probability distribution of
the symbols.
• H is the Shannon lower bound on the average number of bits
to code a symbol in this “source model”.
• Stronger models of entropy include context.
Entropy Examples
• {a, b, c} with P(a) = 1/8, P(b) = 1/4, P(c) = 5/8.
– H = (1/8)·3 + (1/4)·2 + (5/8)·(.678) ≈ 1.3 bits/symbol
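The same arithmetic, computed directly from the entropy formula (just a verification of the slide's numbers):

    import math

    P = [1/8, 1/4, 5/8]
    H = sum(p * math.log2(1 / p) for p in P)
    print(round(H, 3))   # 1.299, i.e. about 1.3 bits/symbol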
Entropy Curve
• Suppose we have two symbols with probabilities x
and 1-x, respectively.
[Plot: entropy (bits) vs. probability x for 0 ≤ x ≤ 1; maximum entropy of 1 bit at x = 0.5]
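Specializing the entropy formula to two symbols gives H(x) = x·log2(1/x) + (1-x)·log2(1/(1-x)). A small sketch evaluating it at a few points confirms the shape of the curve:

    import math

    def h2(x):
        # binary entropy in bits; 0 at the endpoints by convention
        if x in (0.0, 1.0):
            return 0.0
        return x * math.log2(1 / x) + (1 - x) * math.log2(1 / (1 - x))

    for x in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(x, round(h2(x), 3))   # peaks at x = 0.5 with h2(0.5) = 1.0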
A Simple Prefix Code
• {a, b, c} with P(a) = 1/8, P(b) = 1/4, P(c) = 5/8.
• A prefix code is defined by a binary tree.
• Prefix code property:
– no codeword is a prefix of another
Binary tree (0 = left branch, 1 = right branch):

         *
      0 / \ 1
       *   c
    0 / \ 1
     a   b

Code: a → 00, b → 01, c → 1

Example: ccabccbccc encodes as 1 1 00 01 1 1 01 1 1 1
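A minimal encoder for this particular code (a sketch of mine; the tree above fixes the codewords):

    code = {"a": "00", "b": "01", "c": "1"}

    def encode(s):
        # concatenate the codeword of each symbol
        return "".join(code[ch] for ch in s)

    print(encode("ccabccbccc"))   # 1100011101111 (13 bits for 10 symbols)

Note that the average cost under the given probabilities is (1/8)·2 + (1/4)·2 + (5/8)·1 = 1.375 bits/symbol, close to the 1.3-bit entropy bound from the previous slides.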
Binary Tree Terminology
[Figure: a binary tree with its root, an internal node, and a leaf labeled]
Decoding a Prefix Code

Using the tree above (a = 00, b = 01, c = 1):

    repeat
        start at root of tree
        repeat
            read bit
            if bit = 1 then go right else go left
        until node is a leaf
        report the leaf's symbol
    until end of the code

Example input: 11000111100
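The pseudocode translates directly into a runnable sketch. Representing the tree as nested pairs (0-child, 1-child) with leaves as symbols is my encoding choice, not the lecture's:

    # tree for a = 00, b = 01, c = 1:
    # root's 0-child holds a (on 0) and b (on 1); root's 1-child is leaf c
    tree = (("a", "b"), "c")

    def decode(bits, tree):
        out, node = [], tree
        for bit in bits:
            node = node[int(bit)]        # 0 = go left, 1 = go right
            if isinstance(node, str):    # reached a leaf
                out.append(node)         # report its symbol
                node = tree              # restart at the root
        return "".join(out)

    print(decode("11000111100", tree))   # -> ccabccca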
Decoding a Prefix Code (step by step)

With the same tree, the input 11000111100 decodes as:

    1  → c
    1  → c
    00 → a
    01 → b
    1  → c
    1  → c
    1  → c
    00 → a

Output: ccabccca