

Basics of Compression

Dr. Sania Bhatti

Outline
 Need for compression and classification of compression algorithms
 Basic Coding Concepts
 Fixed-length coding and variable-length coding
 Compression Ratio
 Entropy
 RLE Compression (Entropy Coding)
 Huffman Compression (Statistical Entropy Coding)


Need for Compression


 Uncompressed audio
  8 KHz, 8 bit
  8K per second
  30M per hour
  44.1 KHz, 16 bit
  88.2K per second
  317.5M per hour
  100 Gbyte disk holds 315 hours of CD quality music
 Uncompressed video
  640 x 480 resolution, 8 bit color, 24 fps
  7.37 Mbytes per second
  26.5 Gbytes per hour
  640 x 480 resolution, 24 bit (3 bytes) color, 30 fps
  27.6 Mbytes per second
  99.5 Gbytes per hour
  100 Gbyte disk holds 1 hour of high quality video

Broad Classification
 Entropy Coding (statistical)
  lossless; independent of data characteristics
  e.g. RLE, Huffman, LZW, Arithmetic coding (a minimal RLE sketch follows below)
 Source Coding
  lossy; may consider semantics of the data
  depends on characteristics of the data
  e.g. DCT, DPCM, ADPCM, color model transform
 Hybrid Coding (used by most multimedia systems)
  combine entropy with source encoding
  e.g., JPEG-2000, H.264, MPEG-2, MPEG-4, MPEG-7
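
As a concrete illustration of the first entropy coder named above, here is a minimal run-length encoding (RLE) sketch in Python (illustrative code, not from the slides):

def rle_encode(data):
    """Run-length encode a string into (symbol, count) pairs."""
    runs = []
    for ch in data:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs

def rle_decode(runs):
    """Invert rle_encode, demonstrating that RLE is lossless."""
    return "".join(ch * n for ch, n in runs)

# "aaabbbbccccdd" -> [('a', 3), ('b', 4), ('c', 4), ('d', 2)]
assert rle_decode(rle_encode("aaabbbbccccdd")) == "aaabbbbccccdd"

RLE only pays off when the input contains long runs, which is consistent with its classification here as lossless and independent of data semantics.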


Data Compression
 Branch of information theory
  minimize amount of information to be transmitted
 Transform a sequence of characters into a new string of bits
  same information content
  length as short as possible

Concepts
 Coding (the code) maps source messages from an alphabet (A) into code words (B)
 Source message (symbol) is the basic unit into which a string is partitioned
  can be a single letter or a string of letters
 EXAMPLE: aa bbb cccc ddddd eeeeee fffffffgggggggg
  A = {a, b, c, d, e, f, g, space}
  B = {0, 1}

Taxonomy of Codes
 Block-block
  source messages and code words of fixed length; e.g., ASCII
 Block-variable
  source message fixed, code words variable; e.g., Huffman coding
 Variable-block
  source variable, code word fixed; e.g., RLE
 Variable-variable
  source variable, code words variable; e.g., Arithmetic

Example of Block-Block
 Coding “aa bbb cccc ddddd eeeeee fffffffgggggggg”
 Requires 120 bits (40 symbols x 3 bits each; see the check below)

Symbol     Code word
a          000
b          001
c          010
d          011
e          100
f          101
g          110
space      111
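
The 120-bit figure is easy to verify with a short Python sketch (illustrative, not from the slides):

fixed_table = {"a": "000", "b": "001", "c": "010", "d": "011",
               "e": "100", "f": "101", "g": "110", " ": "111"}

msg = "aa bbb cccc ddddd eeeeee fffffffgggggggg"
encoded = "".join(fixed_table[ch] for ch in msg)
print(len(encoded))  # 120 bits: 40 symbols (35 letters + 5 spaces) x 3 bits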


Example of Variable-Variable
 Coding “aa bbb cccc ddddd eeeeee fffffffgggggggg”
 Requires 30 bits (verified in the sketch below)
  don’t forget the spaces

Symbol     Code word
aa         0
bbb        1
cccc       10
ddddd      11
eeeeee     100
fffffff    101
gggggggg   110
space      111
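
The 30-bit count can be checked the same way, assuming the message is partitioned into the source messages listed above (a sketch, not from the slides):

vv_table = {"aa": "0", "bbb": "1", "cccc": "10", "ddddd": "11",
            "eeeeee": "100", "fffffff": "101", "gggggggg": "110",
            " ": "111"}

symbols = ["aa", " ", "bbb", " ", "cccc", " ", "ddddd", " ",
           "eeeeee", " ", "fffffff", "gggggggg"]
encoded = "".join(vv_table[s] for s in symbols)
print(len(encoded))  # 30 bits: 15 for the seven runs + 5 spaces x 3 bits each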

Concepts (cont.)
 A code is
  distinct if each code word can be distinguished from every other (the mapping is one-to-one)
  uniquely decodable if every code word is identifiable when immersed in a sequence of code words
   e.g., with the previous table, the message 11 could be decoded as either ddddd or bbbbbb, so that code is not uniquely decodable
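
The ambiguity is easy to demonstrate in code; with an excerpt of the variable-variable table, two different messages produce the same bit string:

vv_excerpt = {"bbb": "1", "ddddd": "11"}

print(vv_excerpt["ddddd"])                    # 11
print(vv_excerpt["bbb"] + vv_excerpt["bbb"])  # 11 -> "11" is ambiguous,
                                              # so the code is not uniquely decodable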


Static Codes
 Mapping is fixed before transmission
  message is represented by the same code word every time it appears in the message (ensemble)
 Huffman coding is an example (see the sketch after this list)
 Better for independent sequences
  probabilities of symbol occurrences must be known in advance
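
A minimal Huffman coder sketch in Python (an illustration of the standard greedy-merge construction, not code from the slides):

import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table from the symbol frequencies in text."""
    freq = Counter(text)
    # Heap entries: (frequency, tiebreaker, tree); a tree is either a
    # symbol (leaf) or a (left, right) pair (internal node).
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))
        next_id += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):   # internal node: recurse into children
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                         # leaf: assign the accumulated code word
            codes[tree] = prefix or "0"
    walk(heap[0][2], "")
    return codes

# Frequent symbols receive shorter code words (block-variable coding).
print(huffman_codes("aa bbb cccc ddddd eeeeee fffffffgggggggg"))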

Dynamic Codes
 Mapping changes over time
  also referred to as adaptive coding
 Attempts to exploit locality of reference
  periodic, frequent occurrences of messages
  dynamic Huffman is an example
 Hybrids?
  build a set of codes, select based on input

Traditional Evaluation Criteria


 Algorithm complexity
  running time
 Amount of compression
  redundancy
  compression ratio (e.g., original size / compressed size)
 How to measure?

Measure of Information
 Consider symbols si and the probability of occurrence of each symbol p(si)
 In the case of fixed-length coding, the smallest number of bits per symbol needed is
  L ≥ log2(N) bits per symbol, where N is the number of symbols
 Example: a message with 5 symbols needs 3 bits (L ≥ log2 5 ≈ 2.32, so L = 3)
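
The same calculation in Python (a small sketch):

import math

def fixed_length_bits(n_symbols):
    """Smallest fixed code-word length for n_symbols distinct symbols."""
    return math.ceil(math.log2(n_symbols))

print(fixed_length_bits(5))  # 3, as in the example above
print(fixed_length_bits(8))  # 3, for the 8-symbol alphabet {a..g, space}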


Variable-Length Coding – Entropy
 What is the minimum number of bits per symbol?
 Answer: Shannon’s result – the theoretical minimum average number of bits per code word is known as Entropy (H)

  H = -∑ p(si) log2 p(si), summed over i = 1 … n
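
The formula translates directly into a short Python sketch, which also reproduces the examples on the following slides:

import math

def entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.4, 0.6]))     # ~0.971 bits (next slide's example)
print(entropy([0.5, 0.5]))     # 1.0 bit (two equiprobable levels)
print(entropy([1/256] * 256))  # 8.0 bits (uniform 256-level gray scale)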


Entropy Example
 Alphabet = {A, B}
 p(A) = 0.4; p(B) = 0.6

 Compute Entropy (H)
  H = -0.4*log2(0.4) - 0.6*log2(0.6) ≈ 0.97 bits


Entropy Example
 Calculate the entropy for an image with only two levels, 0 and 255, where P(0) = 0.5 and P(255) = 0.5


Entropy Example
 A gray scale image has 256 levels A = {0, 1, 2, …, 255} with equal probabilities. Calculate Entropy.

 H = -256*(1/256)*log2(1/256) = 8 bits


Entropy Example
 Calculate the Entropy of aaabbbbccccdd (13 symbols)
  P(a) = 3/13 ≈ 0.23
  P(b) = 4/13 ≈ 0.31
  P(c) = 4/13 ≈ 0.31
  P(d) = 2/13 ≈ 0.15
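
One way to check this exercise in Python, computing the exact probabilities from the string itself (a sketch, not from the slides):

import math
from collections import Counter

msg = "aaabbbbccccdd"
counts = Counter(msg)                            # a: 3, b: 4, c: 4, d: 2
probs = [n / len(msg) for n in counts.values()]
H = -sum(p * math.log2(p) for p in probs)
print(H)  # ~1.95 bits per symbol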

