Data Compression

1/8/23

Contents

1 Data Compression

2 Lossless Compression Methods

3 Lossy Compression Methods

Data Compression

File Compression
v Reasons for file compression:

§ Less storage

§ Transmitting faster, decreasing access time

§ Processing faster sequentially

Why Data Compression?

v Make optimal use of limited storage space

v Save time and help to optimize resources

§ If compression and decompression are done in the I/O
processor, less time is required to move data to or
from the storage subsystem, freeing the I/O bus for other
work

§ When sending data over a communication line: less time
to transmit and less storage needed to host the data

Data Compression- Entropy

v Entropy is a measure of the information content of a message.

v Messages with higher entropy carry more information than
messages with lower entropy.

v How to determine the entropy:

§ Find the probability p(x) of each symbol x in the message
§ The entropy contribution H(x) of the symbol x is:
H(x) = -p(x) · log2 p(x)

v The average entropy over the entire message, in bits per symbol,
is the sum of H(x) over all n distinct symbols in the message
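
The entropy calculation can be sketched in a few lines of Python. This is a minimal illustration that estimates p(x) from symbol frequencies in the message itself; the function name `entropy` is chosen here for illustration and does not come from the slides.

```python
import math
from collections import Counter

def entropy(message):
    """Average entropy of a message in bits per symbol."""
    counts = Counter(message)          # occurrences of each distinct symbol
    total = len(message)
    h = 0.0
    for count in counts.values():
        p = count / total              # estimated probability p(x)
        h -= p * math.log2(p)          # add H(x) = -p(x) * log2 p(x)
    return h

print(entropy("aabb"))   # two equally likely symbols
print(entropy("aaaa"))   # a single symbol carries no information
```

A message made of one repeated symbol has zero entropy, which is why it compresses so well.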

Data Compression Methods

v Data compression is about storing and sending a smaller
number of bits.

v There are two major categories of data compression
methods: lossless and lossy

Lossless Compression
Methods

Lossless Compression Methods

v In lossless methods, the original data and the data after
compression and decompression are exactly the same.

v Redundant data is removed during compression and restored
during decompression.

v Lossless methods are used when we can’t afford to lose
any data: legal and medical documents, computer programs.

Run-length
Encoding

Run-length Encoding
v Simplest method of compression.

v Replace consecutive repeating occurrences of a symbol
by one occurrence of the symbol, followed by
the number of occurrences.

Run-length Encoding

  0   0   0   0   0   0   0   0
  0   0 255   0 255   0   0   0
  0   0   0   0   0 255   0   0
  0   0   0   0   0   0   0   0
  0   0 255   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0 255   0   0
  0   0   0   0   0   0   0 255

Run-length Encoding
v Run-length encoding algorithm:
§ Read through the pixels, copying pixel values to the file in
sequence, except when the same pixel value occurs more
than once in succession
§ When the same value occurs more than once in
succession, substitute the following three bytes:
• a special run-length code indicator (e.g. 0xFF)
• the pixel value that is repeated
• the number of times that value is repeated

v Example:
22 23 24 24 24 24 24 24 24 25 26 26 26 26 26 26 25 24
RL-coded stream: 22 23 ff 24 07 25 ff 26 06 25 24
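
The algorithm above can be sketched as follows. This is a minimal illustration, not a reference implementation: the names `rle_encode`, `rle_decode`, and `min_run` are hypothetical, runs are capped at 255 so the count fits in one byte, and a literal pixel equal to the indicator is always written as a run so the decoder cannot misread it. That last rule is an assumption; the slide does not say how such pixels are handled.

```python
MARKER = 0xFF  # run-length code indicator, as in the slide's example

def rle_encode(pixels, marker=MARKER, min_run=2):
    """Encode runs of at least min_run values as [marker, value, count]."""
    out = []
    i = 0
    n = len(pixels)
    while i < n:
        value = pixels[i]
        run = 1
        # Measure the run, capped at 255 so the count fits in one byte.
        while i + run < n and pixels[i + run] == value and run < 255:
            run += 1
        if run >= min_run or value == marker:
            # Assumption: a literal marker byte is escaped as a run.
            out.extend([marker, value, run])
        else:
            out.extend([value] * run)
        i += run
    return out

def rle_decode(stream, marker=MARKER):
    """Invert rle_encode."""
    out = []
    i = 0
    while i < len(stream):
        if stream[i] == marker:
            value, run = stream[i + 1], stream[i + 2]
            out.extend([value] * run)
            i += 3
        else:
            out.append(stream[i])
            i += 1
    return out

pixels = [22, 23] + [24] * 7 + [25] + [26] * 6 + [25, 24]
encoded = rle_encode(pixels)
print(encoded)
```

With the default `min_run=2`, a run of two pixels costs three bytes instead of two, so a larger threshold may be preferable in practice.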

Run-length Encoding
v Run-length encoding algorithm:
§ It is an example of redundancy reduction

§ Drawbacks:
• It does not guarantee any particular amount of space
savings
• Under some circumstances, the compressed image is
larger than the original image
• Why? Can you prevent this?

Huffman
Coding

Huffman Coding
v Assign fewer bits to symbols that occur more
frequently and more bits to symbols that appear less
often.

v Huffman codes are not unique, but every Huffman code for
a given set of symbol probabilities has the same average
code length.

Huffman Coding
v Algorithm:
1. Make a leaf node for each code symbol
• Add the generation probability of each symbol to the leaf
node

2. Take the two nodes with the smallest probabilities (leaves
or previously merged nodes) and connect them under a new node
• Label one of the two branches 0 and the other 1
• The probability of the new node is the sum of the probabilities
of the two connected nodes

3. If there is only one node left, the code construction
is completed. If not, go back to (2)
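
The three steps above can be sketched with a priority queue. This is a minimal illustration under stated assumptions: the function name `huffman_code` and the probabilities in `probs` are hypothetical, and instead of building explicit tree nodes each heap entry carries the partial codes of the symbols beneath it.

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Return {symbol: bit string} for a dict of symbol probabilities."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    # Step 1: one leaf per symbol, carrying its probability.
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {sym: "0" for sym in probabilities}
    # Steps 2 and 3: merge the two least probable nodes until one remains.
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}        # branch 0
        merged.update({s: "1" + c for s, c in codes1.items()})  # branch 1
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

probs = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}  # hypothetical probabilities
codes = huffman_code(probs)
avg_len = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, avg_len)
```

Swapping the 0 and 1 labels at any merge gives a different but equally good code, which is why the code is not unique while the average length is.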

Huffman Coding
v Example: (figure not reproduced)

Huffman Coding
v Encoding

v Decoding

Huffman Coding
v Example:

Lempel-Ziv
Codes

Lempel-Ziv Codes
v There are several variations of Lempel-Ziv Codes.

v We will look at LZ78

v The zip and unzip commands and the Unix compress and
uncompress commands use Lempel-Ziv codes

Lempel-Ziv Codes
v Let us look at an example for an alphabet having only two
letters:
aaababbbaaabaaaaaaabaabb

v Rule
§ Separate this stream of characters into pieces of text
so that each piece is the shortest string of characters
that we have not seen yet.

a | a a | b | a b | b b | a a a | b a| a a a a | a a b | a a b b

Lempel-Ziv Codes
aaababbbaaabaaaaaaabaabb
a | a a | b | a b | b b | a a a | b a| a a a a | a a b | a a b b

1. We see “a”
2. “a” has been seen, we now see “aa”
3. We see “b”
4. “a” has been seen, we now see “ab”
5. “b” has been seen, we now see “bb”
6. “aa” has been seen, we now see “aaa”
7. “b” has been seen, we now see “ba”
8. “aaa” has been seen, we now see “aaaa”
9. “aa” has been seen, we now see “aab”
10. “aab” has been seen, we now see “aabb”

Lempel-Ziv Codes
v Index:

§ We have index values from 1 to n

§ For the previous example

1 2 3 4 5 6 7 8 9 10
a | a a | b | a b | b b | a a a | b a| a a a a | a a b | a a b b

Lempel-Ziv Codes
v Encoding:

§ Since each piece is the concatenation of a piece

already seen with a new character, the message can
be encoded by a previous index plus a new character.

1 2 3 4 5 6 7 8 9 10
a | a a | b | a b | b b | a a a | b a| a a a a | a a b | a a b b

1 2 3 4 5 6 7 8 9 10
0a|1a|0b| 1b|3b|2a|3a|6a|2b|9b
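
The parsing and encoding steps can be sketched together in Python. This is a minimal illustration; the function name `lz78_encode` is hypothetical, index 0 stands for the empty piece, and the handling of input that ends in the middle of an already-seen piece is an assumption that the slides do not cover.

```python
def lz78_encode(text):
    """Return a list of (index of previously seen piece, new character) pairs."""
    dictionary = {"": 0}   # piece -> index; index 0 is the empty piece
    pairs = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch   # still a piece we have seen; keep extending
        else:
            # Shortest piece not seen yet: emit (index of prefix, new char).
            pairs.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        # Assumption: leftover input that is already in the dictionary is
        # emitted as (index of its prefix, its last character).
        pairs.append((dictionary[phrase[:-1]], phrase[-1]))
    return pairs

pairs = lz78_encode("aaababbbaaabaaaaaaabaabb")
print("|".join(f"{i}{c}" for i, c in pairs))
```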

Lempel-Ziv Codes
v Encoding tree:
§ A tree can be built
when encoding

1 2 3 4 5 6 7 8 9 10
a | a a | b | a b | b b | a a a | b a | a a a a | a a b | a a b b

Lempel-Ziv Codes
v Exercise No. 1:

§ Encode the file containing the following

characters, drawing the corresponding digital
tree.

“aaabbcbcdddeab”

Lempel-Ziv Codes
v Exercise No. 1:

1 2 3 4 5 6 7 8
a|aa|b|bc|bcd|d|de|ab

0a|1a|0b|3c|4d|0d|6e|1b

Lempel-Ziv Codes
v Exercise No. 1:

1 2 3 4 5 6 7 8
a|aa|b|bc|bcd|d|de|ab

0a|1a|0b|3c|4d|0d|6e|1b

Lempel Ziv Encoding

v It is dictionary-based encoding

v Basic idea:
§ Create a dictionary (a table) of strings used during
communication.

§ If both sender and receiver have a copy of the dictionary,

then previously-encountered strings can be substituted by
their index in the dictionary.

Lempel Ziv Compression

v Has two phases:
1. Building an indexed dictionary
2. Compressing a string of symbols

v Algorithm:
1. From the remaining uncompressed string, extract the smallest
substring that cannot be found in the dictionary.
2. Store that substring in the dictionary as a new entry and assign
it an index value
3. The substring, except for its last character, is replaced with
the index found in the dictionary
4. Insert the index and the last character of the substring into the
compressed string

Lempel Ziv Compression

v Compression
example:

Lempel Ziv Decompression

v Decompression is the inverse of the compression process:
the receiver rebuilds the same dictionary while decoding
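
Decompression can be sketched as follows. This is a minimal illustration; the function name `lz78_decode` is hypothetical, and the input pairs are the (index, character) codes from the earlier two-letter example, with index 0 meaning the empty piece.

```python
def lz78_decode(pairs):
    """Rebuild the text from (index, character) pairs."""
    phrases = [""]                 # index 0 is the empty piece
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch   # previously seen piece + new character
        phrases.append(phrase)         # same dictionary the encoder built
        out.append(phrase)
    return "".join(out)

codes = [(0, "a"), (1, "a"), (0, "b"), (1, "b"), (3, "b"),
         (2, "a"), (3, "a"), (6, "a"), (2, "b"), (9, "b")]
text = lz78_decode(codes)
print(text)
```

The dictionary never has to be transmitted, because each new entry depends only on pairs the decoder has already read.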

Lossy Compression
Methods

Lossy Compression Methods

v Used for compressing images, video, and audio files (our
eyes and ears cannot distinguish subtle changes, so some
loss of data is acceptable).

v These methods are cheaper: they take less time and space.

v Several methods:
§ JPEG: compress pictures and graphics
§ MPEG: compress video
§ MP3: compress audio

Lossy Compression Methods

Lossy Compression Methods

JPEG Encoding

JPEG Encoding
v Used to compress pictures and graphics.
v In JPEG, a grayscale picture is divided into 8x8 pixel blocks
to decrease the number of calculations.
v Basic idea:
1. Change the picture into a linear (vector) set of numbers that
reveals the redundancies.
2. The redundancies are then removed by one of the lossless
compression methods.

JPEG Encoding- DCT

v DCT: Discrete Cosine Transform
v DCT transforms the 64 values in an 8x8 pixel block in a way
that the relative relationships between pixels are kept but
the redundancies are revealed.
v Example:
A gradient grayscale
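
How the transform reveals redundancy can be sketched with a direct, unoptimized 8x8 DCT. This is a minimal illustration assuming the standard two-dimensional DCT-II with the usual JPEG scaling; the function name `dct_2d` and the uniform test block are chosen here for illustration and are not taken from the slides.

```python
import math

N = 8  # JPEG block size

def dct_2d(block):
    """Direct 8x8 two-dimensional DCT-II (slow, for illustration only)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    T = [[0.0] * N for _ in range(N)]
    for m in range(N):
        for n in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * m * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * n * math.pi / (2 * N)))
            # The factor 0.25 is 2/N for N = 8.
            T[m][n] = 0.25 * c(m) * c(n) * total
    return T

uniform = [[20] * N for _ in range(N)]   # hypothetical block of one gray level
T = dct_2d(uniform)
print(round(T[0][0], 6))                 # the only significant coefficient
```

For a block with no variation, all the information collapses into the top-left coefficient and the other 63 values are zero up to rounding error.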

Quantization & Compression

v Quantization:

§ After the T table is created, the values are quantized to reduce the
number of bits needed for encoding.

§ Quantization divides each value by a constant, then drops the
fraction. The dropped fractions cannot be recovered, which makes this
the lossy step. The constant is chosen to optimize the number of bits
and the number of 0s for each particular application.
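
Quantization can be sketched as a division followed by truncation. This is a minimal illustration: the function name `quantize`, the single divisor for the whole table, and the sample values are assumptions made here for brevity.

```python
def quantize(table, divisor):
    """Divide every value by a constant and drop the fraction."""
    # int() truncates toward zero, which drops the fractional part.
    return [[int(value / divisor) for value in row] for row in table]

T = [[160.0, 7.9],
     [-7.9, 0.4]]          # hypothetical corner of a T table
Q = quantize(T, 8)
print(Q)
```

Small coefficients become 0, which is what later makes the table easy to compress.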

Quantization & Compression

v Compression:

§ Quantized values are read from the table and redundant 0s are
removed.

§ To cluster the 0s together, the table is read diagonally in a zigzag
fashion. The reason is that if the picture doesn’t have fine changes,
the bottom right corner of the table is all 0s.

§ JPEG usually uses lossless run-length encoding at the compression
phase.
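
The zigzag read-out can be sketched as a sort over the anti-diagonals of the table. This is a minimal illustration; the function name `zigzag` and the 4x4 sample table are hypothetical (JPEG itself works on 8x8 tables).

```python
def zigzag(table):
    """Read a square table diagonally, alternating direction each diagonal."""
    n = len(table)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Cells on the same anti-diagonal share r + c. Odd diagonals are read
        # top to bottom (by row), even ones bottom to top (by column).
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [table[r][c] for r, c in order]

quantized = [[20, 3, 0, 0],
             [2, 0, 0, 0],
             [0, 0, 0, 0],
             [0, 0, 0, 0]]   # hypothetical quantized table
sequence = zigzag(quantized)
print(sequence)
```

The nonzero values come out first and the 0s form one long run at the end, which run-length encoding then stores compactly.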

JPEG Encoding

MPEG Encoding

MPEG Encoding
v Used to compress video.

v Basic idea:

§ Each video is a rapid sequence of frames.

§ Each frame is a spatial combination of pixels, or a picture.

§ Compressing video = spatially compressing each frame
+ temporally compressing a set of frames.

MPEG Encoding
v Spatial Compression
§ Each frame is spatially compressed by JPEG.
v Temporal Compression
§ Redundant frames are removed.
§ For example, in a static scene in which someone is talking, most
frames are the same except for the segment around the
speaker’s lips, which changes from one frame to the next.
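
The idea behind temporal compression can be sketched by storing only the pixels that differ from the previous frame. This is a deliberately simplified illustration and not the actual MPEG scheme, which is considerably more elaborate; the function names and the tiny one-dimensional frames are hypothetical.

```python
def frame_delta(previous, current):
    """Map pixel position -> new value, for changed pixels only."""
    return {i: new for i, (old, new) in enumerate(zip(previous, current))
            if old != new}

def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame and the delta."""
    frame = list(previous)
    for i, value in delta.items():
        frame[i] = value
    return frame

frame1 = [10, 10, 10, 50, 60, 10, 10, 10]   # hypothetical flattened frames
frame2 = [10, 10, 10, 55, 62, 10, 10, 10]   # only the "lips" region changes
delta = frame_delta(frame1, frame2)
print(delta)
```

When most of the scene is static, the delta is far smaller than the full frame.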

Audio Encoding

Audio Compression
v Used for speech or music
§ Speech: compress a 64 kbps digitized signal
§ Music: compress a 1.411 Mbps signal

v Two categories of techniques:

§ Predictive encoding
§ Perceptual encoding

Audio Encoding
v Predictive Encoding
§ Only the differences between samples are encoded, not the
whole sample values.
§ Several standards: GSM (13 kbps), G.729 (8 kbps), and G.723.1
(6.3 or 5.3 kbps)
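
The principle of encoding differences can be sketched with plain delta coding. This is a minimal illustration only: the standards listed above use far more elaborate prediction, and the function names and sample values here are hypothetical.

```python
def delta_encode(samples):
    """Replace each sample by its difference from the previous one."""
    previous = 0
    deltas = []
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas

def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    samples = []
    for delta in deltas:
        previous += delta
        samples.append(previous)
    return samples

samples = [100, 102, 105, 104, 104, 101]   # hypothetical audio samples
deltas = delta_encode(samples)
print(deltas)
```

Because neighbouring samples are close in value, the differences are small numbers that need fewer bits than the samples themselves.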

v Perceptual Encoding: MP3

§ CD-quality audio needs at least 1.411 Mbps and cannot be sent
over the Internet without compression.
§ MP3 (MPEG audio layer 3) uses a perceptual encoding technique
to compress audio.
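
The 1.411 Mbps figure follows from simple arithmetic. The sketch below assumes the standard CD audio parameters (44.1 kHz sampling, 16 bits per sample, two channels) and, for comparison, telephone-quality speech (8 kHz, 8 bits, one channel); these parameters are not stated in the slides, and the function name is hypothetical.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

cd_rate = pcm_bit_rate(44_100, 16, 2)      # CD-quality stereo music
speech_rate = pcm_bit_rate(8_000, 8, 1)    # telephone-quality speech
print(cd_rate / 1e6, "Mbps")
print(speech_rate / 1e3, "kbps")
```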
