Data Compression

What is an image?
Why Data Compression?
• Make optimal use of limited storage space
• Save time and help to optimize resources
  - If compression and decompression are done in the I/O processor, less
    time is required to move data to or from the storage subsystem,
    freeing the I/O bus for other work
  - When sending data over a communication line: less time to transmit
    and less storage needed at the host
OR
  - Reduce the memory required for storage
  - Improve the data access rate from the storage device
  - Reduce the bandwidth and/or the time required for transfer
    across communication channels
Image Redundancies
Redundancy refers to the amount of wasted space consumed by storage
media to record picture information in a digital image.
Image compression is achieved by exploiting redundancies in the image.
These redundancies could be spatial, spectral, or temporal redundancy.
Spatial redundancy: elements that are duplicated within a
structure, such as neighboring pixels in a still image. Exploiting spatial
redundancy is the primary technique in still-image compression.
Spectral redundancy is due to correlation between different color
planes.
Temporal redundancy: pixels in two video frames that have the same
values in the same location. It is due to correlation between different
frames in a sequence of images, such as in videoconferencing
applications or broadcast images. Exploiting temporal redundancy is one
of the primary techniques in video compression.
Data Compression Methods
• Data compression is about storing and sending a smaller
number of bits.
• There are two major categories of methods to compress
data: lossless and lossy methods.
Lossless Compression Methods
• In lossless methods, original data and the data after
compression and decompression are exactly the
same.

• Redundant data is removed in compression and
added back during decompression.

• Lossless methods are used when we can’t afford to
lose any data: legal and medical documents,
computer programs.
Run-length encoding
• Simplest method of compression.
• How: replace consecutive repeating occurrences of a symbol by one
occurrence of the symbol itself, followed by the number of
occurrences.

• The method can be more efficient if the data uses only 2 symbols
(0s and 1s) in bit patterns and one symbol is more frequent than
the other.
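The two bullets above can be sketched in Python as follows. This is a minimal illustration, not a production codec: the function names are made up for this example, and the two-symbol variant assumes the data is a string of "0" and "1" characters in which "0" is the more frequent symbol.

```python
from itertools import groupby


def rle_encode(data: str) -> list[tuple[str, int]]:
    """Replace each run of a repeated symbol with (symbol, run length)."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Expand each (symbol, run length) pair back into the original run."""
    return "".join(symbol * count for symbol, count in pairs)


def zero_run_lengths(bits: str) -> list[int]:
    """Two-symbol variant (assumption: '0' is the frequent symbol).

    Only the lengths of the runs of 0s are stored; a 1 is implied
    between consecutive runs.
    """
    return [len(run) for run in bits.split("1")]


def bits_from_zero_runs(runs: list[int]) -> str:
    """Rebuild the bit pattern from the stored run lengths."""
    return "1".join("0" * n for n in runs)


encoded = rle_encode("AAAABBBCCDAA")
print(encoded)                         # [('A', 4), ('B', 3), ('C', 2), ('D', 1), ('A', 2)]
print(rle_decode(encoded))             # AAAABBBCCDAA
print(zero_run_lengths("0000010010"))  # [5, 2, 1]
```

Note that runs of length 1 (such as the single D) cost more after encoding than before, which is why the method pays off only on data with long runs.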
Huffman Coding
Assign fewer bits to symbols that occur more frequently and
more bits to symbols that appear less often.
A Huffman code is not unique, but every Huffman code for the same
set of symbol probabilities has the same average code length.
Algorithm:
1. Make a leaf node for each code symbol.
   Attach the probability of occurrence of each symbol to its leaf node.
2. Take the two nodes with the smallest probabilities and connect
   them into a new node.
   Label the two branches 0 and 1.
   The probability of the new node is the sum of the probabilities of
   the two connected nodes.
3. If there is only one node left, the code construction is complete. If
   not, go back to (2).
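The three steps above can be sketched in Python with a priority queue. This is one possible implementation under stated assumptions: symbol counts from a sample string stand in for the probabilities, the helper names are illustrative, and ties are broken by insertion order, so other equally valid Huffman codes exist for the same input.

```python
import heapq
from collections import Counter


def huffman_code(weights: dict[str, float]) -> dict[str, str]:
    """Build a prefix code from symbol weights (counts or probabilities)."""
    # Step 1: one leaf per symbol. Each heap entry is
    # (weight, tie_breaker, {symbol: code built so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single symbol still needs one bit
        return {s: "0" for s in weights}
    next_id = len(heap)
    # Steps 2 and 3: merge the two lightest nodes until one node is left.
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}         # branch labeled 0
        merged.update({s: "1" + c for s, c in right.items()})  # branch labeled 1
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text: str, code: dict[str, str]) -> str:
    return "".join(code[s] for s in text)


def decode(bits: str, code: dict[str, str]) -> str:
    inverse = {c: s for s, c in code.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:  # prefix property: the first match is a symbol
            out.append(inverse[buffer])
            buffer = ""
    return "".join(out)


text = "ABRACADABRA"
code = huffman_code(Counter(text))
bits = encode(text, code)
print(len(bits))           # 23 bits, versus 88 bits at 8 bits per symbol
print(decode(bits, code))  # ABRACADABRA
```

Decoding needs no separators between codewords because no codeword is a prefix of another.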
Huffman Coding
• Example
• Encoding
• Decoding
Lossy Compression Methods
• Used for compressing images, video, and audio files (our
eyes and ears cannot distinguish subtle changes, so some loss
of data is acceptable).
• These methods are cheaper: they take less time and space.
• Several methods:
  - JPEG: compress pictures and graphics
  - MPEG: compress video
  - MP3: compress audio
JPEG Encoding
• Used to compress pictures and graphics.
• In JPEG, a grayscale picture is divided into 8x8
pixel blocks to decrease the number of
calculations.
• Basic idea:
  - Change the picture into a linear (vector) set of numbers
    that reveals the redundancies.
  - The redundancies are then removed by one of the lossless
    compression methods.
JPEG Encoding- DCT
• DCT: Discrete Cosine Transform
• DCT transforms the 64 values in an 8x8 pixel block in a way that
the relative relationships between pixels are kept but the
redundancies are revealed.
• Example:
A gradient grayscale
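As a rough illustration of how the transform reveals redundancy, the sketch below applies a direct (unoptimized) 2-D DCT-II to an 8x8 block. The sample gradient values are invented for this example, and JPEG's usual step of subtracting 128 from each pixel first is left out for brevity.

```python
import math

N = 8  # JPEG block size


def dct_2d(block: list[list[float]]) -> list[list[float]]:
    """Orthonormal 2-D DCT-II of an N x N block, computed directly."""
    def scale(k: int) -> float:
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


# Hypothetical gradient block: every row is 20, 30, ..., 90.
gradient = [[20 + 10 * y for y in range(N)] for _ in range(N)]
T = dct_2d(gradient)

# All rows below the first come out as (numerically) zero, because the
# block does not change from one row to the next.
for row in T:
    print([round(value, 1) for value in row])
```

For a block of uniform value, only the top-left entry of the result is nonzero; for the gradient, the nonzero values are confined to the first row. That concentration of information in a few entries is what the later stages exploit.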
Quantization & Compression
Quantization
  - After the T table is created, the values are quantized to reduce
    the number of bits needed for encoding.
  - Quantization divides each value by a constant, then
    drops the fraction. The constant is chosen to optimize the number
    of bits and the number of 0s for each particular application.

Compression
  - Quantized values are read from the table and redundant 0s
    are removed.
  - To cluster the 0s together, the table is read diagonally in a
    zigzag fashion. The reason is that if the picture doesn’t have fine
    changes, the bottom right corner of the table is all 0s.
  - JPEG usually uses lossless run-length encoding at the
    compression phase.
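The quantization and zigzag steps can be sketched as below. This is a simplified illustration: the 4x4 table and the divisor are invented values (a real JPEG block is 8x8), a single constant is used as the slide describes, and the trailing 0s are simply dropped rather than run-length encoded.

```python
def quantize(table: list[list[float]], divisor: float) -> list[list[int]]:
    """Divide every value by a constant and drop the fraction."""
    return [[int(value / divisor) for value in row] for row in table]


def zigzag(table: list[list[int]]) -> list[int]:
    """Read a square table diagonal by diagonal, alternating direction."""
    n = len(table)
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Cells on the same anti-diagonal share r + c. Odd diagonals are
    # read top to bottom, even diagonals bottom to top.
    cells.sort(key=lambda rc: (rc[0] + rc[1],
                               rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [table[r][c] for r, c in cells]


def drop_trailing_zeros(values: list[int]) -> list[int]:
    end = len(values)
    while end > 0 and values[end - 1] == 0:
        end -= 1
    return values[:end]


# Hypothetical transformed table: large values in the top-left corner.
T = [[160.0, 45.5, 3.2, 0.4],
     [38.1, 6.9, 0.8, 0.1],
     [2.7, 0.6, 0.2, 0.0],
     [0.3, 0.1, 0.0, 0.0]]

Q = quantize(T, 4)
print(Q)                               # [[40, 11, 0, 0], [9, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(drop_trailing_zeros(zigzag(Q)))  # [40, 11, 9, 0, 1]
```

Reading in zigzag order pushes the 0s from the bottom right corner to the end of the sequence, so 16 values shrink to 5 here.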
JPEG Encoding
MPEG Encoding
• Used to compress video.
• Basic idea:
  - A video is a rapid sequence of frames. Each
    frame is a spatial combination of pixels, or a picture.
  - Compressing video =
    spatially compressing each frame
    +
    temporally compressing a set of frames.
MPEG Encoding
• Spatial Compression
  - Each frame is spatially compressed by JPEG.

• Temporal Compression
  - Redundant frames are removed.
  - For example, in a static scene in which someone is talking, most
    frames are the same except for the segment around the speaker’s
    lips, which changes from one frame to the next.
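The talking-head example suggests the idea behind temporal compression: store one full frame, then only what changes. The sketch below is a toy version of that idea, not the MPEG algorithm itself, which works with I-, P-, and B-frames and motion compensation. Frames are assumed to be flat lists of pixel values, and the function names are illustrative.

```python
def frame_delta(previous: list[int], current: list[int]) -> list[tuple[int, int]]:
    """Record only the pixels that differ from the previous frame."""
    return [(index, new)
            for index, (old, new) in enumerate(zip(previous, current))
            if old != new]


def apply_delta(previous: list[int], delta: list[tuple[int, int]]) -> list[int]:
    """Rebuild the current frame from the previous frame and the changes."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame


frame1 = [10, 10, 10, 10, 50, 50]  # static background, then the "lips"
frame2 = [10, 10, 10, 10, 52, 49]  # only the last two pixels change

delta = frame_delta(frame1, frame2)
print(delta)                                 # [(4, 52), (5, 49)]
print(apply_delta(frame1, delta) == frame2)  # True
```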
Audio Compression
• Used for speech or music
  - Speech: compress a 64 kbps digitized signal
  - Music: compress a 1.411 Mbps signal

• Two categories of techniques:
  - Predictive encoding
  - Perceptual encoding
Audio Encoding
Predictive Encoding
  - Only the differences between samples are encoded, not
    the whole sample values.
  - Several standards: GSM (13 kbps), G.729 (8 kbps), and
    G.723.1 (6.3 or 5.3 kbps)
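The core idea of predictive encoding, sending differences instead of full sample values, can be sketched as simple delta coding. This is only the basic principle: the speech standards listed above use far more elaborate predictors, and the sample values here are invented.

```python
from itertools import accumulate


def delta_encode(samples: list[int]) -> list[int]:
    """Keep the first sample, then only the change from the previous one."""
    previous = 0
    differences = []
    for sample in samples:
        differences.append(sample - previous)
        previous = sample
    return differences


def delta_decode(differences: list[int]) -> list[int]:
    """A running sum of the differences restores the samples."""
    return list(accumulate(differences))


samples = [100, 102, 105, 104, 104, 101]
encoded = delta_encode(samples)
print(encoded)                # [100, 2, 3, -1, 0, -3]
print(delta_decode(encoded))  # [100, 102, 105, 104, 104, 101]
```

Because neighboring audio samples are usually close in value, the differences are small numbers that need fewer bits than the samples themselves.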

Perceptual Encoding: MP3
  - CD-quality audio needs at least 1.411 Mbps and cannot
    be sent over the Internet without compression.
  - MP3 (MPEG audio layer 3) uses a perceptual encoding
    technique to compress audio.
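The bit rates quoted for uncompressed speech and CD-quality audio follow from sample rate times bits per sample times number of channels. The short calculation below assumes the standard figures, which the slides do not state: telephone-quality speech at 8,000 samples per second with 8 bits per sample, and CD audio at 44,100 samples per second with 16 bits per sample on two channels.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Bit rate in bits per second of uncompressed (PCM) audio."""
    return sample_rate_hz * bits_per_sample * channels


speech_bps = pcm_bit_rate(8_000, 8)               # assumed telephone-quality speech
cd_bps = pcm_bit_rate(44_100, 16, channels=2)     # assumed CD-quality stereo

print(speech_bps)  # 64000   (64 kbps)
print(cd_bps)      # 1411200 (about 1.411 Mbps)
```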
