Data Compression
Why Data Compression?
• Make optimal use of limited storage space
• Save time and help to optimize resources
If compression and decompression are done in the I/O processor, less
time is required to move data to or from the storage subsystem,
freeing the I/O bus for other work
When sending data over a communication line: less time to transmit
and less storage needed at the host
In other words, compression can:
Reduce the memory required for storage,
Improve the data access rate from the storage device, and
Reduce the bandwidth and/or the time required for transfer
across communication channels.
Image Redundancies
Redundancy refers to storage space that is wasted when recording
picture information in a digital image.
Image compression is achieved by exploiting redundancies in the image.
These redundancies can be spatial, spectral, or temporal.
Spatial redundancy: elements that are duplicated within a
structure, such as neighboring pixels in a still image. Exploiting spatial
redundancy is how still-image compression is performed, as sketched below.
Spectral redundancy is due to correlation between different color
planes.
Temporal redundancy: pixels in two video frames that have the same
values in the same locations. It is due to correlation between different
frames in a sequence of images, such as in videoconferencing
applications or broadcast video. Exploiting temporal redundancy is one
of the primary techniques in video compression.
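To make spatial redundancy concrete, here is a minimal Python/numpy sketch; the 8x8 gradient block and variable names are made up for illustration. In a smooth region, neighboring pixels carry almost the same information, so their differences need far fewer distinct values than the raw pixels.

import numpy as np

# Made-up 8x8 block of a smooth horizontal gradient.
block = np.tile(np.arange(0, 80, 10, dtype=np.int16), (8, 1))

horizontal_diffs = np.diff(block, axis=1)   # differences between horizontal neighbors
print("distinct pixel values:", np.unique(block).size)                     # 8
print("distinct neighbor differences:", np.unique(horizontal_diffs).size)  # 1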
Data Compression Methods
• Data compression is about storing and sending a smaller
number of bits.
• There are two major categories of methods to compress
data: lossless and lossy methods
Lossless Compression Methods
• In lossless methods, the original data and the data after
compression and decompression are exactly the
same.
• The method can be more efficient if the data uses only 2 symbols
(0s and 1s) in its bit patterns and one symbol is more frequent than
the other, as in the run-length sketch below.
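One simple lossless technique that benefits from such skewed bit patterns is run-length encoding. A minimal Python sketch; the function name, output format, and bit string are illustrative, not from the slides.

def run_length_encode(bits):
    """Collapse runs of repeated symbols into (symbol, count) pairs."""
    runs = []
    for bit in bits:
        if runs and runs[-1][0] == bit:
            runs[-1] = (bit, runs[-1][1] + 1)
        else:
            runs.append((bit, 1))
    return runs

# A bit pattern where '0' is far more frequent than '1' compresses well:
print(run_length_encode("0000000100000011000000"))
# [('0', 7), ('1', 1), ('0', 6), ('1', 2), ('0', 6)]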
Huffman Coding
Assign fewer bits to symbols that occur more frequently and
more bits to symbols that appear less often.
There is no unique Huffman code, but every Huffman code for the
same symbol probabilities has the same average code length.
Algorithm:
1. Make a leaf node for each code symbol
Assign each symbol's probability of occurrence to its leaf node
2. Take the two nodes with the smallest probabilities and connect
them under a new node
Label one of the two branches 0 and the other 1
The probability of the new node is the sum of the probabilities of
the two connected nodes
3. If there is only one node left, the code construction is complete. If
not, go back to (2)
Huffman Coding
• Example
• Encoding
• Decoding
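A minimal Python sketch of the algorithm above, using made-up symbol probabilities; it builds the tree bottom-up, derives the code table, and then encodes and decodes a short message.

import heapq

def huffman_code(probabilities):
    """Build a Huffman code table from a {symbol: probability} mapping."""
    # Each heap entry: (probability, tie-breaker, {symbol: code-so-far})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Take the two nodes with the smallest probabilities ...
        p1, _, codes1 = heapq.heappop(heap)
        p2, _, codes2 = heapq.heappop(heap)
        # ... label their branches 0 and 1 and merge them into a new node
        # whose probability is the sum of the two.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

# Made-up symbol probabilities, not from the slides.
table = huffman_code({"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1})

encoded = "".join(table[s] for s in "ABAD")              # encoding
inverse = {code: sym for sym, code in table.items()}
decoded, buffer = "", ""
for bit in encoded:                                      # decoding (codes are prefix-free)
    buffer += bit
    if buffer in inverse:
        decoded += inverse[buffer]
        buffer = ""
print(table)             # e.g. {'A': '0', 'B': '10', 'D': '110', 'C': '111'}
print(encoded, decoded)  # 0100110 ABAD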
Lossy Compression Methods
• Used for compressing images and video files (our
eyes cannot distinguish subtle changes, so lossy data
is acceptable).
• These methods are cheaper and require less time and space.
• Several methods:
JPEG: compress pictures and graphics
MPEG: compress video
MP3: compress audio
JPEG Encoding
• Used to compress pictures and graphics.
• In JPEG, a grayscale picture is divided into 8x8
pixel blocks to decrease the number of
calculations.
• Basic idea:
Change the picture into a linear (vector) set of numbers
that reveals the redundancies.
The redundancies are then removed by one of the lossless
compression methods.
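A small sketch, assuming numpy and a made-up 16x16 grayscale array, of how a picture is cut into 8x8 blocks before the transform step:

import numpy as np

# Made-up 16x16 grayscale image; JPEG processes each 8x8 block separately.
image = np.arange(256, dtype=np.uint8).reshape(16, 16)

blocks = [image[r:r + 8, c:c + 8]              # one 8x8 block at a time
          for r in range(0, image.shape[0], 8)
          for c in range(0, image.shape[1], 8)]
print(len(blocks), blocks[0].shape)            # 4 blocks, each (8, 8)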
JPEG Encoding - DCT
• DCT: Discrete Cosine Transform
• DCT transforms the 64 values in an 8x8 pixel block in a way that
the relative relationships between pixels are kept but the
redundancies are revealed.
• Example:
A gradient grayscale
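A minimal numpy sketch of an 8x8 two-dimensional DCT applied to a made-up gradient block; the helper function and block values are illustrative, not taken from the slides.

import numpy as np

def dct_2d_8x8(block):
    """Orthonormal type-II 2-D DCT of an 8x8 block, written from the definition."""
    n = 8
    alpha = np.full(n, np.sqrt(2.0 / n))
    alpha[0] = np.sqrt(1.0 / n)
    u = np.arange(n)
    x = np.arange(n)
    # basis[u, x] = alpha(u) * cos((2x + 1) * u * pi / (2n))
    basis = alpha[:, None] * np.cos((2 * x[None, :] + 1) * u[:, None] * np.pi / (2 * n))
    return basis @ block @ basis.T

# A horizontal gradient block: values change smoothly from left to right.
gradient = np.tile(np.arange(8, dtype=float) * 10, (8, 1))
coeffs = dct_2d_8x8(gradient)
# Most of the energy lands in a few low-frequency coefficients; the rest
# are (near) zero -- this is the redundancy the DCT reveals.
print(np.round(coeffs, 1))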
Quantization & Compression
Quantization
After the transformed (T) table is created, its values are quantized to
reduce the number of bits needed for encoding.
Quantization divides each value in the table by a constant, then
drops the fraction. This is done to optimize the number of bits
and the number of 0s for each particular application.
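A sketch of the quantization step, assuming a single uniform quantization constant; real JPEG uses a full 8x8 quantization table, and the coefficient values below are made up.

import numpy as np

def quantize(dct_block, step):
    """Divide each DCT coefficient by a constant and drop the fraction."""
    return np.trunc(dct_block / step).astype(int)

coeffs = np.array([[280.0, -64.4, 0.2, -6.1],
                   [  0.3,   0.1, 0.0,  0.0],
                   [ -0.2,   0.0, 0.0,  0.0],
                   [  0.1,   0.0, 0.0,  0.0]])   # made-up corner of a DCT block
print(quantize(coeffs, 10))
# Small coefficients collapse to 0, which the next stage exploits.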
Compression
Quantized values are read from the table and redundant 0s
are removed.
To cluster the 0s together, the table is read diagonally in a
zigzag fashion. The reason is that if the table doesn't have fine
changes, the bottom-right corner of the table is all 0s.
JPEG usually uses lossless run-length encoding in the
compression phase.
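A sketch of the zigzag read-out followed by run-length coding of the zeros, applied to a made-up quantized table; the helper functions are illustrative (the zigzag order is derived here by sorting the indices by anti-diagonal).

import numpy as np

def zigzag(block):
    """Read an N x N block diagonally so trailing zeros cluster at the end."""
    n = block.shape[0]
    order = sorted(((r, c) for r in range(n) for c in range(n)),
                   key=lambda rc: (rc[0] + rc[1],                       # which anti-diagonal
                                   rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [int(block[r, c]) for r, c in order]

def run_length(values):
    """Replace runs of zeros by (0, run_length) pairs; keep other values as-is."""
    out, zeros = [], 0
    for v in values:
        if v == 0:
            zeros += 1
        else:
            if zeros:
                out.append((0, zeros))
                zeros = 0
            out.append(v)
    if zeros:
        out.append((0, zeros))
    return out

quantized = np.zeros((8, 8), dtype=int)
quantized[0, 0], quantized[0, 1], quantized[1, 0] = 28, -6, 3   # made-up values
print(run_length(zigzag(quantized)))
# [28, -6, 3, (0, 61)] -- the 61 trailing zeros collapse into one pair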
MPEG Encoding
• Used to compress video.
• Basic idea:
A video is a rapid sequence of frames, and each frame is a
spatial combination of pixels, i.e. a picture.
Compressing video =
spatially compressing each frame
+
temporally compressing a set of frames.
MPEG Encoding
• Spatial Compression
Each frame is spatially compressed by JPEG.
• Temporal Compression
Redundant frames are removed.
For example, in a static scene in which someone is talking, most
frames are the same except for the segment around the speaker’s
lips, which changes from one frame to the next.
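A minimal numpy sketch of this idea, not the actual MPEG algorithm (which uses motion-compensated macroblocks): only the pixels that change between two made-up frames need to be stored.

import numpy as np

# Two made-up 8x8 frames of a mostly static scene.
frame1 = np.full((8, 8), 120, dtype=np.int16)
frame2 = frame1.copy()
frame2[5:7, 3:5] += 15          # only a small region (the "lips") changes

diff = frame2 - frame1          # temporal difference
changed = np.argwhere(diff != 0)
print(f"{changed.shape[0]} of {diff.size} pixels changed")   # 4 of 64
# Instead of resending frame2, a codec can send just the changed
# positions and their new values (or the difference block).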
Audio Compression
• Used for speech or music
Speech: compress a 64 kbps digitized signal
Music: compress a 1.411 Mbps signal