Multimedia Unit-4
Data compression can be performed by using smaller strings of bits (0s and 1s)
in place of the original string and using a ‘dictionary’ to decompress the data if
required. Other techniques include the introduction of pointers (references) to
a string of bits that the compression program has become familiar with or
removing redundant characters.
Compression techniques fall into two broad categories:
1. Lossy
2. Lossless
Lossy compression
To understand the lossy compression technique, we must first understand the
difference between data and information. Data is a raw, often unorganized
collection of facts or values and can mean numbers, text, symbols, etc. On the
other hand, Information brings context by carefully organizing the facts.
To put this in context, a black-and-white image of 4×6 inches at 100 dpi (dots per
inch) contains 400 × 600 = 240,000 pixels. Each of these pixels holds data in the
form of a number between 0 and 255, representing pixel intensity (0 being black
and 255 being white).
This image as a whole can carry information, such as that it is a picture of the
16th president of the USA, Abraham Lincoln. If we display the image at 50 dpi,
i.e., with 60,000 pixels, the data required to store the image will shrink, and
perhaps the quality too, but the information will remain intact. Only after a
considerable loss of data do we lose the information. Below is an explanation of
how it works.
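The pixel counts above follow from simple arithmetic, which can be checked in a few lines of Python (the helper name is illustrative, not from the text):

```python
# Pixel-count arithmetic for the dpi example above.
def pixel_count(width_in, height_in, dpi):
    """Total pixels for a given print size and resolution."""
    return (width_in * dpi) * (height_in * dpi)

print(pixel_count(4, 6, 100))  # 240000 pixels at 100 dpi
print(pixel_count(4, 6, 50))   # 60000 pixels at 50 dpi
```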
With the above understanding of the difference between data and information,
we now can comprehend Lossy compression. As the name suggests, Lossy
compression loses data, i.e., gets rid of it to reduce the size of the data.
• Advantage:
Lossy compression is relatively quick, can reduce the file size dramatically, and
lets the user select the compression level. It is especially effective for data
like images, video, and audio because it exploits the limits of the human senses:
up to a certain point, our eyes and ears cannot perceive a difference in the
quality of an image or an audio stream.
• Disadvantage:
The discarded data cannot be recovered, so the original file can never be
reconstructed exactly, and repeatedly re-compressing a file degrades its quality
further. Lossy methods are also unsuitable for data where every value matters.
Lossless Compression
Lossless compression, unlike lossy compression, doesn’t remove any data; instead,
it transforms it to reduce its size. To understand the concept, we can take a
simple example.
Consider a piece of text in which the word ‘because’ is repeated quite often. The
word comprises seven letters, and by using a shorthand or abbreviated version of
it like ‘bcz’, we can transform the text. The rule replacing ‘because’ with ‘bcz’
is stored in a dictionary for later use (during decompression).
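The ‘because’ → ‘bcz’ substitution can be sketched in Python (a toy illustration; a real dictionary coder must also guarantee that the short codes never collide with text already present in the input):

```python
# Minimal dictionary-substitution sketch: long phrases map to short stand-ins.
dictionary = {"because": "bcz"}

def compress(text, mapping):
    for phrase, code in mapping.items():
        text = text.replace(phrase, code)
    return text

def decompress(text, mapping):
    for phrase, code in mapping.items():
        text = text.replace(code, phrase)
    return text

original = "I stayed home because it rained, because I was tired."
packed = compress(original, dictionary)
print(packed)  # I stayed home bcz it rained, bcz I was tired.
assert decompress(packed, dictionary) == original  # lossless round trip
```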
• Advantage:
There are types of data where lossy compression is not feasible. For example, in
a spreadsheet, software, program, or any data comprised of factual text or
numbers, lossy cannot work as every number might be essential and can’t be
considered redundant as any reduction will immediately cause loss of information.
Here lossless compression becomes crucial as, upon decompression, the file can
be restored to its original state without losing any data.
• Disadvantage:
Because no data can be discarded, lossless compression generally achieves much
smaller size reductions than lossy compression, and compressing large files can
be slower.
Compression also increases the speed of transferring files through the internet
and other networks.
Common lossless compression algorithms include:
• Deflate
• Context Mixing
• Huffman Encoding
• Arithmetic Encoding
• ZStandard
• Bzip2 (Burrows and Wheeler)
Common lossy compression techniques include:
• Transform coding
• Fractal compression
Some neural network-based models are also used for compression.
Run-length coding
Run-length coding (RLC) replaces each run of identical symbols with the symbol
and a count of how many times it repeats in a row.
Input:
AAAAAABBBBBCCCCCCCCDDEEEEEFFF
Output:
6A5B8C2D5E3F
Compression Efficiency:
The 29-character input shrinks to 12 characters, a compression ratio of roughly
2.4:1. RLC is efficient only when the data contains long runs of repeated values.
Implementation in Multimedia:
1. Images: RLC is commonly used in formats like BMP, TIFF, and fax
transmissions. For example, in an image with a large region of the
same color, RLC encodes this region by storing the color and the
number of pixels with that color in a row.
2. Video: RLC is often used in video compression as part of
algorithms like MPEG and H.264, specifically in encoding parts of
frames with minimal changes.
3. Audio: RLC is less common in audio but can be applied to simplify
sequences of silence or consistent background noise.
Applications:
For an image row with pixel values [0, 0, 0, 255, 255, 255, 255, 0, 0], RLC would
encode it as:
(0, 3), (255, 4), (0, 2)
i.e., three (value, run-length) pairs instead of nine individual values.
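A minimal run-length encoder sketch in Python (the text input string is illustrative):

```python
# Run-length encoder: collapse each run of identical values into (value, count).
from itertools import groupby

def rle_encode(data):
    """Return a list of (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(data)]

# Text example: runs of letters become count + letter
text = "AAAAAABBBBBCCCCCCCCDDEEEEEFFF"
encoded = "".join(f"{n}{ch}" for ch, n in rle_encode(text))
print(encoded)  # 6A5B8C2D5E3F

# Pixel-row example: nine values collapse to three pairs
row = [0, 0, 0, 255, 255, 255, 255, 0, 0]
print(rle_encode(row))  # [(0, 3), (255, 4), (0, 2)]
```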
Disadvantages
On data without long runs (for example, noisy images or ordinary text), RLC can
make the file larger instead of smaller, because every symbol still needs a count
attached to it.
Variable Length Coding works by assigning a unique code to each data symbol
based on its frequency. In the simplest form, the code is a binary string: the
most common symbols are assigned shorter codes, and less frequent symbols are
assigned longer codes. The best-known example is Huffman coding, a widely used
technique for data compression.
The basic steps involved in the Huffman coding algorithm are as follows:
1. Frequency Analysis: The frequency of each symbol in the input data is counted.
2. Symbol Sorting: The symbols are sorted based on their frequency, with the
most frequent symbol placed at the top.
3. Tree Creation: A binary tree is created, where each leaf node represents a
symbol and its frequency, and each internal node represents the sum of the
frequencies of its child nodes.
4. Code Assignment: The codes are assigned to the symbols based on their
position in the binary tree. The code assigned to a symbol is obtained by
traversing the tree from the root to the leaf node representing the symbol, and
appending a 0 or 1 depending on whether the left or right branch is taken at
each internal node.
5. Data Encoding: The input data is then encoded using the assigned codes,
where each symbol is replaced by its corresponding code.
Let's consider an example where we want to encode the following message using
Huffman coding:
"ABBCCCDDDD"
Symbol frequencies in the message:
Symbol  Frequency
A       1
B       2
C       3
D       4
Sorted by frequency (step 2):
Symbol  Frequency
D       4
C       3
B       2
A       1
The tree is created by repeatedly combining the two lowest frequency symbols
until a single root node is obtained.
The codes are assigned to the symbols based on their position in the binary tree:
Symbol Code
D 0
C 10
B 110
A 111
The message ABBCCCDDDD is therefore encoded as:
111 110 110 10 10 10 0 0 0 0 → 1111101101010100000
That is 19 bits in total, versus 80 bits for the same message in 8-bit ASCII.
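The steps above can be sketched with a compact Huffman coder. Note that tie-breaking during tree construction can flip individual 0/1 assignments, so the exact bit patterns may differ from the table, but the code lengths (D: 1, C: 2, B: 3, A: 3) always match:

```python
# Compact Huffman coder sketch for the "ABBCCCDDDD" example.
import heapq
from collections import Counter

def huffman_codes(text):
    freq = Counter(text)
    # Heap entries: (frequency, unique tiebreak, {symbol: code-so-far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # two lowest-frequency nodes...
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}   # ...are combined,
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, count, merged))  # until one root remains
        count += 1
    return heap[0][2]

codes = huffman_codes("ABBCCCDDDD")
print({s: len(c) for s, c in sorted(codes.items())})  # {'A': 3, 'B': 3, 'C': 2, 'D': 1}
encoded = "".join(codes[s] for s in "ABBCCCDDDD")
print(len(encoded))  # 19 bits, versus 80 bits of 8-bit ASCII
```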
• Tailored Codes: Variable length coding can tailor codes to the statistical
properties of the data, resulting in shorter codes for more frequent symbols
and longer codes for less frequent symbols.
• Hardware Compatibility: Variable length codes may not be suitable for all
applications, as they may not be compatible with certain hardware or software
systems that require fixed length codes.
1. Image and Video Compression: Variable length coding is widely used in image
and video compression standards such as JPEG, MPEG, and H.264 to reduce the
data rate and improve the storage and transmission efficiency.
2. Voice and Speech Coding: Variable length coding is also used in voice and
speech coding standards such as G.711, G.729, and AMR to compress audio data
and improve the bandwidth efficiency of communication systems.
The dictionary can be created using various methods such as statistical analysis,
machine learning algorithms, or manual construction. Once the dictionary is
created, the compression process begins by scanning the data for occurrences
of phrases in the dictionary. When a match is found, the phrase is replaced with
a shorter code or symbol from the dictionary.
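One widely used realization of this idea is LZW, where the dictionary is built adaptively while scanning, so it never needs to be transmitted: the decoder reconstructs it identically. A minimal encoder sketch (an illustration of the dictionary-based family, not the only method the text allows):

```python
# LZW encoder sketch: the dictionary starts with all single bytes and grows
# with every new phrase seen; repeated phrases become single codes.
def lzw_encode(text):
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    phrase, output = "", []
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch                      # extend the current match
        else:
            output.append(dictionary[phrase])  # emit code for longest match
            dictionary[phrase + ch] = next_code
            next_code += 1
            phrase = ch
    if phrase:
        output.append(dictionary[phrase])
    return output

codes = lzw_encode("ABABABAB")
print(codes)  # [65, 66, 256, 258, 66] - 8 characters become 5 codes
```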
• Fast Decompression: Since the dictionary is shared between the encoder and
decoder, decompression can be performed quickly and efficiently. This makes
dictionary-based coding well-suited for applications that require real-time data
compression and decompression, such as video and audio streaming.
1. Dictionary Overhead: The size of the dictionary can have a significant impact
on the compression ratio and the compression speed. A larger dictionary can
capture more patterns in the data, but it also requires more memory and
processing power. A smaller dictionary, on the other hand, may not capture all
the patterns in the data, resulting in lower compression ratios.
2. Compression Speed: The time required to create the dictionary and perform
the compression can be a significant bottleneck in some applications. This can be
particularly problematic for applications that require real-time compression,
such as video and audio streaming.
Transform Coding
⚫ Common Transforms: The most widely used transform is the Discrete Cosine
Transform (DCT); the Discrete Fourier Transform (DFT) and wavelet transforms
are also used.
⚫ Coefficient Quantization: After the transform, the coefficients are quantized
(stored with reduced precision) so that perceptually unimportant, mostly
high-frequency components take fewer bits or are discarded entirely.
• Image Compression: JPEG uses the DCT to compress images, dividing the
image into small blocks (usually 8x8 pixels), applying the DCT, and
quantizing the coefficients. The result is a compressed image with
reduced file size but minimal visible quality loss.
• Video Compression: Video codecs, like MPEG, H.264, and HEVC, apply
transform coding to each frame (often using DCT or variations). They
also exploit similarities between consecutive frames, further enhancing
compression.
• Audio Compression: In audio codecs like MP3 and AAC, transform coding
is used to discard inaudible frequencies, achieving high compression with
minimal impact on perceived quality.
Step 1: The input image is divided into small blocks of 8x8 pixels, i.e., 64
pixels per block.
Step 2: JPEG uses the [Y, Cb, Cr] color model instead of the [R, G, B] model, so
in the second step RGB is converted into YCbCr.
Step 3: The Discrete Cosine Transform (DCT) is applied to each 8x8 block,
converting pixel values into frequency coefficients.
DCT Formula:
F(u, v) = (1/4) C(u) C(v) Σx Σy f(x, y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16],
for x, y = 0..7, where C(k) = 1/√2 for k = 0 and C(k) = 1 otherwise.
Step 4: Human vision is relatively insensitive to the high-frequency content of
an image, and after the DCT most of the significant values are concentrated in
the lowest-frequency coefficients. Quantization exploits this by reducing the
number of bits per sample, shrinking or zeroing out the high-frequency
coefficients. There are two forms:
1. Uniform Quantization
2. Non-Uniform Quantization
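Uniform quantization, the simpler of the two, can be sketched as follows (the step size and coefficient values are made up for illustration):

```python
# Uniform quantization of DCT coefficients: round to a multiple of the step size.
def quantize(coeff, step):
    return round(coeff / step)

def dequantize(q, step):
    return q * step

step = 16
coeffs = [235.1, -22.6, 14.2, -3.9, 1.2]  # made-up DCT coefficients
quantized = [quantize(c, step) for c in coeffs]
print(quantized)  # [15, -1, 1, 0, 0] - small high-frequency values collapse to 0
restored = [dequantize(q, step) for q in quantized]
print(restored)   # [240, -16, 16, 0, 0] - close to the originals, but not exact
```

The zeroed-out small coefficients are what later compress so well; the mismatch between `coeffs` and `restored` is exactly the "loss" in lossy compression.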
Step 5: A zigzag scan maps the 8x8 matrix to a 1x64 vector. Zigzag scanning
groups the low-frequency coefficients at the top of the vector and the
high-frequency coefficients at the bottom. Because the quantized matrix contains
a large number of zeros, the zigzag order clusters them together so they can be
compressed efficiently.
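The zigzag order can be sketched in a few lines (shown here on a 4x4 block so the output stays readable; the same function works for 8x8):

```python
# Zigzag scan: walk the anti-diagonals of a square block, alternating direction,
# so low-frequency coefficients come first in the output vector.
def zigzag(block):
    n = len(block)
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda p: (p[0] + p[1],                      # diagonal index
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))  # direction
    return [block[i][j] for i, j in order]

# Demo block whose values are its row-major indices, to make the order visible
block = [[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11],
         [12, 13, 14, 15]]
print(zigzag(block))  # [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]
```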
Step 6: Next, differential pulse code modulation (DPCM) is applied to the DC
component. The DC components are large and vary from block to block, but each is
usually close to the previous block's value, so DPCM encodes only the difference
between the current block's DC value and the previous one.
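DPCM on the DC values can be sketched as a running difference (the DC values below are made up):

```python
# DPCM: store each value as the difference from its predecessor.
def dpcm_encode(values):
    prev, diffs = 0, []
    for v in values:
        diffs.append(v - prev)
        prev = v
    return diffs

def dpcm_decode(diffs):
    values, prev = [], 0
    for d in diffs:
        prev += d
        values.append(prev)
    return values

dc = [240, 242, 239, 241, 241]  # DC values of consecutive blocks
diffs = dpcm_encode(dc)
print(diffs)                    # [240, 2, -3, 2, 0] - small numbers after the first
assert dpcm_decode(diffs) == dc  # perfectly reversible
```

The small differences need far fewer bits than the raw DC values, which is the whole point of the step.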
JPEG 4 Modes
1. Lossy Compression
JPEG's standard form of compression is lossy, meaning that some image data is
permanently discarded to achieve a smaller file size. Here’s how it works in
detail:
A. Discrete Cosine Transform (DCT)
• The first step in JPEG compression involves dividing the image into small
blocks, typically 8x8 pixels.
• Each block is transformed from the spatial domain (pixel values) into the
frequency domain using a mathematical function called the Discrete
Cosine Transform.
• The DCT separates the image data into different frequency components:
o Low frequencies contain important structural and smooth gradient
information.
o High frequencies represent finer details and noise.
• The goal of this transformation is to concentrate the image's
information in fewer coefficients, making it easier to identify and
discard less important data later in the process.
B. Quantization
• Each DCT coefficient is divided by a corresponding entry in a quantization
table and rounded to the nearest integer. High-frequency coefficients are
divided by larger values, so many of them become zero; this step causes most of
the data loss and most of the size reduction.
C. Entropy Coding
• The final step in JPEG lossy compression is entropy coding, which further
compresses the quantized coefficients using lossless methods like:
o Huffman Coding: Encodes frequently occurring values with
shorter codes and less common values with longer codes.
o Arithmetic Coding: An alternative to Huffman coding that
provides slightly better compression efficiency by using more
complex encoding algorithms.
• Entropy coding reduces the redundancy in the data, resulting in the final
compressed JPEG file.
Baseline JPEG is the most common and simplest mode of JPEG compression. It is
characterized by:
• Sequential Encoding: The image data is encoded and transmitted one row
at a time, from top to bottom.
• Compatibility: Baseline JPEG is supported by virtually all web browsers
and image viewers, making it a standard for most applications.
Video Compression
Implementing effective video compression into your digital platforms requires
knowledge of the most common codecs, digital formats, and optimization
strategies.
Video compression is the process of reducing the file size of a video file by
reducing the amount of data needed to represent its content, but without losing
too much visual information. It's what powers every internet video streaming
platform, including YouTube, Netflix, and TikTok. Video conferencing services
like FaceTime and Google Meet also compress video on the fly, with both capping
video at 1080p.
Even the clearest, most high-resolution video you watch via the internet has
been compressed, otherwise you might not be able to stream it.
• Reducing storage costs: Companies that host video deal with overhead
for storage, and compression helps them reduce that.
Video compression works by using a program called a codec to get rid of the
redundant and/or imperceptible elements of a video. This is how it reduces the
size of the file without noticeably affecting the quality of the footage. The
ratio between the original video's size and the compressed video's size is
called the compression ratio.
There are two video compression algorithms that are in popular use today:
intraframe and interframe compression.
Take a video where the camera is on a tripod and the speaker is seated in front
of a full-wall mural. Since only the speaker is moving and the background isn't,
it's possible to save only the elements that change and reuse the same version
of the rest across multiple frames of video. This is how temporal (interframe)
compression gets rid of redundant details.
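The idea above can be sketched as a frame delta: store only the pixels that changed between two frames (the "frames" here are tiny made-up rows of pixel values):

```python
# Interframe sketch: encode frame 2 as only the pixels that differ from frame 1.
def frame_delta(prev, curr):
    """Return {index: new_value} for pixels that changed between frames."""
    return {i: c for i, (p, c) in enumerate(zip(prev, curr)) if p != c}

def apply_delta(prev, delta):
    return [delta.get(i, p) for i, p in enumerate(prev)]

frame1 = [10, 10, 10, 80, 80, 10, 10, 10]   # background plus "speaker" pixels
frame2 = [10, 10, 10, 82, 85, 10, 10, 10]   # only the speaker moved
delta = frame_delta(frame1, frame2)
print(delta)  # {3: 82, 4: 85} - 2 stored values instead of 8
assert apply_delta(frame1, delta) == frame2
```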
If two or more adjacent pixels represent nearly identical colors, it's possible to
combine them into one color without it being detectable to the human eye. This
process is called chroma subsampling, and it's an example of how compression
gets rid of imperceptible details.
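A one-dimensional simplification of chroma subsampling: merge each pair of adjacent chroma samples into their average (real codecs subsample 2-D chroma planes, e.g. 4:2:0; the values here are made up):

```python
# Chroma-subsampling sketch: average adjacent pairs of nearly identical samples.
def subsample_pairs(chroma):
    return [(chroma[i] + chroma[i + 1]) // 2 for i in range(0, len(chroma), 2)]

cb = [100, 102, 99, 101, 200, 202, 201, 199]   # neighbours are nearly identical
print(subsample_pairs(cb))  # [101, 100, 201, 200] - half the samples remain
```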
Other areas where video compression reduces detail are the file's frequency
content, dynamic range, and frame rate. For example, if a video was shot at 60
frames per second, compression could easily bring that down to 24 without a
significant loss of quality.
H.261
• Macroblocks and DCT: H.261 divides each video frame into macroblocks (16x16
pixels), which are processed as follows:
o The macroblocks are further divided into smaller 8x8 blocks, and
each block undergoes the DCT. This transforms spatial pixel data
into frequency domain data, concentrating most of the visual
information into a few coefficients.
o Quantization: The DCT coefficients are quantized to reduce
precision and achieve compression. High-frequency coefficients,
which represent fine details, are often reduced or discarded.
o Quantization introduces data loss but significantly reduces the
data size, making the compression more efficient. The degree of
quantization can be adjusted based on desired compression levels.
⚫ Entropy Coding: The quantized coefficients are then compressed losslessly
using variable-length codes, so that frequently occurring values take fewer bits.
H.263
• Macroblocks and Block Structures: H.263 divides each video frame into
macroblocks (16x16 pixels), which are then further subdivided into
smaller 8x8 blocks for more efficient processing.
• Intra and Inter Prediction: Blocks can be predicted from already-decoded data
within the same frame (intra prediction) or from neighboring frames using
motion compensation (inter prediction); only the prediction error is then
transformed and coded.
• 8x8 DCT Blocks: Like H.261, each macroblock in H.263 is divided into
8x8 blocks, and the DCT is applied to convert the spatial domain data into
the frequency domain.
• Quantization: The DCT coefficients are quantized to reduce precision,
achieving compression by discarding less significant information (typically
higher frequencies). This step is lossy, contributing to compression
efficiency but leading to data loss.
D. Entropy Coding: The quantized coefficients and motion data are compressed
losslessly using variable-length codes.
H.263 was widely adopted for early internet video applications, mobile video
telephony, and streaming services. It played a significant role in low-bitrate
video compression during the early days of online multimedia and paved the way
for subsequent video standards such as H.264 (AVC), which borrowed many
concepts from H.263 but introduced significant improvements in efficiency and
versatility.
MPEG-1 − This standard is primarily used for audio and video compression for
CD-ROMs and low-quality video on the internet.
MPEG-2 − This standard is used for digital television and DVD video, as well as
high-definition television (HDTV).
MPEG-4 − This standard is used for a wide range of applications, including video
on the internet, mobile devices, and interactive media.
MPEG-7 − This standard is used for the description and indexing of audio and
video content.
MPEG uses a lossy form of compression, which means that some data is lost
when the audio or video is compressed. The degree of compression can be
adjusted, with higher levels of compression resulting in smaller file sizes but
lower quality, and lower levels of compression resulting in larger file sizes but
higher quality.
Advantages of MPEG
Widely supported − MPEG is a widely used and well-established audio and video
format, and it is supported by a wide range of media players, video editors, and
other software.
Good quality − While MPEG uses lossy compression, it can still produce good
quality audio and video at moderate to high compression levels.
Versatile − MPEG can be used with a wide range of audio and video types,
including music, movies, television shows, and other types of multimedia content.
Streamable − MPEG files can be streamed over the internet, making it easy to
deliver audio and video content to a wide audience.
Scalable − MPEG supports scalable coding, which allows a single encoded video to
be adapted to different resolutions and bitrates. This makes it well-suited for
use in applications such as video-on-demand and live streaming.
Disadvantages of MPEG
Lossy compression − Because MPEG uses lossy compression, some data is lost
when the audio or video is compressed. This can result in some loss of quality,
particularly at higher levels of compression.
Limited color depth − Some versions of MPEG have a limited color depth and can
only support 8 bits per channel. This can result in visible banding or other
artifacts in videos with high color gradations or smooth color transitions.
Non-ideal for text and graphics − MPEG is not well suited for video with sharp
transitions, high-contrast text, or graphics with hard edges. These types of
video can appear pixelated or jagged when saved as MPEG.
Complexity − The MPEG standards are complex and require specialized software
and hardware to encode and decode audio and video.
Patent fees − Some MPEG standards are covered by patents, which may require
the payment of licensing fees to use the technology.
Compatibility issues − Some older devices and software may not support newer
versions of the MPEG standard.
How big is an MPEG file, in bytes?
The size of an MPEG file depends on the length, resolution, and complexity of the
audio or video, as well as the level of compression used. In general, MPEG files
can range in size from a few hundred kilobytes to several gigabytes.
It's worth noting that the size of an MPEG file can be reduced by increasing the
level of compression or by resizing the video to a smaller resolution. However,
either of these measures can also result in a loss of quality.