Multimedia Unit-4

Multimedia notes of 4th unit

Uploaded by

Khushi Chouhan

Sanatan Dharma College, Ambala Cantt

B.C.A(3rd Year) Ms. Garima Sudan


Assistant Professor
Multimedia and tools
Unit -4

What is data compression technique?

Data compression techniques in digital communication refer to the use of specific formulas and carefully designed algorithms, applied by compression software or programs, to reduce the size of various kinds of data. There are particular types of such techniques that we will get into, but to build an overall understanding, we can first focus on the principles.

Data compression can be performed by using smaller strings of bits (0s and 1s)
in place of the original string and using a ‘dictionary’ to decompress the data if
required. Other techniques include the introduction of pointers (references) to
a string of bits that the compression program has become familiar with or
removing redundant characters.

For a video, a crude form of compression can be achieved by skipping every 3rd frame, which (as one can imagine) results in a 1/3 reduction in the size of the file. Well-designed compression can reduce data size far more dramatically, in some cases by 70% or more, without losing any significant data. Compression formats like ZIP, GZIP, etc., are used when transferring data via the internet.

The use of data compression techniques in digital communication greatly helps in reducing the time for a file transfer, the cost of storage, and traffic in the network.

Types of data compression techniques

The two common types that always stand out are:

1. Lossy

2. Lossless

Lossy compression
To understand the lossy compression technique, we must first understand the
difference between data and information. Data is a raw, often unorganized
collection of facts or values and can mean numbers, text, symbols, etc. On the
other hand, Information brings context by carefully organizing the facts.

To put this in context, a black and white image of 4×6 inches at 100 dpi (dots per inch) has 240,000 pixels. Each of these pixels holds a value between 0 and 255 representing its intensity (0 being black and 255 being white).

This image as a whole can carry information, such as: it is a picture of the 16th president of the USA, Abraham Lincoln. If we display the image at 50 dpi, i.e., with 60,000 pixels, the data required to save the image is reduced, and perhaps the quality too, but the information remains intact. Only after considerable loss of data do we lose the information.

With the above understanding of the difference between data and information,
we now can comprehend Lossy compression. As the name suggests, Lossy
compression loses data, i.e., gets rid of it to reduce the size of the data.

Advantages and disadvantages of Lossy Compression

• Advantage:

The advantage of lossy compression is that it is relatively quick, can reduce the file size dramatically, and lets the user select the compression level. It is beneficial for compressing data like images, video, and even audio because it takes advantage of the limitations of human perception: up to a certain point, our eyes and ears cannot perceive a difference in the quality of an image or an audio clip.

• Disadvantage:

The disadvantage of lossy compression is that decompressing the data will not return the original data (in terms of quality, size, etc.); it will hold only similar information (which, in fact, is acceptable in some instances, such as streaming or downloading content on the internet). On the flip side, however, repeatedly re-compressing a file as it is uploaded and downloaded can distort it beyond the point of recognition, causing permanent information loss. Similarly, if a severe level of compression is chosen, the output file might not be anywhere close to the original input file.

Lossless Compression

Lossless compression, unlike lossy compression, doesn’t remove any data; instead,
it transforms it to reduce its size. To understand the concept, we can take a
simple example.

There is a piece of text where the word ‘because’ is repeated quite often. The word consists of seven letters, and by using a shorthand or abbreviated version of it, like ‘bcz’, we can transform the text. The rule replacing ‘because’ with ‘bcz’ can be stored in a dictionary for later use (during decompression).
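The ‘because’ → ‘bcz’ replacement above can be sketched in a few lines of Python. This is only a toy illustration of the idea; a real scheme would choose replacement codes that cannot collide with ordinary text.

```python
# Toy lossless "dictionary" compression: replace a frequent word with a
# shorter token and keep the mapping for decompression.
dictionary = {"because": "bcz"}          # built during compression
reverse = {v: k for k, v in dictionary.items()}

def compress(text):
    for word, code in dictionary.items():
        text = text.replace(word, code)
    return text

def decompress(text):
    for code, word in reverse.items():
        text = text.replace(code, word)
    return text

original = "I stayed in because it rained, because the roads flooded."
packed = compress(original)
assert len(packed) < len(original)       # shorter representation
assert decompress(packed) == original    # no information lost
```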

• Methodology: While lossy compression removes redundant or unnoticeable pieces of data to reduce the size, lossless compression transforms the data by encoding it using some formula or logic.
Advantages and disadvantages of Lossless Compression

• Advantage:

There are types of data for which lossy compression is not feasible. For example, in a spreadsheet, a program, or any data made up of factual text or numbers, lossy compression cannot work: every value might be essential, none can be considered redundant, and any reduction immediately causes loss of information. Here lossless compression becomes crucial, as upon decompression the file can be restored to its original state without losing any data.

• Disadvantage:

There is a limit to data compression. If data is already compressed, then compressing it again will result in little to no reduction in its size. Also, lossless compression generally achieves lower compression ratios than lossy compression, which matters most for large files.

Data Compression Techniques: Advantages and Disadvantages

There are several advantages of using the different data compression techniques discussed above, such as-

1. Reduces the disk space occupied by the file.

2. Reading and Writing of files can be done quickly.

3. Increases the speed of transferring files through the internet and other
networks.

Even with a range of advantages, there is a tradeoff: a cost is always associated with the compression of a file. This cost results in certain disadvantages such as-

1. The processing time taken by complex data compression algorithms can be very high, especially if the data in question is large.

2. Certain compression algorithms are resource-intensive and may cause the machine to run out of memory.

3. There is a dependency on software that decompresses compressed files.

4. The associated cost of compression can also be monetary, with certain software requiring you to pay a licensing fee.

5. Incompatibility issues can occur during decompression processes.

6. Any error that occurs during the transmission of compressed data can cause significant information loss.
Data Compression Technique Model

Say you refer to a research paper or a technical PDF on data compression techniques. In that case, you will find numerous types of data compression models that use different compression algorithms pertaining to the two compression techniques discussed above.

Following are the most common data compression models-

Lossless compression technique models

The most common models based on lossless technique are-

1. RLE (Run Length Encoding)

2. Dictionary Coder (LZ77, LZ78, LZR, LZW, LZSS, LZMA, LZMA2)

3. Prediction by Partial Matching (PPM)

4. Deflate

5. Context Mixing

6. Huffman Encoding

7. Adaptive Huffman Coding

8. Shannon Fano Encoding

9. Arithmetic Encoding

10. Lempel Ziv Welch Encoding

11. Zstandard
12. Bzip2 (Burrows-Wheeler)

Lossy compression technique models-

The most common models based on the lossy technique are-

1. Transform coding

2. Discrete Cosine Transform

3. Discrete Wavelet Transform

4. Fractal Compression

Neural network-based models

Some neural network-based models are also used for compression, such as-

1. Multilayer Perceptron (MLP) based compression (used for image compression)

2. Convolutional Neural Network (CNN) based compression, such as Deep Coder (used for video compression)

3. Generative Adversarial Network (GAN) based compression (used for real-time compression)


Run-length coding

Run-length encoding (RLE) is a very basic type of lossless data compression in which an input stream of data (such as "PPPRRRSSSSSSS") is provided, and the output is a series of counts of successive repeated data values (i.e. "3P3R7S"). Since there is no loss in this type of compression, the original data can be fully retrieved upon decompression. The algorithm's simplicity in both the encoding (compression) and decoding (decompression) processes is one of its most appealing qualities.
Example
Here's a simple example of a data stream ("run") in its original and encoded
forms:
Input:
AAAAAABBBBBCCCCCCCCDDEEEEEFFF

Output:
6A5B8C2D5E3F

In this example, the input of 29 characters is compressed by RLE into just 12 characters.
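The encoding and decoding above can be sketched in Python. In this minimal sketch counts are written as decimal digits directly before each symbol, which assumes the symbols themselves are not digits.

```python
# Minimal run-length encoder/decoder matching the example above.
def rle_encode(data):
    out = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1                        # extend the current run
        out.append(f"{j - i}{data[i]}")   # count followed by symbol
        i = j
    return "".join(out)

def rle_decode(encoded):
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch                   # accumulate multi-digit counts
        else:
            out.append(ch * int(count))   # expand the run
            count = ""
    return "".join(out)

print(rle_encode("AAAAAABBBBBCCCCCCCCDDEEEEEFFF"))  # 6A5B8C2D5E3F
```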

Key Aspects of Run-Length Coding in Multimedia

Compression Efficiency:

1. RLC is particularly effective in compressing images or audio with large areas of uniform color or sound, such as cartoons, scanned text, and simple graphic images with large sections of the same color.
2. It’s less effective for images or audio with high levels of detail or
variations, where data values change frequently, reducing the
length of runs.

Implementation in Multimedia:

1. Images: RLC is commonly used in formats like BMP, TIFF, and fax
transmissions. For example, in an image with a large region of the
same color, RLC encodes this region by storing the color and the
number of pixels with that color in a row.
2. Video: RLC is often used in video compression as part of
algorithms like MPEG and H.264, specifically in encoding parts of
frames with minimal changes.
3. Audio: RLC is less common in audio but can be applied to simplify
sequences of silence or consistent background noise.

Applications:

1. Fax and Scanned Images: Fax images and certain types of scanned documents often consist of large white and black regions, making them ideal for RLC.
2. Lossless Compression: Since RLC simply encodes data without
removing it, it’s suitable for lossless compression, where quality
cannot be compromised.
3. Embedded Systems: Its low complexity makes RLC suitable for
hardware-based or resource-limited systems, such as embedded
devices for image storage and transmission.
Example of Run-Length Coding

For an image row with pixel values: [0, 0, 0, 255, 255, 255, 255, 0, 0], RLC would
encode it as:

• [0, 3] (three zeros)
• [255, 4] (four 255s)
• [0, 2] (two zeros)

Thus, the original sequence of 9 values is reduced to just 6 values, demonstrating RLC's efficiency in this context.

Advantages of RLC in Multimedia

• Simplicity: RLC is computationally simple, making it easy to implement and execute, even on low-power devices.
• No Data Loss: RLC is lossless, so it doesn’t sacrifice quality.

Disadvantages

• Inefficiency for Complex Data: For complex or noisy data, where consecutive values rarely repeat, RLC can even increase the file size instead of reducing it.
• Limited Scope: RLC is mostly useful for specific data types like simple
images and not suited for complex multimedia formats with high
variability.

In multimedia, RLC is typically combined with other compression techniques to optimize results, especially for more detailed media.

Variable length coding

Variable Length Coding (VLC) is a technique used in digital communication and data compression, designed to represent information using fewer bits than the original representation. This is achieved by assigning shorter codes to more frequent data symbols, and longer codes to less frequent symbols. The technique has been widely used in digital communication and data compression, particularly in image and video coding, where the high redundancy of data calls for efficient coding techniques to reduce the data rate.

How Variable Length Coding Works

Variable Length Coding works by assigning a unique code to each data symbol
based on its frequency. In the simplest form, the code can be a binary string,
where the most common symbols are assigned shorter codes, and less frequent
symbols are assigned longer codes. This is called Huffman coding, which is a
widely used technique for data compression.

The basic steps involved in the Huffman coding algorithm are as follows:

1. Frequency Calculation: The frequency of occurrence of each symbol in the input data stream is calculated.

2. Symbol Sorting: The symbols are sorted based on their frequency, with the
most frequent symbol placed at the top.

3. Tree Creation: A binary tree is created, where each leaf node represents a
symbol and its frequency, and each internal node represents the sum of the
frequencies of its child nodes.

4. Code Assignment: The codes are assigned to the symbols based on their
position in the binary tree. The code assigned to a symbol is obtained by
traversing the tree from the root to the leaf node representing the symbol, and
appending a 0 or 1 depending on whether the left or right branch is taken at
each internal node.

5. Data Encoding: The input data is then encoded using the assigned codes,
where each symbol is replaced by its corresponding code.

Example of Variable Length Coding

Let's consider an example where we want to encode the following message using
Huffman coding:

"ABBCCCDDDD"

Step 1: Frequency Calculation

Symbol Frequency
A 1
B 2
C 3
D 4

Step 2: Symbol Sorting

Symbol Frequency
D 4
C 3
B 2
A 1

Step 3: Tree Creation

The tree is created by repeatedly combining the two lowest frequency symbols
until a single root node is obtained.

Step 4: Code Assignment

The codes are assigned to the symbols based on their position in the binary tree:

Symbol Code
D 0
C 10
B 110
A 111

Step 5: Data Encoding

The input message is then encoded by concatenating the assigned codes:

ABBCCCDDDD
111 110 110 10 10 10 0 0 0 0 → 1111101101010100000 (19 bits, down from 80 bits at 8 bits per character)
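A minimal Python sketch of steps 1-5, using the standard library's heapq. The exact 0/1 bit patterns depend on tie-breaking while the tree is built, so they may differ from the table above, but the code lengths (1 bit for D, 2 for C, 3 for B and A) always match, and the 10-symbol message always encodes to 19 bits.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    freq = Counter(text)
    # heap entries: (frequency, tiebreak, {symbol: code-so-far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # two lowest-frequency nodes
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}   # left branch -> 0
        merged.update({s: "1" + c for s, c in c2.items()})  # right -> 1
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

codes = huffman_codes("ABBCCCDDDD")
encoded = "".join(codes[s] for s in "ABBCCCDDDD")
print({s: len(c) for s, c in sorted(codes.items())})  # {'A': 3, 'B': 3, 'C': 2, 'D': 1}
print(len(encoded))                                   # 19
```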

Advantages of Variable Length Coding

1. Efficient Data Compression: Variable length coding is an efficient data compression technique that can significantly reduce the data rate without loss of information.

2. Tailored Codes: Variable length coding can tailor codes to the statistical
properties of the data, resulting in shorter codes for more frequent symbols
and longer codes for less frequent symbols.

3. Prefix Property: Huffman-style variable length codes are prefix-free, meaning no codeword is a prefix of another, so the encoded bitstream can be decoded unambiguously without separators between codewords. (Note, however, that variable length coding is more sensitive to transmission errors than fixed length coding: a single flipped bit can desynchronize the decoder for many symbols that follow.)

Disadvantages of Variable Length Coding

1. Complexity: The process of constructing the Huffman tree and assigning codes can be computationally expensive for large datasets, making it impractical for some real-time applications.
2. Overhead: Variable length coding introduces additional overhead in the form
of codebook or dictionary, which is required to decode the data.

3. Variable Length Codes: Variable length codes may not be suitable for all
applications, as they may not be compatible with certain hardware or software
systems that require fixed length codes.

4. Sensitivity to Data: Variable length coding is highly sensitive to the statistical properties of the data, and may not perform well for data with highly variable or unpredictable distributions.

Applications of Variable Length Coding

1. Image and Video Compression: Variable length coding is widely used in image
and video compression standards such as JPEG, MPEG, and H.264 to reduce the
data rate and improve the storage and transmission efficiency.

2. Voice and Speech Coding: Variable length coding is also used in voice and
speech coding standards such as G.711, G.729, and AMR to compress audio data
and improve the bandwidth efficiency of communication systems.

3. Text Compression: Variable length coding is used in text compression applications such as ZIP and gzip to compress textual data and reduce the storage and transmission requirements.

4. Network Protocols: Variable length coding also appears in network protocols; for example, HTTP/2 header compression (HPACK) uses Huffman codes to encode headers and control information, which reduces overhead and improves the efficiency of data transfer.

Dictionary based coding

Dictionary-based coding is a data compression technique that uses a dictionary to represent repetitive patterns in data. It works by replacing frequently occurring phrases or strings in the data with a shorter code or symbol, resulting in a compressed version of the original data. This technique is widely used in various applications that require efficient storage and transmission of large amounts of data.

In this article, we will discuss the working of dictionary-based coding, its advantages and disadvantages, and examples of its use in various applications.

How Dictionary-Based Coding Works

Dictionary-based coding is a lossless compression technique, which means that the compressed data can be restored to its original form without any loss of information. This technique works by identifying repetitive patterns in the data and creating a dictionary of these patterns. The dictionary is a collection of phrases or strings that occur frequently in the data.

The dictionary can be created using various methods such as statistical analysis,
machine learning algorithms, or manual construction. Once the dictionary is
created, the compression process begins by scanning the data for occurrences
of phrases in the dictionary. When a match is found, the phrase is replaced with
a shorter code or symbol from the dictionary.

The compressed data is typically represented as a sequence of codes or symbols that correspond to the phrases in the dictionary. The decoder uses the same dictionary to restore the original data by replacing the codes or symbols with the corresponding phrases.

Advantages of Dictionary-Based Coding

1. Efficient Compression: Dictionary-based coding can achieve high compression ratios by identifying and replacing repetitive patterns in the data. This technique is especially effective for compressing text data, such as documents, where words and phrases are often repeated.

2. Fast Decompression: Since the dictionary is shared between the encoder and
decoder, decompression can be performed quickly and efficiently. This makes
dictionary-based coding well-suited for applications that require real-time data
compression and decompression, such as video and audio streaming.

3. Flexibility: Dictionary-based coding can be adapted to different types of data and applications by using different dictionaries and compression algorithms. This flexibility allows it to be used in a wide range of applications, from text compression to image and video compression.

Disadvantages of Dictionary-Based Coding

1. Dictionary Overhead: The size of the dictionary can have a significant impact
on the compression ratio and the compression speed. A larger dictionary can
capture more patterns in the data, but it also requires more memory and
processing power. A smaller dictionary, on the other hand, may not capture all
the patterns in the data, resulting in lower compression ratios.

2. Compression Speed: The time required to create the dictionary and perform
the compression can be a significant bottleneck in some applications. This can be
particularly problematic for applications that require real-time compression,
such as video and audio streaming.

3. Limited Compression for Random Data: Dictionary-based coding is most effective for compressing data that contains repetitive patterns. If the data is random or contains few repeating patterns, the compression ratio may be low.

Examples of Dictionary-Based Coding

1. ZIP Compression: ZIP is a popular archive format whose DEFLATE method combines dictionary-based (LZ77) coding with Huffman coding. The LZ77 stage uses a sliding window of up to 32KB as its dictionary. The format is widely used for compressing and archiving files for storage and transmission.

2. JPEG Compression: The JPEG image format is often grouped here, although strictly its entropy-coding stage uses Huffman tables rather than a phrase dictionary: frequently occurring coefficient values are mapped to short codes. This technique allows JPEG to achieve high compression ratios while maintaining high image quality.

3. LZW Compression: LZW is a dictionary-based compression algorithm that is widely used for compressing text data. The LZW algorithm builds a dynamic dictionary that captures repeating patterns as the data is scanned, which allows it to achieve high compression ratios for text data.

(algorithm explained in class)
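As a companion to the in-class explanation, here is a minimal Python sketch of LZW. The dictionary starts with all single characters and grows as longer repeated phrases are seen, so no dictionary has to be transmitted; the decoder rebuilds it as it goes.

```python
def lzw_encode(text):
    dictionary = {chr(i): i for i in range(256)}   # all single characters
    current, output = "", []
    for ch in text:
        if current + ch in dictionary:
            current += ch                          # extend the current phrase
        else:
            output.append(dictionary[current])
            dictionary[current + ch] = len(dictionary)  # learn new phrase
            current = ch
    if current:
        output.append(dictionary[current])
    return output

def lzw_decode(codes):
    dictionary = {i: chr(i) for i in range(256)}
    prev = dictionary[codes[0]]
    out = [prev]
    for code in codes[1:]:
        # corner case: the code may refer to the entry still being built
        entry = dictionary.get(code, prev + prev[0])
        out.append(entry)
        dictionary[len(dictionary)] = prev + entry[0]
        prev = entry
    return "".join(out)

codes = lzw_encode("TOBEORNOTTOBEORTOBEORNOT")
assert len(codes) < 24                             # fewer codes than characters
assert lzw_decode(codes) == "TOBEORNOTTOBEORTOBEORNOT"
```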

Transform Coding

Transform coding is a fundamental technique used in multimedia compression, especially for image, video, and audio data. Unlike run-length coding, which exploits repeated patterns, transform coding converts spatial- or time-domain data into the frequency domain, where the data can be reduced more effectively by keeping the most significant components and discarding less critical ones.

Key Concepts of Transform Coding in Multimedia

⚫ Frequency Domain Transformation:

1. Transform coding involves converting the original multimedia data (spatial domain for images, time domain for audio) into a frequency domain using mathematical transforms, such as the Discrete Cosine Transform (DCT) or Discrete Wavelet Transform (DWT).
2. In the frequency domain, high-frequency components (rapid changes in
data) are often less significant to human perception than low-frequency
components (smooth areas or slow changes). Transform coding exploits
this to reduce data.

⚫ Common Transforms:

1. Discrete Cosine Transform (DCT): The DCT is widely used in multimedia compression standards, such as JPEG for images and MPEG for video. It expresses a signal or image as a sum of cosine functions oscillating at different frequencies.
2. Discrete Wavelet Transform (DWT): DWT is used in newer image
formats, like JPEG2000. It divides the signal into wavelets, which
represent data at multiple scales, making it highly effective for multi-
resolution analysis.
3. Fourier Transform: Although less common in direct multimedia
compression, the Fourier Transform is a mathematical foundation for
frequency-based analysis.

⚫ Coefficient Quantization:

1. Once the transform is applied, the resulting frequency coefficients are quantized. Quantization reduces the precision of these coefficients, significantly reducing data size. This step introduces controlled losses in data, which is why transform coding is usually a lossy compression technique.
2. The higher-frequency coefficients, which contain less visually or audibly
significant information, are often quantized more aggressively or even
discarded to maximize compression efficiency.

⚫ Compression and Reconstruction:

1. After quantization, the remaining coefficients are encoded and stored or transmitted. To reconstruct the original data, an inverse transform (e.g., inverse DCT or inverse DWT) is applied to the quantized coefficients, approximating the original image or audio.
2. The quantization and inverse transform steps introduce some loss, but
they aim to retain enough detail to maintain perceptual quality.

Applications of Transform Coding in Multimedia

• Image Compression: JPEG uses the DCT to compress images, dividing the
image into small blocks (usually 8x8 pixels), applying the DCT, and
quantizing the coefficients. The result is a compressed image with
reduced file size but minimal visible quality loss.
• Video Compression: Video codecs, like MPEG, H.264, and HEVC, apply
transform coding to each frame (often using DCT or variations). They
also exploit similarities between consecutive frames, further enhancing
compression.
• Audio Compression: In audio codecs like MP3 and AAC, transform coding
is used to discard inaudible frequencies, achieving high compression with
minimal impact on perceived quality.

Advantages of Transform Coding

• High Compression Ratios: Transform coding provides significant data reduction while maintaining perceptual quality, especially for multimedia data like images, audio, and video.
• Efficient Storage and Transmission: The reduced file size makes it ideal
for applications where storage and bandwidth are constrained.
• Flexibility with Different Data Types: Transform coding is versatile and
can be adapted for images, audio, and video.

Disadvantages of Transform Coding

• Complexity: Transform coding involves complex mathematical operations, making it more computationally intensive than simpler methods like Run-Length Coding.
• Lossy Compression: Transform coding typically introduces data loss due
to quantization, which may lead to visible or audible artifacts if overly
compressed.
• Block Artifacts: In block-based transform coding (like JPEG), noticeable
block artifacts can appear in highly compressed images or videos,
especially around the edges of 8x8 or 16x16 blocks.

Example of Transform Coding: JPEG Compression

1. Block Division: The image is divided into 8x8 pixel blocks.
2. DCT Transformation: Each block is transformed from the spatial
domain to the frequency domain using DCT.
3. Quantization: The DCT coefficients are quantized, with higher-
frequency components reduced or discarded.
4. Encoding: The quantized coefficients are further compressed
(often with methods like Huffman coding).
5. Reconstruction: The inverse DCT reconstructs the image upon
decompression, closely approximating the original.
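The transform-and-invert part of these steps can be illustrated with a 1-D 8-point DCT; JPEG applies the 2-D equivalent (the same transform along rows, then columns of each 8x8 block). This is a simplified sketch using the orthonormal DCT-II, not the full JPEG pipeline, and the pixel row is an arbitrary illustrative example.

```python
import math

N = 8

def c(k):
    # normalization so that the transform is orthonormal
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def dct(x):          # spatial -> frequency domain (DCT-II)
    return [c(k) * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                       for n in range(N))
            for k in range(N)]

def idct(X):         # frequency -> spatial domain (DCT-III, the inverse)
    return [sum(c(k) * X[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
            for n in range(N)]

row = [52, 55, 61, 66, 70, 61, 64, 73]     # one row of pixel values
coeffs = dct(row)
restored = idct(coeffs)
assert max(abs(a - b) for a, b in zip(row, restored)) < 1e-9
# Most of the energy lands in the first (low-frequency) coefficients,
# which is what makes aggressive quantization of the rest cheap.
```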

Image and video compression techniques

Introduction to JPEG Compression

JPEG is an image compression standard developed by the Joint Photographic Experts Group and accepted as an international standard in 1992. JPEG is a lossy image compression method that uses the DCT (Discrete Cosine Transform) for its coding transformation, and it allows the degree of compression (the tradeoff between storage size and quality) to be adjusted.

Following are the steps of JPEG Image Compression-

Step 1: The input image is divided into small blocks of 8x8 pixels, i.e., 64 pixels per block.

Step 2: JPEG uses the [Y, Cb, Cr] color model instead of the [R, G, B] model, so in the 2nd step RGB is converted into YCbCr.

Step 3: After the color conversion, each block is forwarded to the DCT. The DCT uses cosine functions and does not involve complex numbers. It converts the information in a block of pixels from the spatial domain to the frequency domain.
DCT formula (for an 8x8 block): F(u, v) = (1/4) C(u) C(v) ΣΣ f(x, y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16], summed over x, y = 0..7, where C(k) = 1/√2 for k = 0 and C(k) = 1 otherwise.
Step 4: Humans cannot perceive the finest details of an image, because those correspond to high frequencies; after the DCT, most of the perceptually important content is concentrated in the lowest-frequency coefficients. Quantization is used to reduce the number of bits per sample, discarding much of the high-frequency information.

There are two types of Quantization:

1. Uniform Quantization
2. Non-Uniform Quantization

Step 5: A zigzag scan is used to map the 8x8 matrix to a 1x64 vector. Zigzag scanning groups the low-frequency coefficients at the top of the vector and the high-frequency coefficients at the bottom, collecting the large number of zeros in the quantized matrix into long runs.
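One way to generate the standard zigzag order in Python: cells are visited anti-diagonal by anti-diagonal (constant i+j), alternating direction on odd and even diagonals, so low-frequency coefficients come first.

```python
def zigzag_order(n=8):
    cells = [(i, j) for i in range(n) for j in range(n)]
    # sort by anti-diagonal i+j; alternate direction on odd/even diagonals
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else -p[0]))

order = zigzag_order()
print(order[:6])   # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]

def zigzag(block):           # 8x8 matrix -> 1x64 vector
    return [block[i][j] for i, j in zigzag_order()]
```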

Step 6: The next step is vectoring: differential pulse code modulation (DPCM) is applied to the DC component. DC components are large and vary from block to block, but each is usually close to the previous block's value, so DPCM encodes only the difference between the current block's DC value and the previous one.
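The DPCM step can be sketched as follows; the DC values are hypothetical, chosen only to show that the differences are small and therefore cheap to code.

```python
def dpcm_encode(dc_values):
    prev, diffs = 0, []
    for v in dc_values:
        diffs.append(v - prev)   # difference from the previous block's DC
        prev = v
    return diffs

def dpcm_decode(diffs):
    prev, out = 0, []
    for d in diffs:
        prev += d                # accumulate differences back into values
        out.append(prev)
    return out

dc = [240, 242, 239, 241]        # hypothetical DC values, one per block
assert dpcm_encode(dc) == [240, 2, -3, 2]
assert dpcm_decode(dpcm_encode(dc)) == dc
```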

Step 7: In this step, Run Length Encoding (RLE) is applied to the AC components, because they contain a lot of zeros. They are encoded as (skip, value) pairs, in which skip is the number of zeros preceding the next non-zero coefficient and value is that non-zero coefficient.
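A minimal sketch of the (skip, value) encoding; following JPEG convention, a (0, 0) pair is assumed here as the end-of-block (EOB) marker once only zeros remain.

```python
def rle_ac(coeffs):
    pairs, zeros = [], 0
    for v in coeffs:
        if v == 0:
            zeros += 1                   # count zeros before the next value
        else:
            pairs.append((zeros, v))     # (skip, value) pair
            zeros = 0
    pairs.append((0, 0))                 # EOB: nothing but zeros remain
    return pairs

ac = [-3, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0]
assert rle_ac(ac) == [(0, -3), (2, 2), (3, 1), (0, 0)]
```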

Step 8: Finally, the DC differences and the (skip, value) pairs are Huffman coded.

JPEG 4 Modes

• Sequential DCT based (Lossy)
• Progressive DCT based (Lossy)
• Sequential lossless, DPCM based
• Hierarchical
JPEG (Joint Photographic Experts Group) compression is a widely used method
for reducing the file size of digital images while balancing image quality. It is
primarily used for photographic images due to its ability to retain fine color and
detail at high compression rates. Here is a detailed breakdown of its modes of
compression:

1. Lossy Compression

JPEG's standard form of compression is lossy, meaning that some image data is
permanently discarded to achieve a smaller file size. Here’s how it works in
detail:

A. Discrete Cosine Transform (DCT)

• The first step in JPEG compression involves dividing the image into small
blocks, typically 8x8 pixels.
• Each block is transformed from the spatial domain (pixel values) into the
frequency domain using a mathematical function called the Discrete
Cosine Transform.
• The DCT separates the image data into different frequency components:
o Low frequencies contain important structural and smooth gradient
information.
o High frequencies represent finer details and noise.
• The goal of this transformation is to concentrate the image's
information in fewer coefficients, making it easier to identify and
discard less important data later in the process.

B. Quantization

• After the DCT transformation, the frequency components are quantized. This step drastically reduces the data size by dividing each coefficient by a quantization value and rounding it to the nearest integer.
• Quantization is where most of the compression occurs and is responsible
for the loss of image detail:
o Lower frequencies (essential details) are given higher priority
(lower quantization values) to preserve essential features of the
image.
o Higher frequencies (fine details and noise) are aggressively
quantized, often leading to zero values, which results in the loss of
details and smoother areas.
• The extent of quantization is adjustable, enabling users to balance image
quality and file size.
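The divide-and-round step can be sketched on one row of coefficients. Both the coefficients and the quantization values below are illustrative, not the standard JPEG table; note how the smaller divisors sit at the low-frequency end.

```python
coeffs = [183.4, -21.7, 10.2, -4.1, 1.6, -0.8, 0.3, -0.1]
qtable = [16, 11, 10, 16, 24, 40, 51, 61]   # smaller divisors for low freqs

# divide each coefficient by its quantization value and round
quantized = [round(c / q) for c, q in zip(coeffs, qtable)]
print(quantized)          # most high-frequency entries collapse to 0

# dequantization on decode only approximates the originals -- this
# rounding is where the "lossy" in lossy compression happens
dequantized = [v * q for v, q in zip(quantized, qtable)]
```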

C. Entropy Coding
• The final step in JPEG lossy compression is entropy coding, which further
compresses the quantized coefficients using lossless methods like:
o Huffman Coding: Encodes frequently occurring values with
shorter codes and less common values with longer codes.
o Arithmetic Coding: An alternative to Huffman coding that
provides slightly better compression efficiency by using more
complex encoding algorithms.
• Entropy coding reduces the redundancy in the data, resulting in the final
compressed JPEG file.

2. Progressive JPEG Compression

In progressive JPEG compression, the image is encoded in multiple passes or scans of increasing detail. This mode is particularly useful for web applications, where a low-resolution version of the image can be quickly displayed and subsequent passes progressively improve the resolution and detail.

• First Scan: Initially, a rough approximation of the image is displayed, giving users a quick preview.
• Subsequent Scans: Each pass adds more detail, refining the image
progressively until it reaches full quality.
• This method does not change the overall file size significantly but
provides a better user experience for images viewed over slower
connections.

3. Baseline JPEG Compression

Baseline JPEG is the most common and simplest mode of JPEG compression. It is
characterized by:

• Sequential Encoding: The image data is encoded and transmitted one row
at a time, from top to bottom.
• Compatibility: Baseline JPEG is supported by virtually all web browsers
and image viewers, making it a standard for most applications.

4. Lossless JPEG Compression

JPEG also offers a less commonly used lossless compression mode:

• In this mode, no data is discarded, and the image is compressed without
any loss of quality.
• It uses predictive coding instead of the DCT-based transformation:

o Prediction: Each pixel’s value is predicted based on the values of
neighboring pixels.
o Difference Calculation: The difference between the predicted
and actual pixel values is encoded and stored.
• Lossless compression results in a lower compression ratio compared to
lossy JPEG but retains the full original image quality.
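The predict-and-store-differences idea can be sketched with the simplest possible predictor, each pixel predicted from its left neighbour (real lossless JPEG offers several neighbour-based predictors; the function names here are ours):

```python
def predict_left(row):
    """Lossless-JPEG-style 1-D predictor: predict each pixel from its
    left neighbour and store only the differences (residuals)."""
    residuals = [row[0]]  # first pixel has no neighbour; store it raw
    for prev, cur in zip(row, row[1:]):
        residuals.append(cur - prev)
    return residuals

def reconstruct(residuals):
    """Invert the predictor: a running sum restores the exact pixels."""
    row = [residuals[0]]
    for r in residuals[1:]:
        row.append(row[-1] + r)
    return row
```

In smooth image regions the residuals cluster near zero, so they entropy-code to fewer bits, and since reconstruction is exact, no quality is lost.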

5. Hierarchical JPEG Compression

This mode is designed for multi-resolution images:

• The image is encoded at multiple resolutions, starting from a low-
resolution version and progressively increasing to higher resolutions.
• Different resolutions can be displayed or transmitted based on network
conditions, display requirements, or zoom levels.

Key Points to Remember

• Lossy compression results in higher compression ratios but sacrifices
image quality by removing less visually significant data.
• Progressive JPEGs improve user experience by displaying images
incrementally.
• Lossless JPEG preserves the original image data but achieves less
compression.
• Baseline JPEG is the simplest, widely supported form used for most
JPEG images.
• Hierarchical mode allows flexible resolution handling for different
applications.

Video Compression
Implementing effective video compression into your digital platforms requires
knowledge of the most common codecs, digital formats, and optimization
strategies.

What Is Video Compression?

Video compression is the process of reducing the file size of a video by
reducing the amount of data needed to represent its content, without losing
too much visual information. It's what powers every internet video streaming
platform, including YouTube, Netflix, and TikTok. Video conferencing services
like FaceTime and Google Meet also compress video on the fly, typically
capping resolution at 1080p.

Even the clearest, most high-resolution video you watch via the internet has
been compressed, otherwise you might not be able to stream it.

There are two primary types of compression: lossy and lossless.


• Lossy compression prioritizes space savings, meaning quality might take
more of a hit in favor of reducing the file size as much as possible. The
original and the compressed versions will have some differences in quality
levels, but they're usually not enough to impact the viewing experience.
It's ideal for cases where bandwidth is a concern, which is why most
streaming platforms use it.

• Lossless compression retains all of the original detail, so you usually
end up with larger file sizes. This type of compression is a popular choice
for archival purposes and is less common in the context of video
streaming.

Some of the benefits of using video compression include:

• Reducing storage costs: Companies that host video deal with overhead
for storage, and compression helps them reduce that.

• Improving the user experience: Compressed video requires less
bandwidth to transmit, giving users smooth playback even on slower connections.

• Increasing device compatibility: When video files are compressed,
they're converted into formats that a wider range of devices can play.

How Does Video Compression Work?

Video compression works by using a program called a codec to get rid of the
redundant and/or imperceptible elements of a video. This is how it reduces the
size of the file without visibly affecting the quality of the footage. The ratio
of the original video's size to the compressed video's size is called the
compression ratio.
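The ratio itself is simple arithmetic; a sketch with made-up sizes:

```python
def compression_ratio(original_bytes, compressed_bytes):
    """Ratio of original size to compressed size: 5.0 means 5:1."""
    return original_bytes / compressed_bytes

# A 1 GB raw clip compressed to 200 MB gives a 5:1 ratio.
ratio = compression_ratio(1_000_000_000, 200_000_000)
```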

There are two video compression algorithms that are in popular use today:
intraframe and interframe compression.

1. Intraframe or spatial compression finds disposable elements in every
video frame and compresses each one individually. It saves more detail
from the original video and is ideal for footage with a lot of movement
and visual changes.

2. Interframe or temporal compression works on a chunk of video at once,
encoding only the changes between frames. It saves more space and is
ideal for footage with little movement.

3. Modern video compression tools often leverage both of these algorithms
to get the best results on each task.

Take a video where the camera is on a tripod and the speaker is seated in front
of a full-wall mural. Since only the speaker is moving and the background isn't,
it's possible to save only the elements that change and reuse the same version
of the rest across multiple frames of video. This is how temporal compression
gets rid of redundant details.
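The tripod example can be sketched as simple frame differencing, a toy model in which a frame is a flat list of pixel values and unchanged pixels are marked for reuse (function names and the `None` convention are our own simplifications):

```python
def frame_delta(prev_frame, cur_frame, threshold=0):
    """Temporal-compression sketch: keep only pixels that changed from
    the previous frame; unchanged pixels become None (reused)."""
    return [cur if abs(cur - prev) > threshold else None
            for prev, cur in zip(prev_frame, cur_frame)]

def apply_delta(prev_frame, delta):
    """Reconstruct the current frame: previous frame plus the delta."""
    return [prev if d is None else d for prev, d in zip(prev_frame, delta)]
```

A mostly static scene produces a delta that is almost entirely `None`, which is exactly the redundancy a real codec encodes in a few bits.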

If two or more adjacent pixels represent nearly identical colors, it's possible to
combine them into one color without it being detectable to the human eye. This
process is called chroma subsampling, and it's an example of how compression
gets rid of imperceptible details.
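A minimal sketch of that averaging step, assuming the chroma channel is a 2-D grid of sample values and using 4:2:0-style 2x2 averaging (a simplification; real codecs use defined filter taps):

```python
def subsample_420(chroma):
    """4:2:0-style chroma subsampling sketch: average each 2x2 block of
    chroma samples into one value, quartering the chroma data."""
    out = []
    for y in range(0, len(chroma), 2):
        row = []
        for x in range(0, len(chroma[0]), 2):
            total = (chroma[y][x] + chroma[y][x + 1] +
                     chroma[y + 1][x] + chroma[y + 1][x + 1])
            row.append(total // 4)
        out.append(row)
    return out
```

Because the eye is far less sensitive to color resolution than to brightness, this quarter-size chroma channel is usually indistinguishable from the original.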

Video compression can also reduce detail in a file's frequency content,
dynamic range, and frame rate. For example, if a video was shot at 60 frames
per second, compression could bring that down to 24 without a significant
loss of perceived quality.

After compression, delivery of video content on slow network connections
becomes easier, through avenues like the Secure Reliable Transport (SRT)
protocol.

H.261

H.261 is one of the earliest video compression standards, developed by the
International Telecommunication Union (ITU) for video conferencing and
telephony over Integrated Services Digital Network (ISDN) lines. It served as a
foundation for later video compression standards, including H.263, H.264 (AVC),
and H.265 (HEVC). Here is a detailed overview of the key features and
techniques used in H.261 compression:

Key Features of H.261

• Target Bitrate: H.261 was designed to operate at bitrates of p × 64 kbps
(p = 1 to 30), i.e. from 64 kbps up to about 2 Mbps.
• Frame Types: H.261 defines two types of frames:
o Intra Frames (I-frames): These are encoded independently of
any other frames, similar to standalone images. They serve as
reference frames for subsequent frames, providing a starting
point for motion compensation.
o Inter Frames (P-frames): These use temporal prediction to
reduce redundancy by referencing previous I-frames or P-frames,
thus encoding only changes (motion) between consecutive frames.

Key Compression Techniques in H.261

⚫ Block-based Motion Compensation (Inter-frame Compression)

o Macroblocks: The image is divided into 16x16 pixel macroblocks
for processing.
o Motion Estimation and Compensation: H.261 reduces temporal
redundancy by tracking the movement of macroblocks between
consecutive frames. If a macroblock has moved, its motion vector
is calculated and encoded. Only the differences between the
predicted and actual blocks are transmitted.
o Predicted (P) Frames: These frames use data from previously
transmitted frames to predict and encode changes. This is more
efficient than transmitting the entire frame.

⚫ Discrete Cosine Transform (DCT) (Intra-frame Compression)

o The macroblocks are further divided into smaller 8x8 blocks, and
each block undergoes the DCT. This transforms spatial pixel data
into frequency domain data, concentrating most of the visual
information into a few coefficients.
o Quantization: The DCT coefficients are quantized to reduce
precision and achieve compression. High-frequency coefficients,
which represent fine details, are often reduced or discarded.
o Quantization introduces data loss but significantly reduces the
data size, making the compression more efficient. The degree of
quantization can be adjusted based on desired compression levels.

⚫ Entropy Coding

o Run-Length Encoding (RLE): This technique is used to encode
sequences of zeros that frequently occur in the quantized DCT
coefficients, reducing the data size.
o Variable-Length Coding (VLC): Symbols (like motion vectors and
DCT coefficients) are encoded using Huffman coding or similar
variable-length coding techniques, with more common symbols
assigned shorter codes.

⚫ Loop Filter (Deblocking Filter)

o To minimize visible blocking artifacts (caused by dividing the
frame into blocks), H.261 employs a simple deblocking filter that
smooths out edges and reduces noticeable blockiness.

H.261 Compression Process

1. Frame Division: Each video frame is divided into macroblocks of 16x16
pixels.
2. Prediction and Motion Compensation: For inter-frame compression,
motion vectors are calculated to predict the position of macroblocks in
subsequent frames.
3. DCT Transformation: Each 8x8 block within a macroblock is transformed
using the DCT to separate frequency components.
4. Quantization: DCT coefficients are quantized, discarding less significant
data to achieve compression.
5. Encoding: Quantized coefficients, motion vectors, and other data are
encoded using variable-length and run-length coding.
6. Decoding: The process is reversed during decoding, with motion vectors
and quantized coefficients used to reconstruct the video.
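The motion estimation in step 2 is typically done by block matching: search a small window in the reference frame for the candidate block with the lowest sum of absolute differences (SAD). A toy Python sketch (the function names and the tiny search window are our own simplifications; real encoders use 16x16 macroblocks and much larger, smarter searches):

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def block(frame, top, left, size):
    """Extract a size x size block from a frame (list of pixel rows)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def best_motion_vector(ref, cur, top, left, size, search=2):
    """Exhaustively search a window in the reference frame for the
    position best matching the current block; return (dy, dx)."""
    target = block(cur, top, left, size)
    best = (0, 0)
    best_cost = sad(block(ref, top, left, size), target)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y and 0 <= x and y + size <= len(ref) and x + size <= len(ref[0]):
                cost = sad(block(ref, y, x, size), target)
                if cost < best_cost:
                    best, best_cost = (dy, dx), cost
    return best
```

The encoder then transmits the winning vector plus the (small) prediction error instead of the whole block.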

Applications and Limitations of H.261

• Applications: H.261 was primarily used for video conferencing, telephony,
and early multimedia applications over ISDN.
• Limitations: Compared to modern video codecs like H.264 and H.265,
H.261 achieves much lower compression efficiency, and its image quality
is limited at lower bitrates. However, it laid the groundwork for future
standards by introducing many foundational concepts like macroblocks,
DCT-based compression, and motion compensation.

H.263

H.263 is a video compression standard developed as an improvement over the
earlier H.261 standard, mainly for low-bitrate communications such as video
conferencing, video telephony, and internet streaming. It was standardized by
the International Telecommunication Union (ITU) in 1995 and has since seen
multiple revisions with enhancements. H.263 builds on the core principles of its
predecessor, H.261, but introduces various improvements that make it more
efficient, leading to better compression at a given level of video quality. Here’s a
detailed explanation of how H.263 works and what makes it significant:

Key Features and Improvements Over H.261

• Enhanced Compression Efficiency: H.263 offers better compression
compared to H.261 at the same bitrate, making it suitable for
applications over low-bandwidth networks.
• Wider Range of Applications: Originally targeted for video conferencing
over ISDN and POTS (Plain Old Telephone Service) lines, H.263 has been
widely used for internet-based video streaming and mobile applications.
• Improved Motion Compensation: H.263 includes more precise motion
compensation features, enabling more efficient inter-frame prediction.

Key Techniques Used in H.263 Compression

A. Block-based Hybrid Compression


Like H.261, H.263 employs block-based compression techniques with several
enhancements. Here’s how it works:

• Macroblocks and Block Structures: H.263 divides each video frame into
macroblocks (16x16 pixels), which are then further subdivided into
smaller 8x8 blocks for more efficient processing.
• Intra and Inter Prediction:

o Intra (I) Frames: These frames are encoded independently
without reference to other frames. Intra frames provide
reference points for inter-coded frames.
o Inter (P) Frames: These frames rely on motion estimation and
compensation from previously encoded frames (I-frames or P-
frames) to reduce temporal redundancy.

B. Motion Estimation and Compensation

• Sub-pixel Precision: H.263 improves motion estimation accuracy by using
half-pixel precision for motion vectors, leading to better representation
of motion in video sequences compared to H.261's integer-pixel accuracy.
• Motion Vectors and Macroblock Prediction: Similar to H.261, H.263 uses
motion vectors to predict the movement of macroblocks between frames.
The prediction error (the difference between the predicted and actual
block) is then encoded, which typically requires less data.
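Half-pixel positions do not exist in the reference frame, so they are produced by bilinear interpolation of the surrounding full pixels. A sketch of the rounded averaging used for half-pel values (helper names are ours; the exact rounding rules vary across H.263 modes):

```python
def half_pel_between(a, b):
    """Half-pixel position between two horizontally (or vertically)
    adjacent full pixels: rounded bilinear average."""
    return (a + b + 1) // 2

def half_pel_center(a, b, c, d):
    """Half-pixel position at the centre of four full pixels."""
    return (a + b + c + d + 2) // 4
```

With these interpolated samples available, a motion vector can land "between" pixels, so a block that moved half a pixel is predicted far more accurately than with integer-pixel vectors.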

C. Discrete Cosine Transform (DCT)

• 8x8 DCT Blocks: Like H.261, each macroblock in H.263 is divided into
8x8 blocks, and the DCT is applied to convert the spatial domain data into
the frequency domain.
• Quantization: The DCT coefficients are quantized to reduce precision,
achieving compression by discarding less significant information (typically
higher frequencies). This step is lossy, contributing to compression
efficiency but leading to data loss.

D. Entropy Coding

• Variable-Length Coding (VLC): Quantized coefficients and other video
data are encoded using variable-length codes, such as Huffman coding, to
reduce redundancy.
• Run-Length Encoding (RLE): This technique is used to compress
sequences of zeros in the quantized data for more efficient storage and
transmission.

3. Advanced Compression Modes in H.263


H.263 introduced several optional features and modes that enhance compression
efficiency and video quality, especially in its later revisions:

• PB-Frames: This feature allows two frames (a P-frame and a B-frame)
to be encoded as a single unit, improving the coding efficiency for
motion prediction.
• Unrestricted Motion Vectors: This mode enables motion vectors to point
outside the boundaries of the picture, improving motion compensation
accuracy.
• Advanced Prediction Mode: This allows for four motion vectors per
macroblock, increasing motion compensation precision and improving
compression for scenes with complex motion.
• Advanced INTRA Coding: H.263 provides an optional mode for more
accurate intra-frame prediction, improving the quality of intra-coded
frames.
• Deblocking Filter: An optional deblocking filter helps reduce visible block
artifacts, enhancing visual quality at lower bitrates.

4. H.263+ and H.263++ Enhancements

Subsequent revisions introduced new optional modes and features, collectively
known as H.263+ (H.263v2) and H.263++ (H.263v3), to further improve
performance:

• H.263+ introduced additional optimizations like better support for
mobile and internet video applications, improved error resilience, and
more flexible bitstream syntax.
• H.263++ brought even more features for efficient compression and
error resilience, allowing H.263 to remain relevant as video conferencing
and streaming over unreliable networks became more widespread.

5. Applications and Legacy of H.263

H.263 was widely adopted for early internet video applications, mobile video
telephony, and streaming services. It played a significant role in low-bitrate
video compression during the early days of online multimedia and paved the way
for subsequent video standards such as H.264 (AVC), which borrowed many
concepts from H.263 but introduced significant improvements in efficiency and
versatility.

Moving Picture Experts Group (MPEG)


What is MPEG?
MPEG (Moving Picture Experts Group) is a family of standards for audio and
video compression and transmission. It is developed and maintained by the
Moving Picture Experts Group, a working group of the International
Organization for Standardization (ISO) and the International Electrotechnical
Commission (IEC).

There are several different types of MPEG standards, including −

MPEG-1 − This standard is primarily used for audio and video compression for
CD-ROMs and low-quality video on the internet.

MPEG-2 − This standard is used for digital television and DVD video, as well as
high-definition television (HDTV).

MPEG-4 − This standard is used for a wide range of applications, including video
on the internet, mobile devices, and interactive media.

MPEG-7 − This standard is used for the description and indexing of audio and
video content.

MPEG-21 − This standard is used for the delivery and distribution of
multimedia content over the internet.

MPEG uses a lossy form of compression, which means that some data is lost
when the audio or video is compressed. The degree of compression can be
adjusted, with higher levels of compression resulting in smaller file sizes but
lower quality, and lower levels of compression resulting in larger file sizes but
higher quality.

Advantage of MPEG

There are several advantages to using MPEG −

High compression efficiency − MPEG is a highly efficient compression standard
and can significantly reduce the file size of audio and video files while
maintaining good quality.

Widely supported − MPEG is a widely used and well-established audio and video
format, and it is supported by a wide range of media players, video editors, and
other software.
Good quality − While MPEG uses lossy compression, it can still produce good
quality audio and video at moderate to high compression levels.

Flexible − The degree of compression used in an MPEG file can be adjusted,
allowing you to choose the balance between file size and quality.

Versatile − MPEG can be used with a wide range of audio and video types,
including music, movies, television shows, and other types of multimedia content.

Streamable − MPEG files can be streamed over the internet, making it easy to
deliver audio and video content to a wide audience.

Scalable − MPEG supports scalable coding, which allows a single encoded video to
be adapted to different resolutions and bitrates. This makes it well-suited for
use in applications such as video-on-demand and live streaming.

Disadvantage of MPEG

There are also some disadvantages to using MPEG −

Lossy compression − Because MPEG uses lossy compression, some data is lost
when the audio or video is compressed. This can result in some loss of quality,
particularly at higher levels of compression.

Limited color depth − Some versions of MPEG have a limited color depth and can
only support 8 bits per channel. This can result in visible banding or other
artifacts in videos with high color gradations or smooth color transitions.

Non-ideal for text and graphics − MPEG is not well suited for video with sharp
transitions, high-contrast text, or graphics with hard edges. These types of
video can appear pixelated or jagged when saved as MPEG.

Complexity − The MPEG standards are complex and require specialized software
and hardware to encode and decode audio and video.

Patent fees − Some MPEG standards are covered by patents, which may require
the payment of licensing fees to use the technology.

Compatibility issues − Some older devices and software may not support newer
versions of the MPEG standard.
How big is an MPEG file, in bytes?

The size of an MPEG file depends on the length, resolution, and complexity of the
audio or video, as well as the level of compression used. In general, MPEG files
can range in size from a few hundred kilobytes to several gigabytes.

As a rough estimate, a typical high-resolution video saved as an MPEG at a
moderate level of compression might be around 500 megabytes per hour of
video. However, the exact size of an MPEG file can vary significantly depending
on the specific characteristics of the audio or video.
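That ballpark figure follows directly from bitrate times duration; a sketch (the 1.1 Mbps rate is an assumed illustrative value chosen to match the ~500 MB/hour estimate above, not a figure mandated by any MPEG standard):

```python
def mpeg_size_bytes(bitrate_kbps, duration_s):
    """Rough size estimate: bitrate (kilobits per second) times
    duration in seconds, converted from bits to bytes."""
    return bitrate_kbps * 1000 * duration_s // 8

# One hour at ~1.1 Mbps comes to roughly half a gigabyte.
hour = mpeg_size_bytes(1100, 3600)
```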

It's worth noting that the size of an MPEG file can be reduced by increasing the
level of compression or by resizing the video to a smaller resolution. However,
either step can also result in a loss of quality.
