
APJ ABDUL KALAM TECHNOLOGICAL UNIVERSITY

EIGHTH SEMESTER B.TECH DEGREE EXAMINATION, MONTH & YEAR


Course Code: CST446
Course Name: Data Compression Techniques
Max. Marks: 100    Duration: 3 Hours

PART A
Answer All Questions. Each Question Carries 3 Marks
1. Specify different quantities used to measure the performance of a data compression technique.
Compression ratio - the ratio of the number of bits required to represent the data before
compression to the number of bits required to represent the data after compression.
Rate - the average number of bits required to represent a single sample.
Distortion - in lossy compression, the reconstruction differs from the original data; the difference
between the original and the reconstruction is often called the distortion.
Fidelity and quality - when we say that the fidelity or quality of a reconstruction is high, we mean
that the difference between the reconstruction and the original is small.
(If asked for 7 marks, examples can be provided.)
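As a quick illustration of the first two measures (using the same hypothetical 256×256 image that appears in question 10(b) below), a small sketch:

```python
# Illustrative example: compression ratio and rate for a hypothetical 256x256, 8-bit image.
width, height, bits_per_pixel = 256, 256, 8

original_bits = width * height * bits_per_pixel      # 524,288 bits = 65,536 bytes
compressed_bits = 16_384 * 8                          # assume 16,384 bytes after compression

compression_ratio = original_bits / compressed_bits   # 4.0, i.e. a 4:1 compression ratio
rate = compressed_bits / (width * height)              # 2.0 bits per pixel on average

print(f"compression ratio = {compression_ratio:.0f}:1, rate = {rate:.1f} bits/pixel")
```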
2. Explain mathematical models for lossless compression
 Physical Models
• Based on the physics of the data generation process. • In speech-related applications, knowledge
about the physics of speech production can be used to construct a mathematical model for the
sampled speech process. • Models for certain telemetry data can also be obtained through
knowledge of the underlying process. • If residential electrical meter readings at hourly intervals
were to be coded, knowledge about the living habits of the populace could be used to determine
when electricity usage would be high and when it would be low. • Instead of the actual readings,
the difference (residual) between the actual readings and those predicted by the model could be
coded.
 Probability Models
• Ignorance model: assume that each letter generated by the source is independent of every
other letter, and each occurs with the same probability.
• Probability model: assume that each letter generated by the source is independent of every
other letter, and each occurs with a different probability. • For a source that generates letters
from an alphabet A = {a1, a2, ..., aM}, we can have a probability model P = {P(a1), P(a2), ..., P(aM)}.
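Under such a probability model the first-order entropy gives a lower bound on the average number of bits per symbol achievable by any lossless code. A minimal sketch with a made-up probability model:

```python
import math

# Hypothetical probability model P = {P(a1), ..., P(aM)} for an i.i.d. source.
model = {"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125}

# First-order entropy H = -sum P(ai) * log2 P(ai), in bits per symbol.
entropy = -sum(p * math.log2(p) for p in model.values())
print(f"H = {entropy:.3f} bits/symbol")   # 1.750 bits/symbol for this model
```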
3. State and prove Kraft-McMillan inequality
Let C be a code with N codewords with lengths l1, l2, ..., lN. If C is uniquely decodable, then

K(C) = \sum_{i=1}^{N} 2^{-l_i} \le 1

This inequality is known as the Kraft-McMillan inequality.


The Kraft-McMillan inequality is a necessary condition for a variable-length code to be uniquely
decodable. It is concerned with the existence of a uniquely decodable (UD) code and establishes
the relation between such a code and the lengths of its codewords. The first part provides a
necessary condition on the codeword lengths of uniquely decodable codes. The second part shows
that we can always find a prefix code that satisfies this necessary condition. Therefore, if we have
a uniquely decodable code that is not a prefix code, we can always find a prefix code with the same
codeword lengths. In other words, if we have a uniquely decodable code, the codeword lengths
have to satisfy the Kraft-McMillan inequality; and, given codeword lengths that satisfy the
Kraft-McMillan inequality, we can always find a prefix code with those codeword lengths.
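The two parts of the result can be illustrated with a small sketch (illustrative only): the first function evaluates the Kraft sum for a set of codeword lengths, and the second constructs a prefix code with exactly those lengths whenever the inequality holds.

```python
def kraft_sum(lengths):
    """K(C) = sum over i of 2^(-l_i); a uniquely decodable code must have K(C) <= 1."""
    return sum(2.0 ** -l for l in lengths)

def prefix_code(lengths):
    """Build binary codewords with the given lengths (assumes the Kraft inequality holds)."""
    codewords, value, prev_len = [], 0, 0
    for l in sorted(lengths):
        value <<= (l - prev_len)                 # extend the current codeword to length l
        codewords.append(format(value, f"0{l}b"))
        value += 1                               # the next codeword starts just past this one
        prev_len = l
    return codewords

lengths = [1, 2, 3, 3]
print(kraft_sum(lengths))     # 1.0 -> the inequality is satisfied (with equality)
print(prefix_code(lengths))   # ['0', '10', '110', '111'] -- a prefix code with those lengths
```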
4. Compare Huffman and Arithmetic coding
Arithmetic coding is more complicated than Huffman coding, but it allows us to code sequences
of symbols as a whole. Arithmetic coding is not a good idea if you are going to encode your message
one symbol at a time; as we increase the number of symbols per message, the results get better
and better.
To generate a codeword for a sequence of length m, the Huffman procedure requires building the
entire code for all possible sequences of length m. If the original alphabet size was k, then the size
of the codebook would be k^m. For the arithmetic coding procedure, we do not need to build the
entire codebook; instead, we simply obtain the code for the tag corresponding to a given sequence.
If the alphabet size is relatively large and the probabilities are not too skewed, the maximum
probability pmax is generally small. In these cases, the advantage of arithmetic coding over
Huffman coding is small, and it might not be worth the extra complexity to use arithmetic coding
rather than Huffman coding. However, there are many sources, such as facsimile, in which the
alphabet size is small and the probabilities are highly unbalanced. In these cases, the use of
arithmetic coding is generally worth the added complexity.
It is also much easier to adapt arithmetic codes to changing input statistics.
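The key difference can be made concrete with a small sketch of the interval narrowing that arithmetic coding performs: the whole sequence is mapped to one sub-interval of [0, 1), so no codebook of size k^m is ever built. The three-symbol model below is hypothetical, and bit output and rescaling are omitted.

```python
# Illustrative tag-interval computation for arithmetic coding (no bit output, no rescaling).
model = {"a": 0.7, "b": 0.2, "c": 0.1}           # hypothetical source probabilities

def tag_interval(sequence, model):
    # Cumulative distribution: symbol -> (lower, upper) cumulative probability.
    cdf, total = {}, 0.0
    for symbol, p in model.items():
        cdf[symbol] = (total, total + p)
        total += p

    low, high = 0.0, 1.0
    for symbol in sequence:                       # narrow the interval once per symbol
        lo_f, hi_f = cdf[symbol]
        span = high - low
        low, high = low + span * lo_f, low + span * hi_f
    return low, high

low, high = tag_interval("aab", model)
print(low, high, high - low)   # width = P(a)*P(a)*P(b) = 0.098; any tag inside identifies "aab"
```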
5. Describe LZ77 approach of encoding a string with the help of an example
In the LZ77 approach, the dictionary is simply a portion of the previously encoded
sequence. The encoder examines the input sequence through a sliding window. The
window consists of two parts, a search buffer that contains a portion of the recently
encoded sequence, and a look-ahead buffer that contains the next portion of the
sequence to be encoded.
To encode the sequence in the look-ahead buffer, the encoder moves a search pointer
back through the search buffer until it encounters a match to the first symbol in the
look-ahead buffer. The distance of the pointer from the look-ahead buffer is called the
offset. The encoder then examines the symbols following the symbol at the pointer
location to see if they match consecutive symbols in the look-ahead buffer.
The number of consecutive symbols in the search buffer that match consecutive
symbols in the look-ahead buffer, starting with the first symbol, is called the length of
the match. The encoder searches the search buffer for the longest match. Once the
APJ ABDUL KALAM TECHNOLOGICAL UNIVERSITY
EIGHTH SEMESTER B.TECH DEGREE EXAMINATION, MONTH & YEAR
Course Code: CST446
Course Name: Data Compression Techniques
Max.Marks:100 Duration: 3 Hours

longest match has been found, the encoder encodes it with a triple <o, l, c>, where o is the offset,
l is the length of the match, and c is the codeword corresponding to the symbol in the look-ahead
buffer that follows the match.
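Since the question asks for an example, here is a minimal LZ77 encoder sketch. The window sizes are chosen arbitrarily for illustration, and a real implementation would also entropy-code the triples.

```python
def lz77_encode(data, search_size=7, lookahead_size=6):
    """Encode `data` as a list of (offset, length, next_symbol) triples."""
    triples, pos = [], 0
    while pos < len(data):
        search_start = max(0, pos - search_size)
        best_offset, best_length = 0, 0
        for start in range(search_start, pos):      # candidate match positions in the search buffer
            length = 0
            # A match may run into the look-ahead buffer itself (offsets shorter than the length are legal).
            while (length < lookahead_size - 1 and pos + length < len(data) - 1
                   and data[start + length] == data[pos + length]):
                length += 1
            if length > best_length:
                best_offset, best_length = pos - start, length
        next_symbol = data[pos + best_length]        # symbol following the match
        triples.append((best_offset, best_length, next_symbol))
        pos += best_length + 1
    return triples

print(lz77_encode("cabracadabrarrarrad"))
```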

6. Compare and contrast the working of JPEG and JPEG-LS.


JPEG-LS is the standard for lossless (or near-lossless) compression of continuous-tone images.
It is a newer method, designed to be simple and fast. Unlike baseline JPEG, it does not use the DCT,
does not employ arithmetic coding, and uses quantization only in a restricted way, and only in its
near-lossless option. JPEG-LS is based on ideas developed in [Weinberger et al. 96 and 00] for their
LOCO-I compression method. JPEG-LS examines several of the previously seen neighbors of the
current pixel, uses them as the context of the pixel, uses the context to predict the pixel and to
select a probability distribution out of several such distributions, and uses that distribution to
encode the prediction error with a special Golomb code. There is also a run mode, where the
length of a run of identical pixels is encoded. Baseline JPEG, in contrast, is a lossy transform coder:
it applies the DCT to 8×8 blocks, quantizes the coefficients, and entropy-codes them.
The encoder examines the context pixels and decides whether to encode the current pixel x in the
run mode or in the regular mode. If the context suggests that the pixels y, z, ... following the current
pixel are likely to be identical, the encoder selects the run mode; otherwise, it selects the regular
mode. In the near-lossless mode the decision is slightly different: if the context suggests that the
pixels following the current pixel are likely to be almost identical (within the tolerance parameter
NEAR), the encoder selects the run mode.
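As an illustration of the context-based prediction step, here is a sketch of the median edge detector (MED) predictor used by LOCO-I/JPEG-LS, where a, b and c are the left, upper and upper-left neighbours of the current pixel; the sample values below are invented.

```python
def med_predict(a, b, c):
    """Median edge detector: predict pixel x from neighbours a (left), b (above), c (above-left)."""
    if c >= max(a, b):
        return min(a, b)          # an edge is likely above or to the left
    if c <= min(a, b):
        return max(a, b)
    return a + b - c              # smooth region: planar prediction

# The prediction error (x - predicted value) is what gets Golomb coded.
x, a, b, c = 103, 100, 104, 98
print(x - med_predict(a, b, c))   # -1 for these invented values
```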
7. Discuss different components of video
MPEG uses I, P, and B pictures. They are arranged in groups, where a group can be open or closed.
The pictures are arranged in a certain order, called the coding order, but (after being decoded) they
are output and displayed in a different order, called the display order. In a closed group, P and B
pictures are decoded only from other pictures in the group; in an open group, they can be decoded
from pictures outside the group. Different regions of a B picture may use different pictures for their
decoding: a region may be decoded from some preceding pictures, from some following pictures,
from both types, or from none. A region in a P picture may use several preceding pictures for its
decoding, or use none at all, in which case it is decoded using MPEG's intra methods.
The basic building block of an MPEG picture is the macroblock. It consists of a 16×16 block of
luminance (grayscale) samples (divided into four 8×8 blocks) and two 8×8 blocks of the matching
chrominance samples. The MPEG compression of a macroblock consists mainly in passing each of
the six blocks through a discrete cosine transform, which creates decorrelated values, then
quantizing and encoding the results. It is very similar to JPEG compression, the main differences
being that different quantization tables and different code tables are used in MPEG for intra and
non-intra coding, and the rounding is done differently.
A picture in MPEG is organized in slices. A slice is a contiguous set of macroblocks (in raster order)
that have the same grayscale (i.e., luminance component). The concept of slices makes sense
because a picture may often contain large uniform areas, causing many contiguous macroblocks
to have the same grayscale. A slice can continue from scan line to scan line. When a picture is
encoded in non-intra mode (i.e., it is encoded by means of another picture, normally its
predecessor), the MPEG encoder generates the differences between the pictures, then applies the
DCT to the differences, followed by quantization.

8. Identify the advantage of MPEG-4 over MPEG


The compression paradigm adopted for MPEG-4 is based on objects: "coding of audiovisual
objects". Objects such as a flower, a face, or a vehicle are defined, and the standard then describes
how each object should be moved and manipulated in successive frames. A flower may open
slowly, a face may turn, smile, and fade, a vehicle may move toward the viewer and appear bigger.
MPEG-4 includes an object description language that provides for a compact description of both
objects and their movements and interactions. Another important feature of MPEG-4 is
interoperability; the term refers to the ability to exchange any type of data, be it text, graphics,
video, or audio. MPEG-1, by contrast, was originally developed as a compression standard for
interactive video on CDs and for digital audio broadcasting. Interactive CDs and digital audio
broadcasting have had little commercial success, so MPEG-1 is used today for general video
compression.
9. Explain critical bands, thresholding and masking related to audio compression
The frequency range of the human ear is from about 20 Hz to about 20,000 Hz, but the
ear’s sensitivity to sound is not uniform. It depends on the frequency. The existence of
the hearing threshold suggests an approach to lossy audio compression. Just delete any
audio samples that are below the threshold. Since the threshold depends on the
frequency, the encoder needs to know the frequency spectrum of the sound being
compressed at any time.
The range of audible frequencies can therefore be partitioned into a number of critical
bands that indicate the declining sensitivity of the ear (rather, its declining resolving
power) for higher frequencies. We can think of the critical bands as a measure similar
to frequency. However, in contrast to frequency, which is absolute and has nothing to
do with human hearing, the critical bands are determined according to the sound
perception of the ear. Thus, they constitute a perceptually uniform measure of
frequency.
Two more properties of the human hearing system are used in audio compression: frequency
masking and temporal masking. Frequency masking (also known as auditory masking) occurs when
a sound that we can normally hear (because it is loud enough) is masked by another sound with a
nearby frequency. The masking sound raises the normal threshold in its vicinity (the dashed curve),
with the result that the nearby sound marked "x", which would normally be audible because it is
above the threshold, is now masked and becomes inaudible. A good lossy audio compression
method should identify this case and delete the signals corresponding to sound "x", because it
cannot be heard anyway. This is one way to lossily compress sound. The width of the masked
region depends on the frequency: it varies from about 100 Hz for the lowest audible frequencies
to more than 4 kHz for the highest.
Temporal masking may occur when a strong sound A of frequency f is preceded or
followed in time by a weaker sound B at a nearby (or the same) frequency. If the time
interval between the sounds is short, sound B may not be audible.
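The idea of deleting what lies below the (frequency-dependent) threshold can be sketched as follows; the band edges and threshold values are invented for illustration and are not taken from any psychoacoustic standard.

```python
# Illustrative per-band thresholding: bands whose level falls below the threshold are simply not coded.
bands_hz = [(0, 100), (100, 500), (500, 2000), (2000, 8000)]   # hypothetical band edges
threshold_db = [40, 25, 10, 20]                                 # hypothetical threshold per band
signal_db = [35, 30, 12, 15]                                    # measured signal level per band

kept = []
for (lo, hi), threshold, level in zip(bands_hz, threshold_db, signal_db):
    if level >= threshold:
        kept.append((lo, hi, level))    # audible: keep, quantize, and code this band
    # else: the band is below the threshold and can be deleted without being missed

print(kept)    # only the audible bands survive
```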
Part B
(Answer any one question from each module. Each question carries 14 Marks)
10. (a) Explain mathematical model for lossy compression and lossless compression
(10)
Mathematical model for lossless compression
If the experiment is a source that puts out symbols Ai from a set A, then the entropy is a measure
of the average number of binary symbols needed to code the output of the source.
Physical Models • Based on the physics of the data generation process. • In speech-related
applications, knowledge about the physics of speech production can be used to construct a
mathematical model for the sampled speech process. • Models for certain telemetry data can also
be obtained through knowledge of the underlying process. • If residential electrical meter readings
at hourly intervals were to be coded, knowledge about the living habits of the populace could be
used to determine when electricity usage would be high and when it would be low. • Instead of the
actual readings, the difference (residual) between the actual readings and those predicted by the
model could be coded.
Probability Models • Ignorance model: assume that each letter generated by the source is
independent of every other letter, and each occurs with the same probability. • Probability model:
assume that each letter generated by the source is independent of every other letter, and each
occurs with a different probability. • For a source that generates letters from an alphabet
A = {a1, a2, ..., aM}, we can have a probability model P = {P(a1), P(a2), ..., P(aM)}.
Mathematical model for lossy compression
• Uniform distribution - this is an ignorance model; if we do not know anything about the
distribution of the source output except possibly the range of values, we can use the uniform
distribution to model the source.

• Gaussian distribution - a normal or Gaussian distribution is a type of continuous probability
distribution for a real-valued random variable; it is symmetric about the mean.

• Laplacian distribution - a distribution that is quite peaked at zero. For example, speech consists
mainly of silence, therefore samples of speech will be zero or close to zero with high probability.

• Gamma distribution - a distribution that is even more peaked at zero and considerably less
tractable.
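For reference, the standard zero-mean textbook forms of these density functions (the original answer presumably showed them as figures; here \mu is the mean and \sigma^2 the variance):

```latex
f_{\text{uniform}}(x) = \frac{1}{b-a}, \quad a \le x \le b
\qquad
f_{\text{Gaussian}}(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}

f_{\text{Laplacian}}(x) = \frac{1}{\sqrt{2}\,\sigma}\, e^{-\frac{\sqrt{2}\,|x|}{\sigma}}
\qquad
f_{\text{Gamma}}(x) = \sqrt{\frac{\sqrt{3}}{8\pi\sigma\,|x|}}\; e^{-\frac{\sqrt{3}\,|x|}{2\sigma}}
```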


(b) Define compression ratio with an example (4)


Compression ratio - the ratio of the number of bits required to represent the data before
compression to the number of bits required to represent the data after compression. • Storing an
image made up of a square array of 256×256 pixels requires 65,536 bytes. • The compressed
version requires 16,384 bytes. • The compression ratio is therefore 4:1. • We can also represent the
compression ratio by expressing the reduction in the amount of data as a percentage of the size of
the original data; in this particular example the compression would be 75%.
OR
11. (a) Discuss any probability model and identify the shortcoming of the solution. (7)
Gaussian Distribution: The Gaussian distribution is one of the most commonly used probability
models for two reasons: it is mathematically tractable and, by virtue of the central limit theorem,
it can be argued that in the limit the distribution of interest goes to a Gaussian distribution. The
probability density function for a random variable X with a Gaussian distribution, mean \mu and
variance \sigma^2, is

f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}

Shortcoming: many sources that we deal with have distributions that are quite peaked at zero. For
example, speech consists mainly of silence, therefore samples of speech will be zero or close to
zero with high probability. Image pixels themselves do not have any attraction to small values, but
there is a high degree of correlation among pixels, so a large number of the pixel-to-pixel
differences will have values close to zero. In these situations, a Gaussian distribution is not a very
close match to the data.

(b) Identify the mathematical preliminaries for Lossless Compression (7)

1. Models
If the experiment is a source that puts out symbols Ai from a set A, then the entropy is a measure
of the average number of binary symbols needed to code the output of the source.

Physical Models • Based on the physics of the data generation process. • In speech-related
applications, knowledge about the physics of speech production can be used to construct a
mathematical model for the sampled speech process. • Models for certain telemetry data can also
be obtained through knowledge of the underlying process. • If residential electrical meter readings
at hourly intervals were to be coded, knowledge about the living habits of the populace could be
used to determine when electricity usage would be high and when it would be low. • Instead of the
actual readings, the difference (residual) between the actual readings and those predicted by the
model could be coded.
Probability Models • Ignorance model: assume that each letter generated by the source is
independent of every other letter, and each occurs with the same probability. • Probability model:
assume that each letter generated by the source is independent of every other letter, and each
occurs with a different probability. • For a source that generates letters from an alphabet
A = {a1, a2, ..., aM}, we can have a probability model P = {P(a1), P(a2), ..., P(aM)}.
2. Coding
Uniquely Decodable Codes
A code is distinct if each codeword is distinguishable from every other codeword (i.e., the mapping
from source messages to codewords is one-to-one). A distinct code is uniquely decodable if every
codeword is identifiable when immersed in a sequence of codewords, i.e., if the original source
sequence can be reconstructed perfectly from the encoded binary sequence. Unique decodability
ensures that codewords can be recognized unambiguously in the received signal, so that the
decoding process is the exact inverse of the encoding process.
A code is uniquely decodable if its extension to sequences, C+ : A_X^+ → A_Z^+, is one-to-one.
A prefix code is a variable-size code that satisfies the prefix property. This property requires that
once a certain bit pattern has been assigned as the code of a symbol, no other code may start with
that pattern (the pattern cannot be the prefix of any other codeword).

3. Algorithmic Information Theory
Algorithmic information theory is concerned with quantifying the amount of information contained
in individual objects or sequences.
 It provides insights into compression limits and data complexity, and its ideas underlie practical
coding schemes such as Huffman coding and arithmetic coding. It has applications in both lossless
and lossy compression.
 It guides model selection for optimal compression and contributes to universal compression.
 It enables efficient data representation and storage; understanding its principles helps improve
compression techniques.
4. Minimum Description Length Principle
Let Mj be a model from a set of models that attempt to characterize the structure in a sequence x.
Let DMj be the number of bits required to describe the model Mj. For example, if the set of models
can be represented by a (possibly variable) number of coefficients, then the description of Mj
would include the number of coefficients and the value of each coefficient. Let RMj(x) be the
number of bits required to represent x with respect to the model Mj. The minimum description
length is then given by

\min_{j} \left( D_{M_j} + R_{M_j}(x) \right)
13. (a) With the help of a flowchart, discuss RLE text compression for the text data
given below

'ABBBBBBBBBCDEEEEF' (10)
In run-length encoding of text, a run of a repeated character is replaced by an escape character
(here '@'), the repetition count, and the character itself; characters that do not repeat are copied
unchanged. The run of nine Bs becomes @9B and the run of four Es becomes @4E, so the
compressed text will be
A@9BCD@4EF
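A minimal sketch of this text RLE scheme; the '@' escape and single-digit counts follow the example above, and a fuller implementation would also need to handle counts above 9 and literal '@' characters.

```python
def rle_encode(text, escape="@", min_run=3):
    """Replace every run of at least `min_run` identical characters by escape + count + character."""
    out, i = [], 0
    while i < len(text):
        run = 1
        while i + run < len(text) and text[i + run] == text[i]:
            run += 1
        if run >= min_run:
            out.append(f"{escape}{run}{text[i]}")   # e.g. BBBBBBBBB -> @9B
        else:
            out.append(text[i] * run)               # short runs are copied unchanged
        i += run
    return "".join(out)

print(rle_encode("ABBBBBBBBBCDEEEEF"))   # A@9BCD@4EF
```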

(b) Calculate the compression ratio for the example while taking repetitions = 4 (4)

Assume a string of N characters that needs to be compressed, and that the string contains
M repetitions of average length L each. Each run of length L is replaced by three characters
(escape, count, character), so the compressed length is N − M(L − 3) and

compression factor = N / (N − M(L − 3))

Here N = 17, and taking M = 4 repetitions with L = 3:

compression factor = 17 / (17 − 4(3 − 3)) = 1

i.e. runs of average length 3 give no compression. (For the actual string above, with M = 2 runs of
lengths 9 and 4, the compressed length is 10 and the factor is 17/10 = 1.7.)
OR
12. (a) Illustrate with an example why Huffman coding is preferred over the Shannon-Fano
algorithm for compression (10)
Huffman coding is a lossless data compression algorithm.
• It assigns variable-length codes to input characters; the lengths of the assigned codes are based
on the frequencies/probabilities of the corresponding characters. The codes assigned to input
characters are prefix codes, meaning the codes (bit sequences) are assigned in such a way that the
code assigned to one character is never the prefix of the code assigned to any other character. The
algorithm builds the code tree in a bottom-up manner, repeatedly merging the two least probable
symbols, whereas the Shannon-Fano algorithm builds its tree top-down by recursively splitting the
symbol set into two groups of roughly equal probability.
Huffman coding works by translating the characters contained in a data file into binary codes: the
most frequently occurring characters in the file get the shortest binary codes, and the least
frequently occurring characters get the longest binary codes.
The Shannon-Fano algorithm also uses the probabilities of the data to encode it and also constructs
prefix codes, but it does not guarantee optimal code generation, because the top-down splits can
produce longer average codeword lengths. Huffman coding, by contrast, is provably optimal among
symbol-by-symbol prefix codes. For example, for the probabilities {0.35, 0.17, 0.17, 0.16, 0.15},
Shannon-Fano produces codeword lengths {2, 2, 2, 3, 3} (average 2.31 bits/symbol), while Huffman
produces lengths {1, 3, 3, 3, 3} (average 2.30 bits/symbol); this is why Huffman coding is preferred.
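A minimal Huffman-construction sketch (illustrative; it computes only the codeword lengths, which is enough to compare the average code length against Shannon-Fano for the probabilities quoted above):

```python
import heapq
from itertools import count

def huffman_lengths(probabilities):
    """Return {symbol: codeword length} for a Huffman code over the given probabilities."""
    tie = count()                        # tie-breaker so the heap never compares symbol lists
    heap = [(p, next(tie), [s]) for s, p in probabilities.items()]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probabilities}
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)   # the two least probable subtrees
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:              # every symbol in the merged subtree moves one level deeper
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), syms1 + syms2))
    return lengths

probs = {"a": 0.35, "b": 0.17, "c": 0.17, "d": 0.16, "e": 0.15}
lengths = huffman_lengths(probs)
print(lengths, sum(probs[s] * lengths[s] for s in probs))   # lengths 1,3,3,3,3 -> 2.30 bits/symbol
```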

(b) How does Huffman coding handle the unpredictability of the input data stream? (4)
15. (a) Explain in detail the working of LZ78 with an example and the dictionary tree
(10)
 The LZ77 approach implicitly assumes that like patterns will occur close together.

 It makes use of this structure by using the recent past of the sequence as the dictionary
for encoding.

 This means that any pattern that recurs over a period longer than that covered by the
coder window will not be captured.

 For example, consider a periodic sequence with a period of nine. If the search buffer had been just
one symbol longer, this sequence could have been significantly compressed.

 the net result can be an expansion rather than a compression.

 The LZ78 algorithm solves this problem by dropping the reliance on the search buffer
and keeping an explicit dictionary.
 This dictionary has to be built at both the encoder and decoder, and care must be taken
that the dictionaries are built in an identical manner.
 The inputs are coded as a double < i, c> with i being an index corresponding to the
dictionary entry that was the longest match to the input, and c being the code for the
character in the input following the matched portion of the input.
 As in the case of LZ77, the index value of 0 is used in the case of no match. This double
then becomes the newest entry in the dictionary.
 each new entry into the dictionary is one new symbol concatenated with an existing
dictionary entry

 Let us encode the following sequence using the LZ78 approach:

wabba␢wabba␢wabba␢wabba␢woo␢woo␢woo

where ␢ stands for space. Initially, the dictionary is empty, so the first few symbols
encountered are encoded with the index value set to 0. The first three encoder outputs
are <0, C(w)>, <0, C(a)>, <0, C(b)>.
 The fourth symbol is a b, which is the third entry in the dictionary.
 If we append the next symbol, we would get the pattern ba, which is not in the dictionary
 so we encode these two symbols as <3,C(a ) >and add the pattern ba as the fourth entry
in the dictionary.

 Continuing in this fashion, the encoder output and the dictionary develop as in Table
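A minimal LZ78 encoder sketch (illustrative); it emits <index, character> doubles exactly as described above, with index 0 meaning "no match".

```python
def lz78_encode(data):
    """Encode `data` as a list of (dictionary index, next character) doubles."""
    dictionary = {}                                      # phrase -> index (1-based); starts empty
    output, phrase = [], ""
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch                                 # keep growing the longest dictionary match
        else:
            output.append((dictionary.get(phrase, 0), ch))     # index 0 = no match
            dictionary[phrase + ch] = len(dictionary) + 1      # new entry: matched phrase + new symbol
            phrase = ""
    if phrase:                                           # flush a trailing match with no following symbol
        output.append((dictionary[phrase], ""))
    return output

print(lz78_encode("wabba wabba wabba wabba woo woo woo"))
# begins (0,'w'), (0,'a'), (0,'b'), (3,'a'), ... as in the worked example above
```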
(b) Illustrate with an example how the compression factor of LZW differs from that of
LZ78 (4)
Compression ratio = size of the original data / size of the compressed data. LZW drops the second
element of the LZ78 double: because its dictionary is initialized with all single symbols, the encoder
outputs only dictionary indices, never an explicit character, so for the same input LZW usually emits
less data per phrase than LZ78. Using LZW, a compression ratio of 60-70% can be achieved for
monochrome images and text files with repeated data.
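A minimal LZW encoder sketch (illustrative) that makes the contrast visible: the dictionary is pre-loaded with all single characters, so the output is a list of indices only, whereas the LZ78 sketch above emits <index, character> pairs.

```python
def lzw_encode(data):
    """LZW: the dictionary starts with every single symbol, and the output is a list of indices only."""
    dictionary = {chr(i): i for i in range(256)}         # initial single-character entries
    output, phrase = [], ""
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch                                  # keep extending the current match
        else:
            output.append(dictionary[phrase])             # emit the index of the longest match
            dictionary[phrase + ch] = len(dictionary)     # add the new phrase to the dictionary
            phrase = ch
    if phrase:
        output.append(dictionary[phrase])
    return output

codes = lzw_encode("wabba wabba wabba wabba woo woo woo")
print(len(codes), codes[:8])    # fewer, index-only outputs compared with LZ78's pairs
```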
16. (a) How do quantization and coding help in compression, and what is their role in JPEG?
(6)
Quantization
After each 8×8 data unit of DCT coefficients Gij is computed, it is quantized. This is
the step where information is lost (except for some unavoidable loss because of finite
precision calculations in other steps). Each number in the DCT coefficients matrix is
divided by the corresponding number from the particular “quantization table” used, and
the result is rounded to the nearest integer. Three such tables are needed, for the three
color components. The JPEG standard allows for up to four tables, and the user can
select any of the four for quantizing each color component. The 64 numbers that
constitute each quantization table are all JPEG parameters. In principle, they can all be
specified and fine-tuned by the user for maximum compression. JPEG software
normally uses the following two approaches:
1. Default quantization tables. Two such tables, for the luminance (grayscale) and
the chrominance components, are the result of many experiments performed by the
JPEG committee. They are included in the JPEG standard and are reproduced here
This is how JPEG reduces the DCT coefficients with high spatial frequencies.
2. A simple quantization table Q is computed, based on one parameter R specified by
the user. A simple expression such as Qij = 1 + (i + j) × R guarantees that QCs start
small at the upper-left corner and get bigger toward the lower-right corner.
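A small sketch of this second approach; the table values come from the quoted formula Qij = 1 + (i + j) × R, not from the standard's default tables, and the 8×8 block is random stand-in data.

```python
import numpy as np

def simple_quant_table(R):
    """Q[i][j] = 1 + (i + j) * R: small steps near the DC term, larger steps at high frequencies."""
    return np.array([[1 + (i + j) * R for j in range(8)] for i in range(8)])

def quantize(dct_block, Q):
    """Divide each DCT coefficient by its quantization step and round to the nearest integer."""
    return np.rint(dct_block / Q).astype(int)

Q = simple_quant_table(R=2)
block = np.random.randn(8, 8) * 50      # stand-in for an 8x8 block of DCT coefficients
print(quantize(block, Q))               # larger steps toward the lower-right coarsen the high frequencies
```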

If the quantization is done correctly, very few nonzero numbers will be left in the DCT coefficients
matrix, and they will typically be concentrated in the upper-left region. These numbers are the
output of JPEG, but they are further compressed before being written on the output stream. In the
JPEG literature this compression is called "entropy coding".
Three techniques are used by entropy coding to compress the 8 × 8 matrix of integers:

1. The 64 numbers are collected by scanning the matrix in zigzag order (a small sketch of this scan
appears after this list). This produces a string of 64 numbers that starts with some nonzeros and
typically ends with many consecutive zeros. Only the nonzero numbers are output (after further
compressing them) and are followed by a special end-of-block (EOB) code. This way there is no
need to output the trailing zeros (we can say that the EOB is the run-length encoding of all the
trailing zeros).
2.The nonzero numbers are compressed using Huffman coding

3. The first of those numbers (the DC coefficient) is treated differently from the others
(the AC coefficients).
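Here is the zigzag sketch referred to in step 1 above (illustrative; the scan order is generated from the diagonal index rather than hard-coded):

```python
def zigzag_order(n=8):
    """Visit an n x n block along the diagonals i + j = 0, 1, ..., alternating direction."""
    order = []
    for d in range(2 * n - 1):
        diagonal = [(i, d - i) for i in range(n) if 0 <= d - i < n]
        order.extend(diagonal if d % 2 else reversed(diagonal))
    return order

def zigzag_scan(block):
    """Flatten a quantized block in zigzag order and replace the trailing zeros by an EOB marker."""
    flat = [block[i][j] for i, j in zigzag_order(len(block))]
    while flat and flat[-1] == 0:
        flat.pop()
    return flat + ["EOB"]

block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0] = 25, -3, 2   # a few nonzero low-frequency coefficients
print(zigzag_scan(block))                           # [25, -3, 2, 'EOB']
```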

Coding
Each 8×8 matrix of quantized DCT coefficients contains one DC coefficient [at position (0, 0), the
top left corner] and 63 AC coefficients. The DC coefficient is a measure of the average value of the
64 original pixels constituting the data unit. In a continuous-tone image the average values of the
pixels in adjacent data units are close, and the DC coefficient of a data unit is a multiple of the
average of the 64 pixels constituting the unit. This implies that the DC coefficients of adjacent data
units don't differ much. JPEG therefore outputs the first one (encoded), followed by differences
(also encoded) of the DC coefficients of consecutive data units. If the first three 8×8 data units of
an image have
quantized DC coefficients of 1118, 1114, and 1119, then the JPEG output for the first
data unit is 1118 (Huffman encoded, see below) followed by the 63 (encoded) AC
coefficients of that data unit. The output for the second data unit will be 1114 - 1118 =
-4 (also Huffman encoded), followed by the 63 (encoded) AC coefficients of that data
unit, and the output for the third data unit will be 1119 - 1114 = 5 (also Huffman
encoded), again followed by the 63 (encoded) AC coefficients of that data unit.

17. (a) With the help of equations discuss Composite and Component Video (7)
(b) Differentiate the major changes between MPEG-2 and MPEG-4 video (7)
MPEG-2 extends the basic MPEG system to provide compression support for TV-quality
transmission of digital video. To understand why video compression is so important, one has to
consider the vast bandwidth required to transmit uncompressed digital TV pictures.

Because the MPEG-2 standard provides good compression using standard algorithms,
it has become the standard for digital TV. It has the following features:

 Video compression which is backwards compatible with MPEG-1


 Full-screen interlaced and/or progressive video (for TV and Computer
displays)
 Enhanced audio coding (high quality, mono, stereo, and other audio features)
 Transport multiplexing (combining different MPEG streams in a single
transmission stream)
 Other services (GUI, interaction, encryption, data transmission, etc)

• MPEG-4 is a new standard for audiovisual data.


• Its goal is to find ways to compress multimedia data to very low bitrates with minimal
distortion.
• MPEG-1 was originally developed as a compression standard for interactive video on
CDs and for digital audio broadcasting.
• interactive CDs and digital audio broadcasting have had little commercial success, so
MPEG-1 is used today for general video compression.
• One aspect of MPEG-1 that was supposed to be minor, namely MP3, has grown out of proportion
and is commonly used today for audio.
• MPEG-2 was specifically designed for digital television, and this standard has had tremendous
commercial success.
• MPEG-4, in contrast, was intended to deliver reasonable video data in only a few thousand bits
per second. Such compression is important for video telephones, video conferences, or for
receiving video in a small, handheld device, especially in a mobile environment such as a moving car.
• Instead of a single compression standard, the MPEG-4 committee decided to develop a set of
tools (a toolbox) to deal with audiovisual products in general, at present and in the future.
• Traditionally, methods for compressing video have been based on pixels. Each video frame is a
rectangular set of pixels, and the algorithm looks for correlations between pixels in a frame and
between frames. The compression paradigm adopted for MPEG-4 is instead based on objects:
"coding of audiovisual objects".
• Objects, such as a flower, a face, or a vehicle, are defined, and the standard then describes how
each object should be moved and manipulated in successive frames.
• A flower may open slowly, a face may turn, smile, and fade, a vehicle may move toward the
viewer and appear bigger.
• MPEG-4 includes an object description language that provides for a compact description of both
objects and their movements and interactions.
• Another important feature of MPEG-4 is interoperability. The term refers to the ability to
exchange any type of data, be it text, graphics, video, or audio.
• Interoperability is possible only in the presence of standards. All devices that produce data,
deliver it, and consume (play, display, or print) it must obey the same rules and read and write the
same file structures.
OR
18. (a) Describe in details about functionalities for MPEG-4 (8)
Content-based multimedia access tools
• The MPEG-4 standard should provide tools for accessing and organizing audiovisual
data.
• Such tools may include indexing, linking, querying, browsing, delivering files, and
deleting them

Content-based manipulation and bitstream editing.


• A syntax and a coding scheme should be part of MPEG-4.
• The idea is to enable users to manipulate and edit compressed files (bitstreams)
without fully decompressing them.
• A user should be able to select an object and modify it in the compressed file without
decompressing the entire file.
Hybrid natural and synthetic data coding
• A natural scene is normally produced by a video camera. A synthetic scene consists
of text and graphics. MPEG-4 recognizes the need for tools to compress natural and
synthetic scenes and mix them interactively.
Improved temporal random access.
• Users may often want to access part of the compressed file, so the MPEG-4 standard should
include tags to make it easy to quickly reach any point in the file.
• This matters especially when the file is stored in a central location and the user is trying to
manipulate it remotely, over a slow communications channel.
Improved coding efficiency.
• This feature simply means improved compression.


• Imagine a case where audiovisual data has to be transmitted over a low-bandwidth
channel (such as a telephone line) and stored in a low-capacity device such as a
smartcard.
• This is possible only if the data is well compressed, which implies high compression rates,
achieved for example through a smaller image size, reduced resolution (pixels per inch), and lower
quality.
Coding of multiple concurrent data streams.
• Audiovisual applications will allow the user not just to watch and listen but also to
interact with the image.
• As a result, the MPEG-4 compressed stream can include several views of the same
scene, enabling the user to select any of them to watch and to change views at will.
• The point is that the different views may be similar, so any redundancy should be
eliminated by means of efficient compression that takes into account identical patterns
in the various views. The same is true for the audio part (the soundtracks).
Robustness in error-prone environments.
• MPEG-4 must provide error correcting codes for cases where audiovisual data is
transmitted through a noisy channel.
• This is especially important for low-bitrate streams, where even the smallest error may
be noticeable and may propagate and affect large parts of the audiovisual presentation.
Content-based scalability.
• The compressed stream may include audiovisual data in fine resolution and high
quality, but any MPEG-4 decoder should be able to decode it at low resolution and
low quality.
• This feature is useful in cases where the data is decoded and displayed on a small,
low-resolution screen, or in cases where the user is in a hurry and prefers to see a rough
image rather than wait for a full decoding.
(b) How does motion compensation help in video compression? (6)
• If the encoder discovers that a part P of the preceding frame has been rigidly moved
to a different location in the current frame, then P can be compressed by writing the
following three items on the compressed stream: its previous location, its current
location, and information identifying the boundaries of P.
• The encoder scans the current frame block by block. For each block B it searches the
preceding frame for an identical block C (if compression is to be lossless) or for a
similar one (if it can be lossy). Finding such a block, the encoder writes the difference
between its past and present locations on the output. This difference is of the form
(Cx − Bx, Cy − By) = (Δx, Δy) and is called a motion vector.
• Motion compensation is effective if objects are just translated, not scaled or rotated.
Drastic changes in illumination from frame to frame also reduce the effectiveness of
this method. In general, motion compensation is lossy.
The main aspects of motion compensation are the following:
 Frame Segmentation: The current frame is divided into equal-size non overlapping
blocks. The blocks may be squares or rectangles.
• large blocks reduce the chance of finding a match, and small blocks result in many
motion vectors. In practice, block sizes that are integer powers of 2, such as 8 or 16, are
used
 Search Threshold: Each block B in the current frame is first compared to its
counterpart C in the preceding frame. If they are identical, or if the difference
between them is less than a preset threshold, the encoder assumes that the block hasn’t
been moved.
 Block Search:
• This is a time-consuming process
• If B is the current block in the current frame, then the previous frame has to be
searched for a block identical to or very close to B.
• The search is normally restricted to a small area (called the search area) around B,
defined by the maximum displacement parameters dx and dy.
• These parameters specify the maximum horizontal and vertical distances, in pixels,
between B and any matching block in the previous frame.
• If B is a square with side b, the search area will contain (b + 2dx)(b + 2dy) pixels and will
consist of (2dx + 1)(2dy + 1) distinct, overlapping b×b squares.

• The number of candidate blocks in this area is therefore proportional to dx·dy.


 Distortion Measure:
• This is the most sensitive part of the encoder. The distortion measure selects the best match for
block B. It has to be simple and fast, but also reliable.
• The mean absolute difference (or mean absolute error) calculates the average of the absolute
differences between a pixel Bij in B and its counterpart Cij in a candidate block C:

MAD(B, C) = \frac{1}{b^2} \sum_{i=1}^{b} \sum_{j=1}^{b} |B_{ij} - C_{ij}|

(a small block-search sketch using this measure appears at the end of this answer)
• The mean square difference is a similar measure, where the square, rather than the absolute
value, of a pixel difference is calculated.
• The pel difference classification (PDC) measure counts how many differences |Bij −
Cij | are smaller than the PDC parameter p.
• The integral projection measure computes the sum of a row of B and subtracts it from the sum
of the corresponding row of C. The absolute value of the difference is added to the absolute value
of the difference of the column sums.

 Suboptimal Search Methods: These methods search some, instead of all, the
candidate blocks in the (b+2dx)(b+2dy) area. They speed up the search for a
matching block, at the expense of compression efficiency
 Motion Vector Correction:
• Once a block C has been selected as the best match for B, a motion vector is computed
as the difference between the upper-left corner of C and the upper-left corner of B.
• Regardless of how the matching was determined, the motion vector may be wrong
because of noise, local minima in the frame, or because the matching algorithm is not
perfect.
• It is possible to apply smoothing techniques to the motion vectors after they have been
calculated, in an attempt to improve the matching. Spatial correlations in the image
suggest that the motion vectors should also be correlated.
 Coding Motion Vectors: Two properties of motion vectors help in encoding them:
(1) They are correlated and (2) their distribution is nonuniform
• No single method has proved ideal for encoding the motion vectors.
• two different methods that may perform better:
• Predict a motion vector based on its predecessors in the same row and its predecessors
in the same column of the current frame. Calculate the difference between the
prediction and the actual vector, and Huffman encode it. This algorithm is important. It
is used in MPEG and other compression methods.
• Group the motion vectors in blocks. If all the vectors in a block are identical, the block
is encoded by encoding this vector. Other blocks are encoded as in 1 above. Each
encoded block starts with a code identifying its type.
 Coding the Prediction Error:
• Motion compensation is lossy, since a block B is normally matched to a somewhat
different block C.
• Compression can be improved by coding the difference between the current
uncompressed and compressed frames on a block by block basis and only for blocks
that differ much.
• The difference is written on the output, following each frame, and is used by the
decoder to improve the frame after it has been decoded.
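Here is the block-search sketch referred to under the distortion measure above: an exhaustive search of the (2dx + 1)(2dy + 1) candidate positions using the mean absolute difference. Frames are plain 2-D lists of grayscale values; everything else is illustrative.

```python
def mad(block_a, block_b):
    """Mean absolute difference between two b x b blocks."""
    b = len(block_a)
    return sum(abs(block_a[i][j] - block_b[i][j]) for i in range(b) for j in range(b)) / (b * b)

def block_at(frame, top, left, b):
    """Extract the b x b block whose upper-left corner is (top, left)."""
    return [row[left:left + b] for row in frame[top:top + b]]

def motion_vector(prev_frame, cur_frame, top, left, b, dx, dy):
    """Exhaustively search around block B in the previous frame; return (distortion, (delta_x, delta_y))."""
    current = block_at(cur_frame, top, left, b)
    best = None
    for di in range(-dy, dy + 1):
        for dj in range(-dx, dx + 1):
            t, l = top + di, left + dj
            if 0 <= t <= len(prev_frame) - b and 0 <= l <= len(prev_frame[0]) - b:
                distortion = mad(block_at(prev_frame, t, l, b), current)
                if best is None or distortion < best[0]:
                    best = (distortion, (dj, di))      # motion vector (delta x, delta y)
    return best
```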
19. (a) How can the limitations of the human auditory system be exploited in audio
compression? (7)
The frequency range of the human ear is from about 20 Hz to about 20,000 Hz, but the ear's
sensitivity to sound is not uniform. It depends on the frequency, and experiments indicate that in
a quiet environment the ear's sensitivity is maximal for frequencies in the range 2 kHz to 4 kHz. The
existence of the hearing threshold suggests an approach to lossy audio compression: just delete
any audio samples that are below the threshold. Since the threshold depends on the frequency,
the encoder needs to know the frequency spectrum of the sound being compressed at any time.
The range of audible frequencies can therefore be partitioned into a number of critical
bands that indicate the declining sensitivity of the ear (rather, its declining resolving
power) for higher frequencies. We can think of the critical bands as a measure similar
to frequency. However, in contrast to frequency, which is absolute and has nothing to
do with human hearing, the critical bands are determined according to the sound
perception of the ear. Thus, they constitute a perceptually uniform measure of
frequency.
Two more properties of the human hearing system are used in audio compression: frequency
masking and temporal masking. Frequency masking (also known as auditory masking) occurs when
a sound that we can normally hear (because it is loud enough) is masked by another sound with a
nearby frequency. The masking sound raises the normal threshold in its vicinity (the dashed curve),
with the result that the nearby sound marked "x", which would normally be audible because it is
above the threshold, is now masked and becomes inaudible. A good lossy audio compression
method should identify this case and delete the signals corresponding to sound "x", because it
cannot be heard anyway. This is one way to lossily compress sound. The width of the masked
region depends on the frequency: it varies from about 100 Hz for the lowest audible frequencies
to more than 4 kHz for the highest.
Temporal masking may occur when a strong sound A of frequency f is preceded or
followed in time by a weaker sound B at a nearby (or the same) frequency. If the time
interval between the sounds is short, sound B may not be audible.
(b) Discuss the complexity of Layer III compared to the other layers in MPEG audio
coding (7)
In the Layer I and Layer II encoding schemes, subbands at lower frequencies have a wider
bandwidth than the critical bands. This makes it difficult to accurately judge the mask-to-signal
ratio. A simple way to increase the spectral resolution would be to decompose the signal directly
into a higher number of bands. The spectral decomposition in the Layer III algorithm is therefore
performed in two stages.
1. The 32-band subband decomposition used in Layer I and Layer II is employed.


2. The output of each subband is transformed using a modified discrete cosine
transform (MDCT) with a 50% overlap.

The Layer III algorithm specifies two sizes for the MDCT, 6 or 18.

OR
20. (a) Discuss Format of Compressed Data and encoding in layer I and II (9)

(b) Differentiate Spectral and Temporal Masking (5)


Spectral Masking
● To hide quantization noise, we can make use of the fact that signals below a particular amplitude
at a particular frequency are not audible.
● If we select the quantizer step size such that the quantization noise lies below the audibility
threshold, the noise will not be perceived.
● Furthermore, the threshold of audibility is not absolutely fixed and typically rises when multiple
sounds impinge on the human ear. This phenomenon gives rise to spectral masking.
● A tone at a certain frequency will raise the threshold in a critical band around that frequency.
These critical bands have a constant Q, which is the ratio of frequency to bandwidth.
● Thus, at low frequencies the critical band can have a bandwidth as low as 100 Hz, while at higher
frequencies the bandwidth can be as large as 4 kHz. This increase of the threshold has major
implications for compression. Here a tone at 1 kHz has raised the threshold of audibility so that the
adjacent tone above it in frequency is no longer audible.
● At the same time, while the tone at 500 Hz is audible, because of the increase in the threshold
it can be quantized more crudely.
● This is because the increase of the threshold allows us to introduce more quantization noise at
that frequency.

Temporal Masking
● The temporal masking effect is the masking that occurs when a sound raises the audibility
threshold for a brief interval preceding and following the sound.
● In Figure 16.3 we show the threshold of audibility close to a masking sound. Sounds that occur
in an interval around the masking sound (both after and before the masking tone) can be masked.
● If the masked sound occurs prior to the masking tone, this is called premasking or backward
masking; if the sound being masked occurs after the masking tone, this effect is called postmasking
or forward masking.
● The forward masking remains in effect for a much longer time interval than the backward
masking.
