UNIT – 3 Image Compression:

Lossless techniques of image compression, gray codes, two-dimensional image transform,


Discrete cosine transform and its application in lossy image compression, quantization, Zig-
Zag coding sequences, JPEG and JPEG-LS compression standards, pulse code modulation and
differential pulse code modulation methods of image compression, video compression and
MPEG industry standard.

Need of image compression:


A picture is said to be equivalent to about five million words; that is, transmitting a picture involves as much data as transmitting five million words. In short, we need to transmit a large amount of data to transmit a picture.

Therefore, the transmission of a high-quality picture requires a large transmission bandwidth. But with so many TV channels to be transmitted simultaneously, we cannot afford to allot such a large bandwidth to each TV channel.

A picture is worth a thousand words. For example, 1000 words contain about 6000 characters. If each character is coded as a 7-bit ASCII symbol, then we require 6000 × 7 = 42,000 bits to transmit 1000 words.
What size of picture can be described with 42,000 bits?

As shown in Fig., an image is formed of picture elements called pixels. Generally, a medium-quality picture is formed with 300 pixels per inch.

With existing standards, the 42,000 bits will be able to describe a picture of size only about 1/4 inch, i.e. a very small picture. If the picture size is increased, the number of pixels will increase, and hence the number of bits will also increase. A picture of 8.5 inch by 11.0 inch with 300 pixels per inch would require about 2 × 10⁸ bits.
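The arithmetic behind these figures can be checked directly; the 24 bits-per-pixel colour depth used below is an assumption, since the notes do not state a bit depth:

```python
# Rough arithmetic behind the ~2 x 10^8 bits estimate for an
# 8.5 x 11 inch page scanned at 300 pixels per inch.
ppi = 300                        # pixels per inch (from the text)
rows = int(11.0 * ppi)           # 3300 pixel rows
cols = int(8.5 * ppi)            # 2550 pixel columns
bits = rows * cols * 24          # 24 bits/pixel is an assumed colour depth
print(bits)                      # 201960000, i.e. about 2 x 10^8 bits
```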

If such a picture is to be transmitted, we will require a bandwidth approximately equal to half the bit rate. Thus the bandwidth requirement increases with the size of the picture. The bandwidth requirement increases further if the quality of the image is to be improved. Such a large channel bandwidth is practically impossible to allot.

Therefore, a technique called image compression was developed to reduce the bandwidth requirement without compromising the quality of the image.

Alipta Anil Pawar


Assistant Professor,
Dept. of Electronics and Telecommunication Engineering,
Dr. Babasaheb Ambedkar Technological University, Lonere, Raigad
In the image compression system, high resolution colour printers, scanners, cameras and
monitors are used to capture and present high quality images for advertisement and
entertainment.

To transmit such an image, we have to use some kind of source coding technique to reduce the bandwidth and memory requirements. Therefore, source coding (PCM (pulse-code modulation), DM (delta modulation), DPCM (differential pulse-code modulation), or ADM (adaptive delta modulation)) has to be used to reduce the bandwidth and memory requirements for the transmission of an image.

Standards of Image compression:

The two standards used for image compression are:

1. Joint Photographic Experts Group (JPEG).

2. Motion Picture Experts Group (MPEG).

Characteristics of Images:

Some of the important characteristics of images are as follows:

1) High compression ratio: Since a very large number of bits is required for the
representation of an image, it is necessary to use extremely high compression ratios to
make storage and transmission practically possible.

2) Speed: If the transmission of moving images is to be done (examples are TV, movies, computer graphics, the WWW, etc.), then the compression and decompression must be executed very fast.

3) Redundancy: The most important characteristic of images is redundancy: the adjacent horizontal lines in a picture contain nearly identical information.

4) Human visual tolerance: Human eyes are highly tolerant of approximation errors in an image. This makes compression practically possible. Such compression is called lossy compression.

Principle of Image compression:

- We can compress video by compressing images. The two standards used for image
compression are:
1. Joint Photographic Experts Group (JPEG).
2. Moving Picture Experts Group (MPEG).

- Out of these the JPEG is used to compress still images whereas the MPEG is used for
compressing moving pictures.
Block diagram:
Fig. shows the general principle of image compression.

Description:
- The image that is to be transmitted is first converted into an uncompressed digital
image by the process of digitization.
- This digital image is applied to an encoder that uses an appropriate image compression
technique to compress the digital image.
- The compressed digital image is transmitted over a suitable communication channel.
- At the receiving end, a decoder decompresses the received compressed image and passes it on to the receiver to display the image.

Lossless techniques of image compression:


1. The Old JPEG Standard
2. CALIC
3. JPEG-LS

1. The Old JPEG Standard:


The Joint Photographic Experts Group (JPEG) is a joint ISO/ITU committee responsible for
developing standards for continuous-tone still-picture coding. The more famous standard
produced by this group is the lossy image compression standard. However, at the time of the
creation of the famous JPEG standard, the committee also created a lossless standard.
The old JPEG lossless still compression standard provides eight different predictive schemes from which the user can select. The first scheme makes no prediction. The next seven are listed below. Three of the seven are one-dimensional predictors, and four are two-dimensional prediction schemes. Here, I(i, j) is the (i, j)th pixel of the original image, and Î(i, j) is the predicted value for the (i, j)th pixel.

Different images can have different structures that can be best exploited by one of these eight
modes of prediction. If compression is performed in a non-real-time environment—for
example, for the purposes of archiving—all eight modes of prediction can be tried and the one
that gives the most compression is used. The mode used to perform the prediction can be stored
in a 3-bit header along with the compressed file. We encoded our four test images using the
various JPEG modes. The residual images were encoded using adaptive arithmetic coding. The
results are shown in Table 7.1.
The best results—that is, the smallest compressed file sizes—are indicated in bold in the table.
From these results we can see that a different JPEG predictor is the best for the different images.
In Table 7.2, we compare the best JPEG results with the file sizes obtained using GIF and PNG.
Note that PNG also uses predictive coding with four possible predictors, where each row of the
image can be encoded using a different predictor.
Even if we take into account the overhead associated with GIF, from this comparison we can
see that the predictive approaches are generally better suited to lossless image compression
than the dictionary-based approach when the images are “natural” gray-scale images. The
situation is different when the images are graphic images or pseudocolor images.

A possible exception could be the Earth image. The best compressed file size using the second
JPEG mode and adaptive arithmetic coding is 32,137 bytes, compared to 34,276 bytes using
GIF. The difference between the file sizes is not significant. We can see the reason by looking
at the Earth image. Note that a significant portion of the image is the background, which is of
a constant value. In dictionary coding, this would result in some very long entries that would
provide significant compression.
We can see that if the ratio of background to foreground were just a little different in this image,
the dictionary method in GIF might have outperformed the JPEG approach. The PNG approach
which allows the use of a different predictor (or no predictor) on each row, prior to dictionary
coding significantly outperforms both GIF and JPEG on this image.

2. CALIC
The CALIC method uses both prediction and the context of the pixel value. The scheme works in two modes, one for gray-scale images and another for bi-level images. Here we will discuss the compression of gray-scale images. In an image, a given pixel generally has a value close to one of its neighbours. Which neighbour has the closest value depends on the local structure of the image.
Depending on whether there is a horizontal or vertical edge in the neighbourhood of the
pixel being encoded, the pixel above, or the pixel to the left, or some weighted average of
neighbouring pixels may give the best prediction. How close the prediction is to the pixel
being encoded depends on the surrounding texture. In a region of the image with a great
deal of variability, the prediction is likely to be further from the pixel being encoded than
in the regions with less variability.
In order to take into account all these factors, the algorithm has to make a determination of
the environment of the pixel to be encoded. The only information that can be used to make
this determination has to be available to both encoder and decoder.

Let’s take up the question of the presence of vertical or horizontal edges in the
neighbourhood of the pixel being encoded. To help our discussion, we will refer to Figure.
In this figure, the pixel to be encoded has been marked with an X. The pixel above is called the north (N) pixel, the pixel to the left is the west (W) pixel, and so on. Note that when pixel X is being encoded, all the other marked pixels (N, W, NW, NE, WW, NN, and NNE) are available to both encoder and decoder.

We can get an idea of what kinds of boundaries may or may not be in the neighbourhood of X
by computing
dh = |W − WW| + |N − NW| + |NE − N|
dv = |W − NW| + |N − NN| + |NE − NNE|
The relative values of dh and dv are used to obtain the initial prediction of the pixel X. This
initial prediction is then refined by taking other factors into account. If the value of dh is much
higher than the value of dv, this will mean there is a large amount of horizontal variation, and
it would be better to pick N to be the initial prediction. If, on the other hand, dv is much larger
than dh, this would mean that there is a large amount of vertical variation, and the initial
prediction is taken to be W. If the differences are more moderate or smaller, the predicted value
is a weighted average of the neighbouring pixels.
The exact algorithm used by CALIC to form the initial prediction is given by the following
pseudocode:
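The pseudocode figure is not reproduced in these notes. The following Python sketch shows the widely published form of CALIC's gradient-adjusted prediction (GAP); the thresholds 80, 32, and 8 are the conventionally published values and should be treated as an assumption here:

```python
def gap_predict(N, W, NE, NW, NN, WW, NNE):
    """Gradient-adjusted prediction (GAP): form the initial prediction of
    pixel X from its causal neighbours, using the local gradients."""
    dh = abs(W - WW) + abs(N - NW) + abs(NE - N)    # horizontal variation
    dv = abs(W - NW) + abs(N - NN) + abs(NE - NNE)  # vertical variation
    if dv - dh > 80:      # much more vertical variation -> predict W
        return W
    if dh - dv > 80:      # much more horizontal variation -> predict N
        return N
    # Moderate differences: weighted average of neighbouring pixels,
    # nudged toward W or N when one direction is somewhat more active.
    pred = (W + N) / 2 + (NE - NW) / 4
    if dv - dh > 32:
        pred = (pred + W) / 2
    elif dv - dh > 8:
        pred = (3 * pred + W) / 4
    elif dh - dv > 32:
        pred = (pred + N) / 2
    elif dh - dv > 8:
        pred = (3 * pred + N) / 4
    return pred

print(gap_predict(100, 100, 100, 100, 100, 100, 100))  # flat area -> 100.0
print(gap_predict(200, 0, 200, 200, 200, 0, 200))      # vertical variation -> 0 (W)
```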

Using the information about whether the pixel values are changing by large or small amounts
in the vertical or horizontal direction in the neighbourhood of the pixel being encoded provides
a good initial prediction. In order to refine this prediction, we need some information about the
interrelationships of the pixels in the neighbourhood. Using this information, we can generate
an offset or refinement to our initial prediction. We quantify the information about the
neighbourhood by first forming the vector
[N, W, NW, NE, NN, WW, 2N − NN, 2W − WW]
We then compare each component of this vector with our initial prediction X̂. If the value of the component is less than the prediction, we replace the value with a 1; otherwise we replace it with a 0. Thus, we end up with an eight-component binary vector. If each component of the
binary vector was independent, we would end up with 256 possible vectors. However, because
of the dependence of various components, we actually have 144 possible configurations. We
also compute a quantity that incorporates the vertical and horizontal variations and the previous
error in prediction by
δ = dh + dv + 2|N − N̂|
where N̂ is the predicted value of N. This range of values of δ is divided into four intervals,
each being represented by 2 bits. These four possibilities, along with the 144 texture
descriptors, create 144×4 = 576 contexts for X. As the encoding proceeds, we keep track of
how much prediction error is generated in each context and offset our initial prediction by that
amount. This results in the final predicted value.
Once the prediction is obtained, the difference between the pixel value and the prediction (the
prediction error, or residual) has to be encoded. While the prediction process outlined above
removes a lot of the structure that was in the original sequence, there is still some structure left
in the residual sequence. We can take advantage of some of this structure by coding the residual
in terms of its context. The context of the residual is taken to be the value of δ defined in
Equation. In order to reduce the complexity of the encoding, rather than using the actual value
as the context, CALIC uses the range of values in which δ lies as the context. Thus:

The values of q1–q8 can be prescribed by the user.


If the original pixel values lie between 0 and M−1, the differences or prediction residuals will
lie between –(M −1) and M −1. Even though most of the differences will have a magnitude
close to zero, for arithmetic coding we still have to assign a count to all possible symbols. This
means a reduction in the size of the intervals assigned to values that do occur, which in turn
means using a larger number of bits to represent these values. The CALIC algorithm attempts
to resolve this problem in a number of ways. Let’s describe these using an example.
Consider the sequence
xn: 0, 7, 4, 3, 5, 2, 1, 7
We can see that all the numbers lie between 0 and 7, a range of values that would require 3 bits
to represent. Now suppose we predict a sequence element by the previous element in the
sequence. The sequence of differences
rn = xn − xn−1
is given by
rn: 0, 7, −3, −1, 2, −3, −1, 6
If we were given this sequence, we could easily recover the original sequence by using
xn = xn−1 + rn
However, the prediction residual values rn lie in the range [−7, 7]. That is, the alphabet required to represent these values is almost twice the size of the original alphabet. However, if we look closely we can see that the value of rn actually lies between −xn−1 and 7 − xn−1.
The smallest value that rn can take on occurs when xn has a value of 0, in which case rn will have a value of −xn−1. The largest value that rn can take on occurs when xn is 7, in which case rn has a value of 7 − xn−1. In other words, given a particular value for xn−1, the number of different values that rn can take on is the same as the number of values that xn can take on. Generalizing from this, we can see that if a pixel takes on values between 0 and M − 1, then given a predicted value X̂, the difference X − X̂ will take on values in the range −X̂ to M − 1 − X̂. We can use this fact to map the difference values into the range [0, M − 1], using the following mapping:
0 → 0
1 → 1
−1 → 2
2 → 3
−2 → 4
⋮
−X̂ → 2X̂
X̂ + 1 → 2X̂ + 1
X̂ + 2 → 2X̂ + 2
⋮
M − 1 − X̂ → M − 1
where we have assumed that X̂ ≤ (M − 1)/2.
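Under the stated assumption X̂ ≤ (M − 1)/2, the mapping interleaves positive and negative residuals until the negative side runs out; a sketch (the helper name is hypothetical):

```python
def map_residual(r, pred):
    """Map a prediction residual r in [-pred, M-1-pred] into [0, M-1].
    Positives and negatives are interleaved (0, 1, -1, 2, -2, ...) while
    both remain in range; assumes pred <= (M-1)/2, so the negative
    values run out first and only positives remain afterwards."""
    if r == 0:
        return 0
    if abs(r) <= pred:                 # interleaved zone
        return 2 * r - 1 if r > 0 else -2 * r
    return r + pred                    # pred+1 -> 2*pred+1, ..., M-1-pred -> M-1

# With M = 8 and a predicted value of 3, residuals span [-3, 4]:
M, pred = 8, 3
mapped = [map_residual(r, pred) for r in range(-pred, M - pred)]
print(sorted(mapped))                  # [0, 1, 2, 3, 4, 5, 6, 7]
```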
Another approach used by CALIC to reduce the size of its alphabet is to use a modification of
a technique called recursive indexing. Recursive indexing is a technique for representing a
large range of numbers using only a small set. It is easiest to explain using an example. Suppose
we want to represent positive integers using only the integers between 0 and 7—that is, a
representation alphabet of size 8. Recursive indexing works as follows: If the number to be
represented lies between 0 and 6, we simply represent it by that number.
If the number to be represented is greater than or equal to 7, we first send the number 7, subtract
7 from the original number, and repeat the process. We keep repeating the process until the
remainder is a number between 0 and 6. Thus, for example, 9 would be represented by 7
followed by a 2, and 17 would be represented by two 7s followed by a 3. The decoder, when it
sees a number between 0 and 6, would decode it at its face value, and when it saw 7, would
keep accumulating the values until a value between 0 and 6 was received.
This method of representation followed by entropy coding has been shown to be optimal for
sequences that follow a geometric distribution.
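The scheme described above can be sketched as follows (the function names are illustrative):

```python
def recursive_index(x, m=8):
    """Represent a non-negative integer x using only symbols 0..m-1,
    where the largest symbol (m-1) means 'subtract m-1 and continue'."""
    symbols = []
    while x >= m - 1:
        symbols.append(m - 1)      # emit the escape symbol
        x -= m - 1
    symbols.append(x)              # final remainder, between 0 and m-2
    return symbols

def recursive_decode(symbols):
    """The decoder accumulates values until one below m-1 arrives."""
    return sum(symbols)

print(recursive_index(9))          # [7, 2]
print(recursive_index(17))         # [7, 7, 3]
```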
In CALIC, the representation alphabet is different for different coding contexts. For each
coding context k, we use an alphabet Ak= {0,1, ……., Nk}. Furthermore, if the residual occurs
in context k, then the first number that is transmitted is coded with respect to context k; if
further recursion is needed, we use the k+1 context.
We can summarize the CALIC algorithm as follows:
1. Find the initial prediction X̂.
2. Compute prediction context.
3. Refine prediction by removing the estimate of the bias in that context.
4. Update bias estimate.
5. Obtain the residual and remap it so the residual values lie between 0 and M−1, where
M is the size of the initial alphabet.
6. Find the coding context k.
7. Code the residual using the coding context.
All these components working together have kept CALIC as the state of the art in lossless
image compression. However, we can get almost as good a performance if we simplify some
of the more involved aspects of CALIC.

3. JPEG-LS

JPEG-LS compression standards


The JPEG-LS standard looks more like CALIC than like the old JPEG standard. When the initial proposals for the new lossless compression standard were compared, CALIC was rated first in six of the seven categories of images tested. Motivated by some aspects of CALIC, a team from Hewlett-Packard proposed a much simpler predictive coder under the name LOCO-I (for low complexity) that still performed close to CALIC.
As in CALIC, the standard has both a lossless and a lossy mode. We will not describe the lossy
coding procedures.
The initial prediction is obtained using the following algorithm:
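The algorithm box is missing from these notes. JPEG-LS's initial prediction is the well-known median edge detection (MED) predictor; a sketch, assuming its standard formulation:

```python
def med_predict(N, W, NW):
    """JPEG-LS initial prediction (median edge detector): pick W or N
    when NW suggests a horizontal or vertical edge, otherwise use the
    planar estimate N + W - NW. Equivalent to median(N, W, N + W - NW)."""
    if NW >= max(N, W):
        return min(N, W)           # edge: NW is the largest neighbour
    if NW <= min(N, W):
        return max(N, W)           # edge: NW is the smallest neighbour
    return N + W - NW              # smooth area: planar estimate

print(med_predict(100, 10, 100))   # 10  (edge above -> predict W)
print(med_predict(50, 40, 45))     # 45  (smooth area -> planar estimate)
```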

This prediction approach is a variation of Median Adaptive Prediction, in which the predicted
value is the median of the N, W, and NW pixels. The initial prediction is then refined using the
average value of the prediction error in that particular context.
The contexts in JPEG-LS also reflect the local variations in pixel values. However, they are
computed differently from CALIC. First, measures of differences D1, D2, and D3 are computed
as follows:
D1 = NE − N
D2 = N − NW
D3 = NW − W
The values of these differences define a three-component context vector Q. The components
of Q (Q1, Q2, and Q3) are defined by the following mappings:

Where T1, T2, and T3 are positive coefficients that can be defined by the user. Given nine
possible values for each component of the context vector, this results in 9×9×9 = 729 possible
contexts. In order to simplify the coding process, the number of contexts is reduced by
replacing any context vector Q whose first nonzero element is negative by −Q. Whenever this
happens, a variable SIGN is also set to −1; otherwise, it is set to +1. This reduces the number
of contexts to 365. The vector Q is then mapped into a number between 0 and 364.
(The standard does not specify the particular mapping to use.)
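The component mappings appear only as a figure in the source. A sketch of the usual JPEG-LS-style quantization of one difference Di into nine levels; the default thresholds T1 = 3, T2 = 7, T3 = 21 are assumptions, since the notes say only that they are user-definable:

```python
def quantize_diff(d, t1=3, t2=7, t3=21):
    """Map a local gradient D into one of nine levels -4..4. The
    thresholds (3, 7, 21) are typical JPEG-LS defaults, assumed here."""
    if d <= -t3: return -4
    if d <= -t2: return -3
    if d <= -t1: return -2
    if d < 0:    return -1
    if d == 0:   return 0
    if d < t1:   return 1
    if d < t2:   return 2
    if d < t3:   return 3
    return 4

levels = {quantize_diff(d) for d in range(-30, 31)}
print(len(levels))                      # 9 levels per component
print(9 ** 3, (9 ** 3 - 1) // 2 + 1)   # 729 raw contexts, 365 after sign merging
```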
The variable SIGN is used in the prediction refinement step. The correction is first multiplied
by SIGN and then added to the initial prediction.
The prediction error rn is mapped into an interval that is the same size as the range occupied by
the original pixel values. The mapping used in JPEG-LS is as follows:

Finally, the prediction errors are encoded using adaptively selected codes based on Golomb
codes, which have also been shown to be optimal for sequences with a geometric distribution.
In Table we compare the performance of the old and new JPEG standards and CALIC. The
results for the new JPEG scheme were obtained using a software implementation courtesy of
HP.

We can see that for most of the images the new JPEG standard performs very close to CALIC
and outperforms the old standard by 6% to 18%. The only case where the performance is not
as good is for the Omaha image. While the performance improvement in these examples may
not be very impressive, we should keep in mind that for the old JPEG we are picking the best
result out of eight. In practice, this would mean trying all eight JPEG predictors and picking
the best. On the other hand, both CALIC and the new JPEG standard are single-pass algorithms.
Furthermore, because of the ability of both CALIC and the new standard to function in multiple
modes, both perform very well on compound documents, which may contain images along
with text.

Gray codes:
The reflected binary code, or Gray code, is an ordering of the binary numeral system such that two successive values differ in only one bit (binary digit). Gray codes are useful because, in the normal sequence of binary numbers generated by hardware, several bits may change at once during the transition from one number to the next, which can cause errors or ambiguity. The Gray code eliminates this problem, since only one bit changes its value during any transition between two numbers.

The Gray code is not weighted; that is, it does not depend on the positional value of the digits. It is a cyclic code: every transition from one value to the next involves only one bit change.

There is a very simple method to obtain the Gray code from a binary number. The steps for an n-bit binary number are as follows:

 The most significant bit (MSB) of the Gray code is always equal to the MSB of the
given binary code.
 Each other bit of the Gray code is obtained by XORing the binary bit at that index
with the binary bit at the previous (more significant) index.
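The steps above amount to XORing each binary bit with its more significant neighbour, which for a whole word is simply `n ^ (n >> 1)`:

```python
def binary_to_gray(n):
    """MSB stays; every other Gray bit is the XOR of adjacent binary
    bits, which is exactly n XOR (n shifted right by one)."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Inverse: each binary bit is the XOR of all higher Gray bits."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# Successive values differ in exactly one bit:
codes = [binary_to_gray(i) for i in range(8)]
print([format(c, '03b') for c in codes])
# ['000', '001', '011', '010', '110', '111', '101', '100']
```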

Discrete cosine transform and its application in lossy image compression:

Types of DCT:
1. One-dimensional DCT
2. Two-dimensional DCT
In multimedia compression we use two-dimensional DCT

1. One-dimensional DCT:
The DCT in one dimension is given by

The input is a set of n data values (pixels, audio samples, or other data) and the output
is a set of n DCT transform coefficients (or weights). The first coefficient is called
the DC coefficient and the rest are referred to as the AC coefficients (these terms have
been inherited from electrical engineering, where they stand for “direct current” and
“alternating current”). Notice that the coefficients are real numbers even if the input data
consists of integers. Similarly, the coefficients may be positive or negative even if the
input data consists of nonnegative numbers only. This computation is straightforward but
slow. The decoder inputs the DCT coefficients in sets of n and uses the inverse DCT
(IDCT) to reconstruct the original data values (also in groups of n). The IDCT in one
dimension is given by

The important feature of the DCT, the feature that makes it so useful in data compression,
is that it takes correlated input data and concentrates its energy in just the first few
transform coefficients. If the input data consists of correlated quantities, then most of the
n transform coefficients produced by the DCT are zeros or small numbers, and only a
few are large (normally the first ones). We will see that the early coefficients contain the
important (low-frequency) image information and the later coefficients contain the less-
important (high-frequency) image information. Compressing data with the DCT is
therefore done by quantizing the coefficients. The small ones are quantized coarsely
(possibly all the way to zero) and the large ones can be quantized finely to the nearest
integer. After quantization, the coefficients (or variable-size codes assigned to the
coefficients) are written on the compressed stream. Decompression is done by
performing the inverse DCT on the quantized coefficients. This results in data items that
are not identical to the original ones but are not much different.
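The transform equations appear only as figures in the source. A direct (slow, O(n²)) sketch of the standard orthonormal one-dimensional DCT-II and its inverse, illustrating the energy compaction described above:

```python
import math

def dct1(p):
    """Orthonormal one-dimensional DCT (DCT-II), computed directly."""
    n = len(p)
    G = []
    for f in range(n):
        c = math.sqrt(1.0 / n) if f == 0 else math.sqrt(2.0 / n)
        G.append(c * sum(p[t] * math.cos((2 * t + 1) * f * math.pi / (2 * n))
                         for t in range(n)))
    return G

def idct1(G):
    """Inverse DCT: reconstructs the n data values from n coefficients."""
    n = len(G)
    return [sum((math.sqrt(1.0 / n) if f == 0 else math.sqrt(2.0 / n))
                * G[f] * math.cos((2 * t + 1) * f * math.pi / (2 * n))
                for f in range(n))
            for t in range(n)]

p = [12, 10, 8, 10, 12, 10, 8, 11]      # correlated input samples
G = dct1(p)
print(round(G[0], 2))                    # 28.64 -- large DC coefficient
print([round(g, 2) for g in G[1:]])      # small AC coefficients
```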

2. Two-dimensional DCT:
The DCT in one dimension can be used to compress one-dimensional data, such as
audio samples. This chapter, however, discusses image compression which is based on
the two-dimensional correlation of pixels (a pixel tends to resemble all its near
neighbours, not just those in its row). This is why practical image compression methods
use the DCT in two dimensions. This version of the DCT is applied to small parts (data
blocks) of the image. It is computed by applying the DCT in one dimension to each row
of a data block, then to each column of the result. Because of the special way the DCT
in two dimensions is computed, we say that it is separable in the two dimensions.
Because it is applied to blocks of an image, we term it a “blocked transform.” It is
defined by

For 0 ≤ i ≤ n−1 and 0 ≤ j ≤ m−1 and for Ci and Cj defined by Equation (4.13). The first
coefficient G00 is again termed the “DC coefficient” and the remaining coefficients are
called the “AC coefficients.”
The image is broken up into blocks of n×m pixels pxy (with n = m = 8 typically), and
Equation (4.15) is used to produce a block of n×m DCT coefficients Gij for each block
of pixels. The coefficients are then quantized, which results in lossy but highly efficient
compression. The decoder reconstructs a block of quantized data values by computing
the IDCT whose definition is

for 0 ≤ x ≤ n − 1 and 0 ≤ y ≤ m − 1. We now show one way to compress an entire image
with the DCT in several steps as follows:

1. The image is divided into k blocks of 8×8 pixels each. The pixels are denoted by pxy.
If the number of image rows (columns) is not divisible by 8, the bottom row (rightmost
column) is duplicated as many times as needed.
2. The DCT in two dimensions [Equation (4.15)] is applied to each block B^(i). The result
is a block (we'll call it a vector) W^(i) of 64 transform coefficients w_j^(i) (where j = 0, 1,
. . . , 63). The k vectors become the rows of matrix W.

3. The 64 columns of W are denoted by C^(0), C^(1), . . . , C^(63). The k elements of C^(j)
are (w_j^(1), w_j^(2), . . . , w_j^(k)). The first coefficient vector C^(0) consists of the k DC
coefficients.
4. Each vector C^(j) is quantized separately to produce a vector Q^(j) of quantized
coefficients (JPEG does this differently). The elements of Q^(j) are then written on the
compressed stream. In practice, variable-size codes are assigned to the elements, and
the codes, rather than the elements themselves, are written on the compressed stream.
Sometimes, as in the case of JPEG, variable-size codes are assigned to runs of zero
coefficients, to achieve better compression.
In practice, the DCT is used for lossy compression. For lossless compression (where
the DCT coefficients are not quantized) the DCT is inefficient but can still be used, at
least theoretically, because (1) most of the coefficients are small numbers and (2) there
often are runs of zero coefficients. However, the small coefficients are real numbers,
not integers, so it is not clear how to write them in full precision on the compressed
stream and still have compression. Other image compression methods are better suited
for lossless image compression.
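The separability property described above can be sketched directly: apply the one-dimensional DCT to every row of the block, then to every column of the result. The helper names below are illustrative, not from the text:

```python
import math

def dct1(p):
    """Orthonormal 1-D DCT-II (direct computation)."""
    n = len(p)
    return [(math.sqrt(1.0 / n) if f == 0 else math.sqrt(2.0 / n))
            * sum(p[t] * math.cos((2 * t + 1) * f * math.pi / (2 * n))
                  for t in range(n))
            for f in range(n)]

def dct2(block):
    """2-D DCT by separability: 1-D DCT on every row, then every column."""
    rows = [dct1(row) for row in block]
    cols = zip(*rows)                             # transpose
    out_cols = [dct1(list(col)) for col in cols]
    return [list(row) for row in zip(*out_cols)]  # transpose back

# A constant 8x8 block: all energy collapses into the DC coefficient G00.
block = [[100] * 8 for _ in range(8)]
G = dct2(block)
print(round(G[0][0], 6))                          # 800.0
print(max(abs(G[i][j]) for i in range(8) for j in range(8)
          if (i, j) != (0, 0)) < 1e-9)            # True
```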

Limitations of JPEG Standard in lossy image compression:

1. JPEG offers excellent quality at high and mid bit rates. But at low bit rates (e.g. below 0.25 bits per pixel) the quality of JPEG becomes unacceptable.
2. JPEG cannot provide superior performance at both lossless and lossy compression.
3. The current JPEG standard provides some resynchronization markers, but the quality still degrades when bit errors are encountered.
4. JPEG was optimized for natural images; therefore its performance on computer-generated images and bi-level text images is poor.
5. Each successive recompression of a JPEG image degrades its quality.

Advantages of JPEG in lossy image compression:


1. JPEG is extremely portable.
2. JPEG is compatible with almost every image processing application.
3. It is easy to print JPEG images.
4. JPEG images can be stored quickly from a camera to a storage device.
5. The size of JPEG images can be reduced by compression, which makes this file format suitable for transferring images over the internet because it consumes less bandwidth.
6. A JPEG image can be compressed down to 5% of its original size.

Applications of JPEG in lossy image compression:


1. Digital photography.
2. Internet imaging.
3. Image and video editing.
4. Security video cameras.
5. Medical image compression.
6. High speed video capture.

Quantization:
After each 8×8 matrix of DCT coefficients Gij is calculated, it is quantized. This is the step
where the information loss (except for some unavoidable loss because of finite precision
calculations in other steps) occurs. Each number in the DCT coefficients matrix is divided by
the corresponding number from the particular “quantization table” used, and the result is
rounded to the nearest integer. As has already been mentioned, three such tables are needed,
for the three color components. The JPEG standard allows for up to four tables, and the user
can select any of the four for quantizing each color component. The 64 numbers that constitute
each quantization table are all JPEG parameters. In principle, they can all be specified and fine-
tuned by the user for maximum compression. In practice, few users have the patience or
expertise to experiment with so many parameters, so JPEG software normally uses the
following two approaches:

1. Default quantization tables. Two such tables, for the luminance (grayscale) and the
chrominance components, are the result of many experiments performed by the JPEG
committee. They are included in the JPEG standard and are reproduced here as Table
4.64. It is easy to see how the quantization coefficients (QCs) in the table generally
grow as we move from the upper-left corner to the bottom-right one. This is how
JPEG reduces the DCT coefficients with high spatial frequencies.
2. A simple quantization table Q is computed, based on one parameter R specified by the
user. A simple expression such as Qij = 1+ (i + j) × R guarantees that QCs start small
at the upper-left corner and get bigger toward the lower-right corner. Table 4.65 shows
an example of such a table with R = 2.
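The divide-and-round quantization step, together with the simple R-based table, can be sketched as follows (a minimal Python illustration; the DCT coefficient values in G are invented for the example):

```python
R = 2  # user-chosen parameter controlling how fast the QCs grow

# Simple quantization table: Qij = 1 + (i + j) * R
Q = [[1 + (i + j) * R for j in range(8)] for i in range(8)]

# A hypothetical 8x8 block of DCT coefficients (values invented for the example)
G = [[0.0] * 8 for _ in range(8)]
G[0][0] = 1118.0   # large DC coefficient in the upper-left corner
G[0][1] = 62.0
G[1][0] = -55.0
G[2][2] = 12.0

# Quantize: divide each coefficient by the matching table entry and round
# to the nearest integer; high-frequency (lower-right) entries mostly become zero.
quantized = [[round(G[i][j] / Q[i][j]) for j in range(8)] for i in range(8)]
```

With R = 2 the table entry grows from 1 at the top-left to 29 at the bottom-right, so small high-frequency coefficients quantize to zero while the DC term survives almost unchanged.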
If the quantization is done correctly, very few nonzero numbers will be left in the DCT
coefficients matrix, and they will typically be concentrated in the upper-left region. These
numbers are the output of JPEG, but they are further compressed before being written on the
output stream. In the JPEG literature this compression is called “entropy coding,” and Section
4.8.4 shows in detail how it is done. Three techniques are used by entropy coding to compress
the 8 × 8 matrix of integers:

Zig-Zag coding sequences:
The 64 numbers are collected by scanning the matrix in zigzags (Figure 1.8b). This produces
a string of 64 numbers that starts with some nonzeros and typically ends with many consecutive
zeros. Only the nonzero numbers are output (after further compressing them) and are followed
by a special end-of-block (EOB) code. This way there is no need to output the trailing zeros
(we can say that the EOB is the run-length encoding of all the trailing zeros).
There is a simple, practical way to write a loop that traverses an 8 × 8 matrix in zigzag. First
figure out the zigzag path manually, then record it in an array zz of structures, where each
structure contains a pair of coordinates for the path as shown,
e.g., in Figure 4.66.
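The array-of-coordinates idea described above can be sketched in Python. Here the zz path is generated programmatically from the anti-diagonal structure rather than recorded by hand, and a made-up block is scanned to show how the trailing zeros collapse into the EOB code:

```python
def zigzag_path(n=8):
    """Return the (row, col) pairs that visit an n x n matrix in zigzag order."""
    path = []
    for s in range(2 * n - 1):          # each anti-diagonal satisfies row + col = s
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diag.reverse()              # even diagonals run bottom-left to top-right
        path.extend(diag)
    return path

zz = zigzag_path()

# Scan a made-up 8x8 block of quantized DCT coefficients along the path.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0] = 15, 3, -2
scanned = [block[i][j] for i, j in zz]
while scanned and scanned[-1] == 0:     # trailing zeros would be replaced by the EOB code
    scanned.pop()
```

After the trim, `scanned` holds only the leading nonzero run ([15, 3, -2] here); in the real encoder those values are further compressed and followed by the EOB marker.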

Pulse code modulation and differential pulse code modulation methods of
image compression:
- In a PCM system, the signal x (t) is sampled at a rate which is slightly higher than the
Nyquist rate.
- It is observed that the resulting sampled signal has a high correlation between
adjacent samples, because the signal x(t) generally does not change rapidly from one
sample to the next.
- Therefore the difference in amplitudes of adjacent samples is very small, as shown in
Fig. 2.8.1.
- When these highly correlated samples are encoded using a standard PCM system, the
resulting encoded PCM signal contains redundant information. Redundant bits do not
contain any new information.
- By removing this redundancy before encoding, we can obtain a more efficiently coded
signal. The DPCM system operates on this principle. In DPCM system a special circuit
called "predictor" is used.
- The "predictor" can actually predict the values of the future samples of x (t). This helps
in reducing the redundancy.

Role of a Predictor:
- It is observed that if the sampling takes place at a rate which is higher than the Nyquist
rate, then there is a correlation between successive samples of the signal x (t).
- Hence, if we know the past sample values or the difference, we can predict the range
of the next required increment or decrement in x(t) at the predictor output.
- This reduces the difference, or error, between x(t) and its predicted value x̂(t).
Therefore, to encode this small error value the DPCM system requires fewer bits,
which ultimately reduces the bit rate. This is the role of the predictor in a DPCM system.

DPCM Transmitter:
- The DPCM transmitter block diagram is shown below:

- Suppose that a baseband signal x(t) is sampled at a rate fs = 1/Ts to produce the
sampled signal {x(nTs)}, where n is an integer. This sequence acts as the input signal
to the DPCM transmitter.
- Let the predictor produce a predicted version of the sampled input, and let the
predictor output be denoted by x̂(nTs).
- The predictor output is subtracted from the sampled input to obtain a difference signal
e(nTs) as follows:
e(nTs) = x(nTs) − x̂(nTs)
- The predicted value x̂(nTs) is produced by the predictor, whose input consists of a
quantized version of the input signal x(nTs).
- The difference signal e(nTs) is called the prediction error, because it represents the
difference between the sample and its predicted value.
- The quantizer output v(nTs) is encoded to obtain the digital pulses, i.e. the DPCM signal.
- Let the input-output characteristic of the quantizer be denoted by a nonlinear function
Q(·).
- Referring to Fig. 2.8.2 we get the quantizer output as

v(nTs) = Q[e(nTs)]
= e(nTs) + q(nTs)

where q(nTs) is the quantization error.

- Referring to Fig. 2.8.2, the predictor input is given by

u(nTs) = x̂(nTs) + v(nTs)

- Substituting the expression for v(nTs) we get

u(nTs) = x̂(nTs) + e(nTs) + q(nTs)

- But x̂(nTs) + e(nTs) = x(nTs)
∴ u(nTs) = x(nTs) + q(nTs)

This is nothing but a quantized version of the input x(nTs).
- Thus the quantized signal u(nTs) at the predictor input differs from the original input
signal only by q(nTs), i.e. the quantization error.

DPCM Receiver:

The block diagram of a DPCM receiver is shown in Fig. 2.8.3.

- The DPCM signal is applied to the decoder for reconstructing the quantized version of
the input.
- The decoder output is actually the reconstructed quantized error signal.
- This signal is then added to the predictor output to produce the original signal.
- The predictor used at the receiver is the same as the one at the transmitter.

Receiver output = e(nTs) + x̂(nTs)
= x(nTs)

- This is same as the input signal applied to the transmitter.
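The transmitter and receiver loops above can be sketched in Python with a first-order predictor (the prediction x̂ is simply the previous reconstructed sample u) and a uniform quantizer of step DELTA; both the predictor order and the step size are illustrative assumptions, since the notes do not fix a particular predictor or quantizer:

```python
DELTA = 2.0  # quantizer step size (an illustrative choice)

def quantize(e):
    """Uniform mid-tread quantizer Q(e): round the prediction error to a multiple of DELTA."""
    return DELTA * round(e / DELTA)

def dpcm_encode(samples):
    """Return the quantized prediction errors v(n); the predictor output x^(n)
    is simply the previous reconstructed sample u(n-1) (first-order prediction)."""
    pred, out = 0.0, []
    for x_n in samples:
        e = x_n - pred        # e(n) = x(n) - x^(n)
        v = quantize(e)       # v(n) = Q[e(n)] = e(n) + q(n)
        pred = pred + v       # u(n) = x^(n) + v(n), fed back into the predictor
        out.append(v)
    return out

def dpcm_decode(errors):
    """Receiver: add each decoded error to the predictor output to rebuild u(n)."""
    pred, out = 0.0, []
    for v in errors:
        pred = pred + v
        out.append(pred)
    return out

x = [0.0, 1.2, 2.1, 3.3, 3.9]
rec = dpcm_decode(dpcm_encode(x))
# Because the encoder's predictor tracks the receiver's reconstruction exactly,
# each reconstructed sample differs from x(n) by at most DELTA / 2, the quantization error.
```

Note that errors do not accumulate: the encoder predicts from its own quantized reconstruction, so the receiver output always equals x(n) + q(n) for the current sample only.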

Video compression:

A video is a sequence of images. They are displayed at a constant rate of 24 or 30 images per
second. Therefore we can compress video by compressing images. The two standards used for
image compression are:

1. Joint Photographic Experts Group (JPEG).

2. Moving Picture Experts Group (MPEG).

Out of these the JPEG is used to compress still images whereas the MPEG is used for
compressing moving pictures.

MPEG industry standard:


Principle:
- MPEG is another compression standard. MPEG stands for Moving Picture Experts
Group.
- These algorithms are standardized to compress moving pictures.
- Digital motion video compression can be accomplished with the JPEG still image
standard if you have fast enough hardware to process 30 images per second.
- However, the maximum compression potential cannot be achieved because the
redundancy between frames is not being fully exploited by the JPEG standard.
- Furthermore, there are many other things to be considered in compressing and
decompressing motion video, as indicated in the objectives.

Objectives of MPEG:

- As with the JPEG standard, the MPEG standard is intended to be generic, meaning that
it will support the needs of many applications.
- As such, it can be considered a motion video compression toolkit, from which a user
selects the particular features that an application needs. More specific objectives are:
1. The standard will deliver acceptable video quality at compressed data rates between
1.0 and 1.5 Mbps.
2. It will support either symmetric or asymmetric compress/decompress applications.
3. Random-access playback to any specified degree is possible, provided the
compression process takes it into account.
4. Similarly, when the compression process takes them into account, fast-forward,
fast-reverse, or normal-reverse playback modes can be made available in addition
to normal (forward) playback.
5. Audio/Video synchronization will be maintained.
6. Catastrophic behaviour in the presence of data errors should be avoidable.
7. When it is required, compression-decompression delay can be controlled.
8. Editability should be available when required by the application.
9. There should be sufficient format flexibility to support playing of video in windows.
10. The processing requirements should not preclude the development of low-cost
chipsets that are capable of encoding in real time.
- As you can see, some of these objectives are conflicting, and they all conflict with the
objectives of cost and quality.
- In spite of that the proposed standard provides for all of the objectives, but of course
not all at once.
- A proposed application has to make its own choices about which features of the
standard it requires and accept any trade-off that this may cause.

Various MPEG standards:

The MPEG compression standards are among the most popular compression techniques.
Various MPEG standards are as follows:

1. MPEG 1: For CD-ROM quality video (1.5 Mbps).


2. MPEG 2: For high quality DVD video (3-6 Mbps).
3. MPEG 4: For object oriented video compression.
The H.261 video compression standards are also very popular in the Internet.

MPEG 1:

- MPEG-1 was the first video compression procedure, intended only for systems that
use progressive scanning; it was not meant for interlaced scanning.
- The compression ratio used in MPEG-1 is 100:1. That means an original signal of 150
Mbps can be compressed to 1.5 Mbps using MPEG-1.
- The MPEG-1 standard was first published in 1993; it also supported the two-channel
stereo application.
- MPEG-1 Audio Layer 3 is also known as MP3. Do not confuse it with MPEG-3.

Main features of MPEG-1:

1. It is a lossy compression standard for audio and video.
2. It compresses digital video and CD audio down to 1.5 Mbps.
3. Compression ratios of 26:1 and 6:1 can be achieved for video and audio respectively.
4. It supports resolutions up to 4095 × 4095 (12 bits) and a bit rate up to 100 Mbps.
5. It does not support surround sound, interlaced scanning or HDTV.
6. It supports only one chroma subsampling scheme, i.e. 4:2:0.

Limitations of MPEG-1:

1. It is not suitable for surround sound system. It supports only the two channel stereo.
2. It cannot be applied to the interlaced scanning (TV).
3. It cannot be used for HDTV.
4. It is an audio compression system limited to only two channels (stereo).
5. It provides very poor compression when used for interlaced video.
6. It is not suitable for higher resolution videos.
7. MPEG-1 can support only one chroma subsampling i.e. 4:2:0.

MPEG-2:
- MPEG-2 has evolved out of the shortcomings of MPEG-1.
- MPEG-2 should not be confused with MPEG-1 Audio Layer II (MP2).
- MPEG-2 is also called H.262 as defined by the ITU. It is the standard for “the generic
coding of moving pictures and associated audio information”.

Description:

- The key techniques used in MPEG-2 codecs include intraframe Discrete Cosine
Transform (DCT) coding and motion compensated interframe prediction.
- The MPEG-2 standard allows the encoding of video over a wide range of resolutions,
including higher resolutions commonly known as HDTV.
- MPEG-2 is a combination of lossy video compression and lossy audio compression
methods, based on motion vector estimation, discrete cosine transform (DCT),
quantization and Huffman encoding.
- Although MPEG-2 is not as efficient as newer standards such as H.264/AVC and
H.265/HEVC, it is still widely used in over-the-air transmission of digital TV and in
the DVD-Video standard.

Main Characteristics of MPEG-2:

- MPEG-2 is widely used as the format of digital television signals that are broadcast
over the air, by cable, or by direct broadcast satellite (DBS) TV systems.
- MPEG-2 is also used to specify the format of movies that are stored on DVDs and
other discs.
- MPEG-2 governs the design of TV stations, TV receivers, DVD players and other
related equipment.
- It is second of many MPEG standards and it is an international standard (ISO/IEC
13818).
- Parts 1 and 2 of this standard were developed in collaboration with ITU-T; they are
also published as ITU-T H.222.0 (systems) and H.262 (video) respectively.
- MPEG-2 is the core of most digital TV and DVD formats. Yet it does not completely
specify them.

Video Compression in MPEG-2:

- Part 2 of MPEG-2 is called its video section. It is very similar to the earlier
MPEG-1 standard, but with additional features.
- It also provides support for interlaced video.
- MPEG-2 video is not optimized for low bit rates especially less than 1 Mbps at
standard resolutions.
- The MPEG-2 is fully backward compatible with MPEG-1 video format.
- MPEG-2 video is formally known as ISO/IEC 13818-2 and as ITU-T Rec. H.262.
- The enhanced version of MPEG-2 video can be used even for HDTV transmission and
the ATSC digital TV.

Audio Compression in MPEG-2:

MPEG-2 introduces new audio encoding methods compared to MPEG-1.

MPEG-2 Part 3:

- The audio section of MPEG-2 is defined in the part 3 of the standard.


- It improves upon MPEG-1's audio by allowing coding with more than two channels,
up to 5.1 multichannel audio.

- This method is backward compatible with MPEG-1 audio and is therefore called
MPEG-2 BC, where BC stands for Backward Compatible.

MPEG-2 Part 7:

- Part 7 of the MPEG-2 standard specifies a non-backward-compatible audio format; it
is therefore called MPEG-2 NBC.
- It is also called MPEG-2 AAC. It is not compatible with MPEG-1 audio, but it is
more efficient and less complicated than MPEG-1 audio.
- It can support up to 48 audio channels at sampling rates from 8 to 96 kHz.

Image Compression in MPEG-2:


MPEG Picture types:
- MPEG-2 takes advantage of high correlation that exists between successive pictures of
a video to compress a series of moving images.
- MPEG constructs three types of pictures namely:

1. Intra pictures (I-pictures)


2. Predicted pictures (P-pictures)
3. Bidirectional predicted pictures (B-pictures)

- In MPEG every Mth picture in a sequence can be fully compressed by using a standard
MPEG algorithm, these are I-pictures.
- Then the successive I-pictures are compared and the portion of the image that have
moved are identified.
- The image sections which do not move are carried forward in time domain to
intermediate pictures by the decoder memory.
- Then a subset of intermediate pictures is selected and the prediction and correction of
locations of the image section which have moved, is carried out.
- These predicted and corrected images are the P-pictures.
- The pictures between I and P-pictures are the B-pictures. They incorporate the
stationary image sections uncovered by the moving sections.
- Fig. 3.11.1 shows the relative position of these pictures.
- P- and B-pictures are allowed but not required, and their number between successive
I-pictures can vary.
- It is possible to form a sequence without P or B pictures but a sequence without I
pictures is not possible.
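The arrangement of I-, P-, and B-pictures can be illustrated with a short sketch. The spacing used here (an I-picture every 12 frames and an anchor picture every 3) is a commonly used group-of-pictures pattern chosen purely for illustration; as noted above, the number of P- and B-pictures is not fixed by the standard:

```python
def gop_pattern(n=12, m=3):
    """Return the picture-type sequence of one group of pictures:
    n is the I-picture spacing, m the anchor (I/P) spacing; B-pictures fill the gaps."""
    types = []
    for k in range(n):
        if k == 0:
            types.append("I")       # fully self-contained intra picture
        elif k % m == 0:
            types.append("P")       # predicted from the previous anchor
        else:
            types.append("B")       # bidirectionally predicted from surrounding anchors
    return "".join(types)

print(gop_pattern())  # IBBPBBPBBPBB
```

A decoder needs the following I- or P-picture before it can reconstruct the B-pictures between two anchors, which is why transmission order differs from display order in practice.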

Compression process:
- The I-pictures are compressed as if they were JPEG images.
- Each compressed image is then divided into macroblocks of four 8 × 8 blocks.
- The macroblocks are downsampled for subsequent compression of the chrominance
component, as shown in Fig. 3.11.2

- The first step in MPEG is to identify the macroblocks that have moved between the I-pictures.
- Next step is to form the P-frame between the I-pictures. Each macro block is placed at
its predicted location on the P frame and is cross-correlated in its neighbourhood to
determine the true location of the macro block in P-frame.
- The difference between the predicted and true position of the macro block represents
the error in prediction.
- This error is compressed using DCT and used for correcting the P-frame.
- Fig. 3.11.3 shows the macro block shift between I- pictures and an intermediate P-
picture.
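The cross-correlation search described above is usually implemented as block matching: the macroblock is slid over a small neighbourhood of its predicted position in the reference picture, and the displacement with the smallest sum of absolute differences (SAD) is taken as the motion vector. A minimal full-search sketch (the 4 × 4 block size and ±2 search range are illustrative choices, not MPEG parameters):

```python
def sad(ref, cur, bx, by, dx, dy, bs):
    """Sum of absolute differences between the block of cur at (bx, by)
    and the block of ref displaced by (dx, dy)."""
    total = 0
    for y in range(bs):
        for x in range(bs):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def best_motion_vector(ref, cur, bx, by, bs=4, search=2):
    """Full search over a (2*search+1)^2 window; returns the (dx, dy) with minimum SAD."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # skip displacements whose block falls outside the reference picture
            if not (0 <= by + dy and by + dy + bs <= h and 0 <= bx + dx and bx + dx + bs <= w):
                continue
            cost = sad(ref, cur, bx, by, dx, dy, bs)
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best

# Toy example: a bright 4x4 patch moves one pixel to the right between frames.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for y in range(2, 6):
    for x in range(2, 6):
        ref[y][x] = 100
        cur[y][x + 1] = 100   # the same patch, shifted right by 1 in the current frame

mv = best_motion_vector(ref, cur, bx=3, by=2)   # -> (-1, 0): the block came from one pixel to the left
```

The residual left after subtracting the matched block is what gets DCT-coded, exactly the prediction-error correction described above.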

Advantages of MPEG-2 Encoding:

1. It can provide compression of video content which cannot be done with MPEG-1
devices.
2. It provides encoding and decoding of audio contents of high quality. (Enhances audio
coding).
3. It can multiplex a variety of MPEG channels into one single transmission stream.

4. It can compress the interlaced video contents as well.
5. It provides other services including GUI, interaction, encryption and data
transmission.
6. It has a backward compatibility with MPEG-1.
7. MPEG-2 allows extension for specific applications e.g. Interactive TV, Encryption,
Program Guides etc.

Disadvantages:

1. MPEG-2 ties products to specific technology.


2. It reduces the product differentiation.
3. It does not always match user needs.

Features of MPEG-2:

Some of the important features of MPEG-2 are as follows:


1. MPEG-2 was developed and maintained jointly by the ITU-T and the ISO/IEC Moving
Picture Experts Group. Its video part is identical to the H.262 standard.
2. MPEG-2 has been developed as video compression standard.
3. MPEG-2 is similar to MPEG-1 but it also provides support for the interlaced video used
in NTSC, SECAM and PAL systems.
4. MPEG-2 has not been optimized for low bit rates (lower than 1 Mbps). It is backward
compatible to MPEG-1.
5. MPEG-2 supports a very wide range of applications from mobile to HD editing.
6. MPEG-2 has subsets called profiles and levels. An application is allowed to support only
the profile it requires rather than supporting the entire MPEG-2 standard.

Applications of MPEG-2:

Some of the important applications of MPEG-2 are as follows:

1. DVD-video.
2. MPEG-IMX which is a standard definition professional video recording format.
3. High definition video (HDV).
4. XDCAM-a tapeless video recording format.
5. HD-TV.
6. Blu-ray disc.
7. Broadcast TV.
8. Digital cable TV.
9. Satellite TV.
