Unit 4 Updated DIP
[Figure: data, information, and redundant data]
JPEG – best for photographs or designs with people, places or things in them
PNG – best for images with transparent backgrounds
GIF – best for animated GIFs, otherwise, use the JPG format
Vector vs. Raster
• Raster Image Files
• Raster images are constructed by a series of pixels, or individual blocks, to form
an image.
• JPEG, GIF, and PNG are all raster image extensions.
• Every photo we find online or in print is a raster image.
• Pixels have a defined proportion based on their resolution (high or low), and
when the pixels are stretched to fill space they were not originally intended to fit,
they become distorted, resulting in blurry or unclear images.
• Raster images cannot be enlarged without compromising their quality.
• As a result, it is important to save raster files at the exact
dimensions needed for the application.
• Vector Image Files
• Vector images are far more flexible.
• They are constructed using proportional formulas rather than pixels.
• EPS, AI and PDF are perfect for creating graphics that require frequent resizing.
• Your logo and brand graphics should have been created as a vector, and you should
always have a master file on hand.
• The real beauty of vectors lies in their ability to be sized as small as a postage
stamp, or large enough to fit on an 18-wheeler.
Types of Digital image files
• There are five main formats:
1. TIFF (also known as TIF), file types ending in .tif
• TIFF stands for Tagged Image File Format. TIFF images create very large file
sizes. TIFF images are usually stored uncompressed (or with lossless compression)
and thus contain a lot of detailed image data (which is why the files are so big).
• TIFFs are also extremely flexible in terms of color (they can be grayscale, or
CMYK for print, or RGB for web) and content (layers, image tags).
• TIFF is the most common file type used in photo software (such as Photoshop),
as well as page layout software (such as Quark and InDesign), again because a
TIFF contains a lot of image data.
2. JPEG (also known as JPG), file types ending in .jpg
• JPEG stands for Joint Photographic Experts Group, which created this standard
for this type of image formatting.
• JPEG files are images that have been compressed to store a lot of information in
a small-size file.
• Most digital cameras store photos in JPEG format, because then you can take
more photos on one camera card than you can with other formats.
• A JPEG is compressed in a way that loses some of the image detail during the
compression in order to make the file small (and thus called “lossy”
compression).
• JPEG files are usually used for photographs on the web, because they create a
small file that is easily loaded on a web page and also looks good.
• JPEG files are bad for line drawings or logos or graphics, as the compression
makes them look “bitmappy” (jagged lines instead of straight ones).
3. GIF, file types ending in .gif
• GIF stands for Graphics Interchange Format.
• This format compresses images but, unlike JPEG, the compression
is lossless (no detail is lost in the compression, but the file can't be made as
small as a JPEG).
• GIFs also have an extremely limited color range (at most 256 colors per image),
suitable for the web but not for printing.
• This format is rarely used for photography because of the limited number of
colors. GIFs can also be used for animations.
• Zonal coding: Transform coefficients with large variance are retained. Same set
of coefficients retained in all subimages.
JPEG Schematic
[Figure: JPEG codec block diagram]
JPEG Compression: Colour Transform → Down-sampling → Forward DCT → Quantization → Encoding → Compressed image data
JPEG Decompression: Compressed image data → Decoding → De-quantization → Inverse DCT → Up-sampling → Colour Transform
Fact about JPEG Compression
Algorithm
• Splitting: Split the image into 8 x 8 non-overlapping pixel blocks. If the
image cannot be divided into 8-by-8 blocks, then you can add in empty
pixels around the edges, essentially zero-padding the image.
• Colour Space Transform from [R,G,B] to [Y,Cb,Cr] & Subsampling.
• DCT: Take the Discrete Cosine Transform (DCT) of each 8-by-8 block.
• Quantization: quantize the DCT coefficients according to psycho-visually
tuned quantization tables.
• Serialization: by zigzag scanning pattern to exploit redundancy.
• Vectoring: Differential Pulse Code Modulation (DPCM) on DC
components
• Run Length Encoding (RLE) on AC components
• Entropy Coding:
– Run Length Coding
– Huffman Coding or Arithmetic Coding
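The colour space transform and subsampling step listed above can be sketched as follows. This is a minimal illustration, not the slides' own method: it assumes the JFIF (BT.601 full-range) conversion weights, and it subsamples chroma by averaging 2 x 2 neighbourhoods, which is only one common choice. The function names `rgb_to_ycbcr` and `subsample_2x2` are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr).

    Assumption: JFIF / BT.601 full-range weights; the slides do not state them.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return y, cb, cr


def subsample_2x2(plane):
    """Halve a chroma plane in both directions by averaging each 2x2 neighbourhood.

    `plane` is a list of rows; even height and width are assumed in this sketch.
    """
    return [
        [
            (plane[i][j] + plane[i][j + 1] + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
            for j in range(0, len(plane[0]), 2)
        ]
        for i in range(0, len(plane), 2)
    ]
```

Only the Cb and Cr planes would be subsampled; the Y (luminance) plane is kept at full resolution because the eye is more sensitive to brightness detail than to colour detail.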
Step I - Splitting
The input image is divided into smaller blocks having 8 x 8 dimensions,
summing up to 64 units in total. Each of these units is called a pixel, which is the
smallest unit of any image.
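A minimal sketch of the splitting step, assuming a single-component image stored as a list of rows. The helper name `split_into_blocks` is hypothetical. It zero-pads the right and bottom edges, as described in the algorithm outline, and returns the 8 x 8 blocks in raster order.

```python
def split_into_blocks(image, n=8):
    """Zero-pad a 2-D list of samples to multiples of n, then return its n x n blocks."""
    height, width = len(image), len(image[0])
    padded_h = -(-height // n) * n  # round up to the next multiple of n
    padded_w = -(-width // n) * n
    # Samples outside the original image are filled with zeros.
    padded = [
        [image[r][c] if r < height and c < width else 0 for c in range(padded_w)]
        for r in range(padded_h)
    ]
    # Walk the padded image block by block, left to right, top to bottom.
    return [
        [row[c:c + n] for row in padded[r:r + n]]
        for r in range(0, padded_h, n)
        for c in range(0, padded_w, n)
    ]
```

For example, a 10 x 12 image is padded to 16 x 16 and yields four blocks.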
[Figure: MCU (minimum coded unit) layouts. With sampling factor (1, 1, 1) the MCU is a single 8x8 pixel block, where 1 pixel = 3 components. With sampling factor (2, 1, 1) the MCU is 16x16 pixels, i.e. 4 blocks.]
• The DCT uses only the cosine function, so it involves no complex
numbers at all.
• DCT converts the information contained in a block(8x8) of pixels from
spatial domain to the frequency domain.
Why DCT?
• Human vision is insensitive to high frequency components, due to which it
is possible to treat the data corresponding to high frequencies as redundant.
To segregate the raw image data on the basis of frequency, it needs to be
converted into frequency domain, which is the primary function of DCT.
DCT
Formula
• 1-D DCT
–
• But the image matrix is a 2-D matrix. We can either apply the 1-D DCT to
the image matrix twice, once row-wise and then column-wise, to get the DCT
coefficients, or apply the standard 2-D DCT formula for JPEG
compression. If the input matrix is P(x,y) and the transformed matrix is
F(u,v) (also written G(u,v)), then the DCT for the 8 x 8 block is computed
using the expression:
• 2-D DCT –
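For reference, the commonly used orthonormal form of the DCT (DCT-II) is written out here. The symbols N, f(x) and C(·) are notation assumed for this sketch; P(x,y) and F(u,v) follow the text above. The 2-D expression is the 1-D transform with N = 8 applied along rows and then columns.

```latex
% 1-D DCT of a length-N signal f(x)
F(u) = \sqrt{\frac{2}{N}}\, C(u) \sum_{x=0}^{N-1} f(x)
       \cos\!\left[\frac{(2x+1)\,u\,\pi}{2N}\right],
\qquad
C(k) = \begin{cases} \dfrac{1}{\sqrt{2}}, & k = 0 \\[4pt] 1, & k > 0 \end{cases}

% 2-D DCT of an 8 x 8 block P(x,y)
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} P(x,y)
         \cos\!\left[\frac{(2x+1)\,u\,\pi}{16}\right]
         \cos\!\left[\frac{(2y+1)\,v\,\pi}{16}\right]
```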
DC and AC components
[Figure: 8x8 block P(x,y) → 2-D DCT → 8x8 block F(u,v)]
F(0,0) is called the DC component and the rest of F(u,v) are called AC
components.
• For u = v = 0 the arguments of both cosine terms are 0, so both cosines equal 1,
and hence the value in the location F[0,0] of the transformed matrix is simply a
function of the summation of all the values in the input matrix.
• This is proportional to the mean of all 64 values in the matrix and is known as
the DC coefficient.
• Since the values in all the other locations of the transformed matrix have a
frequency coefficient associated with them they are known as AC coefficients.
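The relationship between the DC coefficient and the block mean can be checked with a direct, unoptimized 2-D DCT. This is a sketch that assumes the orthonormal scaling (factor 1/4 with C(0) = 1/√2) commonly used for JPEG; the function name `dct_2d` is hypothetical, and real codecs use fast factorizations instead of the four nested loops shown here.

```python
import math

N = 8


def dct_2d(block):
    """Direct 2-D DCT-II of an 8x8 block, using the assumed orthonormal scaling."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out
```

For a constant block in which every pixel is 100, `dct_2d` gives F[0][0] = 800 (8 times the mean under this scaling) and every AC coefficient is zero up to rounding error, since a flat block contains no spatial variation.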
Step IV - Quantization
• Humans are unable to see aspects of an image that are at really high
frequencies. Since taking the DCT allows us to isolate where these high
frequencies are, we can take advantage of this in choosing which values to
preserve. By dividing each DCT coefficient by the corresponding entry of a
quantization table and rounding the result, we can zero out elements of the
matrix, so those values no longer need to be stored.
• The resultant quantized matrix mostly preserves values at the lowest
frequencies, up to a certain point.
• Why Quantization?
– To reduce the number of bits per sample.
• Two types:
– Uniform quantization
• q(u,v) is a constant
– Non-uniform quantization
• Custom quantization tables can be put in image/scan header.
• JPEG Standard defines two default quantization tables, one each for
luminance and chrominance.
Quantization
Standard formula:
$$\hat{F}(u,v) = \operatorname{round}\left(\frac{F(u,v)}{Q(u,v)}\right)$$
• The entries of Q(u,v) tend to have larger values towards the lower right corner. This
aims to introduce more loss at the higher spatial frequencies.
• The tables above show the default Q(u,v) values obtained from psychophysical studies
with the goal of maximizing the compression ratio while minimizing perceptual losses
in JPEG images.
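A minimal sketch of quantization and de-quantization, assuming the example luminance table from Annex K of the JPEG standard as Q(u,v). The names `LUMA_Q`, `quantize` and `dequantize` are hypothetical. Note that Python's `round` rounds exact halves to the nearest even integer, which may differ from a particular encoder's rounding rule.

```python
# Example luminance quantization table (JPEG standard, Annex K).
LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(F, Q=LUMA_Q):
    """Divide each DCT coefficient by its table entry and round to an integer."""
    return [[round(F[u][v] / Q[u][v]) for v in range(8)] for u in range(8)]


def dequantize(Fq, Q=LUMA_Q):
    """Approximate the original coefficients; the rounding loss is not recoverable."""
    return [[Fq[u][v] * Q[u][v] for v in range(8)] for u in range(8)]
```

With this table, a coefficient of 40 at position (7,7) is divided by 99 and rounds to 0, while a DC value of 800 is divided by 16 and survives as 50. This is how the larger entries towards the lower right corner discard high-frequency detail.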
Quantization - Example
DCT coefficients (input to the quantizer):
160 80 47 39 5 3 0 0
65 53 48 8 5 2 0 0
58 34 6 4 2 0 0 0
40 7 6 1 1 0 0 0
8 4 0 0 0 0 0 0
5 2 0 0 0 0 0 0
3 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Quantized coefficients (output of the quantizer):
16 7 4 2 0 0 0 0
5 4 3 0 0 0 0 0
4 2 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
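[Figure: DPCM on DC components. Each block is serialized to a 1x64 vector; the DC values 45, 34, …, 29 of successive blocks are replaced by the differences 7, -11, …, -5 from the previous block's DC value.]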
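The serialization and DPCM steps from the algorithm outline can be sketched as follows. The helper names are hypothetical, and the first predictor is assumed to be 0. The zigzag scan turns each 8 x 8 block into a 1x64 vector with low frequencies first; DPCM then replaces the DC value of each block by its difference from the previous block's DC value.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Sort by anti-diagonal; the direction alternates with the diagonal's parity.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def zigzag_scan(block):
    """Flatten a square block into a list following the zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def dpcm_encode(dc_values):
    """Replace each DC value by its difference from the previous one (first predictor 0)."""
    diffs, previous = [], 0
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dpcm_decode(diffs):
    """Invert dpcm_encode by accumulating the differences."""
    values, previous = [], 0
    for d in diffs:
        previous += d
        values.append(previous)
    return values
```

For instance, with a hypothetical leading DC value of 38, the sequence 38, 45, 34, 29 is coded as 38, 7, -11, -5. Neighbouring blocks tend to have similar DC values, so the differences are usually small and cheap to encode.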
Step VII - RLE on AC
components
• Run Length Encoding (RLE) is applied to the AC components.
• 1x63 vector (AC) has lots of zeros in it.
• Encoded as (skip, value) pairs, where skip is the number of zeros preceding
a non-zero value in the quantized matrix, and value is the actual coded
value of the non-zero component.
• (0,0) is sent as the end-of-block (EOB) sentinel value once only zeros remain;
the zeros after that are discarded. Within each run, only the number of zeros is
recorded.
Example: the sequence … 0 0 0 6 0 0 0 0 0 1 … is encoded as (3,6) (5,1).
RLE Examples
The 1x64 vector is reduced in terms of bits by grouping the elements as groups of
(repeat count,pixel value).
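A minimal sketch of the (skip, value) run-length scheme described for the AC components. The function names are hypothetical, and the sketch is simplified: it always appends the (0,0) end-of-block marker and does not cap the run length, whereas baseline JPEG limits a run to 15 zeros and uses a separate symbol for longer runs.

```python
def rle_encode_ac(ac):
    """Encode AC coefficients as (skip, value) pairs, ending with the (0, 0) EOB marker."""
    pairs, skip = [], 0
    for value in ac:
        if value == 0:
            skip += 1          # count zeros preceding the next non-zero value
        else:
            pairs.append((skip, value))
            skip = 0
    pairs.append((0, 0))       # end-of-block: any trailing zeros are dropped
    return pairs


def rle_decode_ac(pairs, length=63):
    """Rebuild the AC vector, refilling the trailing zeros after EOB."""
    ac = []
    for skip, value in pairs:
        if (skip, value) == (0, 0):
            break
        ac.extend([0] * skip)
        ac.append(value)
    ac.extend([0] * (length - len(ac)))
    return ac
```

Encoding a 63-element vector that starts 0 0 0 6 0 0 0 0 0 1 and is zero afterwards yields (3,6), (5,1), (0,0), so 63 values shrink to three pairs.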
Step VIII – Huffman Coding
• DC and AC components finally need to be represented by a smaller
number of bits
• Why Huffman Coding?
– Significant levels of compression can be obtained by replacing long strings of binary digits
by a string of much shorter code words.
• How?
– The length of each code word is a function of its relative frequency of occurrence.
• Normally a table of code words is used with the set of code words pre-
computed using the Huffman Coding Algorithm.
• In Huffman Coding, each DPCM-coded DC coefficient is represented by a
pair of symbols : (Size, Amplitude)
• where Size indicates number of bits needed to represent the coefficient and
Amplitude contains actual bits.
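The (Size, Amplitude) pair for a DPCM-coded DC difference can be sketched as below. This assumes the usual JPEG convention, which the slides do not spell out: Size is the bit length of the absolute value, positive values are written in plain binary, and negative values are written as the one's complement of the absolute value. The function name is hypothetical. Only the Size symbol would then be Huffman coded; the amplitude bits are appended as they are.

```python
def dc_size_and_amplitude(diff):
    """Return (size, amplitude_bits) for a DC difference.

    Assumed convention: negative values use the one's complement of |diff|
    in `size` bits; a zero difference has size 0 and no amplitude bits.
    """
    size = abs(diff).bit_length()
    if size == 0:
        return 0, ""
    amplitude = diff if diff > 0 else diff + (1 << size) - 1
    return size, format(amplitude, "0{}b".format(size))
```

For example, a difference of 7 gives (3, "111") and a difference of -11 gives (4, "0100"), since 11 is 1011 in binary and its one's complement is 0100.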
Huffman Coding: DC
Components
• DC Components are coded as (Size,Value). The look-up table for generating
code words for Value is as given below:-
• Works with colour and grayscale images, but not with binary images.
• Up to 24 bit colour images (Unlike GIF)
• Target photographic quality images (Unlike GIF)
• Suitable for many applications e.g., satellite, medical, general
photography, etc.