
Lecture 3

Text and Image Compression


Introduction

Compression: the process of coding that will effectively reduce the total number of bits needed to represent certain information.
What is Data Compression ?
Data compression requires the identification and extraction of source
redundancy.
In other words, data compression seeks to reduce the number of bits used to
store or transmit information.

There is a wide range of compression methods, which can be so unlike one another that they have little in common except that they compress data.

Data compression can be divided into two main types: lossless and lossy compression.

Lossless compression can recover the exact original data after compression.

It is used mainly for compressing database records, spreadsheets or word processing files, where exact replication of the original is essential.
Lossy compression will result in a certain loss of accuracy in exchange
for a substantial increase in compression.

Lossy compression is more effective when used to compress graphic images and digitised voice, where losses outside visual or aural perception can be tolerated.

Most lossy compression techniques can be adjusted to different quality levels, gaining higher accuracy in exchange for less effective compression.

The amount of compression that can be achieved by a given algorithm depends on both the amount of redundancy in the source and the efficiency of its extraction.
A Brief History of Data Compression
The late 1940s were the early years of Information Theory, when the idea of developing efficient new coding methods was just starting to be fleshed out. Ideas of entropy, information content and redundancy were explored. One popular notion held that if the probabilities of the symbols in a message were known, there ought to be a way to code the symbols so that the message would take up less space.
The first well-known method for compressing digital signals is now known as
Shannon-Fano coding. Shannon and Fano [~1948] simultaneously developed
this algorithm which assigns binary codewords to unique symbols that appear
within a given data file. While Shannon-Fano coding was a great leap forward,
it had the unfortunate luck to be quickly superseded by an even more efficient
coding system : Huffman Coding.
Huffman coding [1952] shares most characteristics of Shannon-Fano coding. Huffman coding achieves effective data compression by reducing the amount of redundancy in the coding of symbols. It has been proven to be optimal among codes that assign a fixed, integer-length codeword to each individual symbol.

In the last fifteen years, Huffman coding has been largely replaced by arithmetic coding. Arithmetic coding bypasses the idea of replacing an input symbol with a specific code. It replaces a stream of input symbols with a single floating-point output number. More bits are needed in the output number for longer, more complex messages.
Dictionary-based compression algorithms use a completely different
method to compress data. They encode variable-length strings of symbols
as single tokens. The token forms an index to a phrase dictionary. If the
tokens are smaller than the phrases, they replace the phrases and
compression occurs.

Two dictionary-based compression techniques called LZ77 and LZ78 have been developed. LZ77 is a "sliding window" technique in which the dictionary consists of the phrases found in a fixed-length "window" into the previously seen text. LZ78 takes a completely different approach to building a dictionary. Instead of using phrases from a window into the text, LZ78 builds phrases up one symbol at a time, adding a new symbol to an existing phrase when a match occurs.
•Run-length Encoding, or RLE, is a technique used to reduce the size of a repeating string of characters.

•This repeating string is called a run. Typically RLE encodes a run of symbols into two bytes: a count and a symbol.

•RLE can compress any type of data regardless of its information content, but the content of the data to be compressed affects the compression ratio.

•RLE cannot achieve high compression ratios compared to other compression methods, but it is easy to implement and is quick to execute.

•Run-length encoding is supported by most bitmap file formats such as TIFF, BMP and PCX.

•Compression is normally measured with the compression ratio :

• Compression Ratio = original size / compressed size : 1

•Consider a character run of 15 'A' characters, which normally would require 15 bytes to store:

• AAAAAAAAAAAAAAA
• 15A

•With RLE, this would only require two bytes to store: the count (15) is stored as the first byte and the symbol (A) as the second byte.

•Consider another example, a 16-character string:

•000ppppppXXXXaaa
•This string of characters can be compressed to form

• 3(0),6(p),4(X),3(a)
•Hence, the 16-byte string would only require 8 bytes of data to represent the string. In this case, RLE yields a compression ratio of 2:1.
•In run-length encoding, a repetitive source such as a string of numbers can be represented in a compressed form, for example,

• 1,4,5,1,4,5,1,4,5
•can be compressed to form

• 3(1,4,5)
•This gives a compression ratio of 9/4 : 1, i.e. just over 2 : 1.
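To make the byte-pair scheme above concrete, here is a minimal Python sketch of an RLE encoder and decoder (the function names rle_encode and rle_decode are illustrative, not from any standard library):

def rle_encode(data):
    # Encode a string as a list of (count, symbol) pairs.
    pairs = []
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i]:
            run += 1
        pairs.append((run, data[i]))
        i += run
    return pairs

def rle_decode(pairs):
    # Expand (count, symbol) pairs back into the original string.
    return "".join(symbol * count for count, symbol in pairs)

print(rle_encode("000ppppppXXXXaaa"))             # [(3, '0'), (6, 'p'), (4, 'X'), (3, 'a')]
print(rle_decode(rle_encode("AAAAAAAAAAAAAAA")))  # AAAAAAAAAAAAAAA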
Compression Principles

• By compression the volume of information to be transmitted can be reduced; at the same time a reduced bandwidth can be used
• The application of the compression algorithm is the main function carried out by the encoder, and the decompression algorithm is carried out by the destination decoder
Compression Principles

• Compression algorithms can be classified as being either lossless (the amount of source information to be transmitted is reduced with no loss of information) – e.g. transfer of a text file over the network – or
• lossy (a version is reproduced that is perceived by the recipient as a true copy) – e.g. digitized images, audio and video streams
Entropy Encoding - Run-length encoding -
Lossless
• Run-length encoding is applicable when the source information comprises long substrings of the same character or binary digit
• In this case the source string is transmitted as a set of codewords which indicate not only the character but also the number of characters/bits in the substring
• Provided the destination knows the set of codewords being used, it simply interprets each codeword received and outputs the appropriate number of characters/bits
e.g. output from a scanner in a Fax Machine
000000011111111110000011 will be represented as
0,7 1,10 0,5 1,2
Entropy Encoding –statistical encoding
• A set of ASCII codewords are often used for
the transmission of strings of characters
• However, the symbols, and hence the codewords, in the source information do not occur with the same frequency. E.g. A may occur more frequently than P, which may occur more frequently than Q
• Statistical coding exploits this property by using a set of variable-length codewords – the shortest being the one representing the most frequently occurring symbol
Differential encoding
• Uses smaller codewords to represent the difference signals. Can be lossy or lossless
• This type of coding is used where the amplitude of a signal covers a large range but the difference between successive values is small
• Instead of using large codewords, a set of smaller codewords representing only the difference in amplitude is used
• For example, if the digitization of the analog signal requires 12 bits and the difference signal only requires 3 bits, then there is a saving of 75% on transmission bandwidth
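As a hedged illustration of the idea (the sample values below are invented for the example, not taken from the lecture), the following sketch delta-encodes a slowly varying sequence of 12-bit samples into small differences that would each fit in a 3-bit codeword:

def differential_encode(samples):
    # The first value is sent in full; the rest are differences from the previous sample.
    diffs = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        diffs.append(cur - prev)
    return diffs

def differential_decode(diffs):
    # Rebuild the original samples by accumulating the differences.
    samples = [diffs[0]]
    for d in diffs[1:]:
        samples.append(samples[-1] + d)
    return samples

signal = [2048, 2050, 2049, 2047, 2046, 2048]        # 12-bit samples with small changes
print(differential_encode(signal))                   # [2048, 2, -1, -2, -1, 2]
assert differential_decode(differential_encode(signal)) == signal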
Compression Principles

• Transform encoding involves transforming the source information from one form into another, the other form lending itself more readily to the application of compression
Transform Encoding
• As we scan across a set of pixel locations the rate of change
in magnitude will vary from zero if all the pixel values remain
the same to a low rate of change if say one half is different
from the next half, through to a high rate of change if each
pixel changes magnitude from one location to the next
• The rate of change in magnitude as one traverses the matrix
gives rise to a term known as the ‘spatial frequency’
• Hence by identifying and eliminating the higher frequency
components the volume of the information transmitted can be
reduced
Transform coding: DCT transform principles

• The Discrete Cosine Transform (DCT) is used to transform a two-dimensional matrix of pixel values into an equivalent matrix of spatial frequency components (coefficients)
• At this point any frequency components with
amplitudes below the threshold values can be dropped
(lossy)
The Shannon-Fano Algorithm

This is a basic information theoretic algorithm.

A simple example will be used to illustrate the algorithm:

Symbol A B C D E
----------------------------------
Count 15 7 6 6 5

Encoding for the Shannon-Fano Algorithm: a top-down approach

1. Sort symbols according to their frequencies/probabilities, e.g., ABCDE.

2. Recursively divide into two parts, each with approximately the same number of counts
Symbol Count log(1/p) Code Subtotal (# of bits)
------ ----- -------- --------- --------------------
A 15 1.38 00 30
B 7 2.48 01 14
C 6 2.70 10 12
D 6 2.70 110 18
E 5 2.96 111 15
TOTAL (# of bits): 89
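A small recursive Python sketch of this top-down split (the helper name shannon_fano is illustrative); for the counts in the table above it reproduces codewords of the same lengths:

def shannon_fano(symbols):
    # symbols: list of (symbol, count) pairs sorted by descending count.
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(count for _, count in symbols)
    # Choose the split point that makes the two halves' counts as equal as possible.
    best_diff, split = None, 1
    for i in range(1, len(symbols)):
        left = sum(count for _, count in symbols[:i])
        diff = abs(total - 2 * left)
        if best_diff is None or diff < best_diff:
            best_diff, split = diff, i
    codes = {sym: "0" + code for sym, code in shannon_fano(symbols[:split]).items()}
    codes.update({sym: "1" + code for sym, code in shannon_fano(symbols[split:]).items()})
    return codes

print(shannon_fano([("A", 15), ("B", 7), ("C", 6), ("D", 6), ("E", 5)]))
# {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}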
Huffman Compression

Huffman compression reduces the average code length used to represent the symbols of an alphabet.

Symbols of the source alphabet which occur frequently are assigned short codewords.

The general strategy is to allow the code length to vary from character to character and to ensure that the frequently occurring characters have shorter codes.
•Huffman compression is performed by constructing a binary tree; a simple example set is used below to illustrate this.

•This is done by arranging the symbols of the alphabet in descending order of probability.

•Then the two lowest probabilities are repeatedly added together and the list is re-sorted.

•This process goes on until the sum of the probabilities of the last two symbols is 1.

•Once this process is complete, a Huffman binary tree can be generated.

•If we do not obtain a probability of 1 from the last two symbols, most likely there is a mistake in the process.

•This probability of 1, which forms the last symbol, is the root of the binary tree.

•The resultant codewords are then formed by tracing the tree path from the root node to the end nodes after assigning 0s and 1s to the branches.
A step-by-step worked example in constructing a Huffman binary tree is shown below:

Given a set of symbols with a list of relative probabilities of occurrence within a message:

m0 m1 m2 m3 m4
0.10 0.36 0.15 0.2 0.19

(1) List symbols in the order of decreasing probability.

m1 m3 m4 m2 m0
0.36 0.20 0.19 0.15 0.10
(2) Get two symbols with lowest probability. Give the combined symbol a
new name.

m2 (0.15) + m0 (0.10)  combine to form  A (0.25)

(3) The new list obtained is shown below. Repeating the previous step will give us a
new symbol for the next two lowest probabilities.

m1 (0.36)  A (0.25)  m3 (0.20)  m4 (0.19)

m3 (0.20) + m4 (0.19)  combine to form  B (0.39)
(4) A new list is obtained. Repeating the previous step will give us a new symbol for
the following two lowest probabilities.

B (0.39)  m1 (0.36)  A (0.25)

m1 (0.36) + A (0.25)  combine to form  C (0.61)

(5) Finally there is only one pair left; we simply combine them and name the result as a new symbol.

B (0.39) + C (0.61)  combine to form  D (1.0)
(6) Having finished these steps, we have the complete set of combined symbols.
(7) Now a Huffman tree can be constructed; 0's and 1's are assigned to the branches.
(8) The resultant codewords are formed by tracing the tree path from the root node to the codeword leaf.

Symbols Probabilities Codewords


m0 0.10 011
m1 0.36 00
m2 0.15 010
m3 0.20 10
m4 0.19 11
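The same bottom-up construction can be expressed compactly in Python using the standard heapq module; this is a sketch, and the exact 0/1 assignment (and hence the bit patterns) may differ from the table above, although the codeword lengths are the same:

import heapq
import itertools

def huffman_codes(probabilities):
    # probabilities: dict mapping symbol -> probability.
    tie = itertools.count()          # tie-breaker so heap entries always compare
    heap = [(p, next(tie), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)      # the two lowest probabilities
        p2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

print(huffman_codes({"m0": 0.10, "m1": 0.36, "m2": 0.15, "m3": 0.20, "m4": 0.19}))
# Codeword lengths match the table above: m1, m3 and m4 get 2 bits; m0 and m2 get 3 bits.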
Another Example: Adaptive Huffman Coding
This is to illustrate the implementation details more clearly. We show exactly what bits are sent, as opposed to simply stating how the tree is updated.
An additional rule: if any character/symbol is to be sent for the first time, it must be preceded by a special symbol, NEW.

The initial code for NEW is 0. The count for NEW is always kept at 0 (the count is never increased); hence it is always denoted as NEW:(0) in Fig. 7.7.
Table 7.3: Initial code assignment for AADCCDD using adaptive Huffman coding.

Initial Code
NEW: 0
A: 00001
B: 00010
C: 00011
D: 00100
..
..
..
Text Compression – Flow chart of a suitable decoding algorithm

• Decoding of the received bitstream, assuming the codewords have been derived (decoding algorithm shown as a flow chart)
Text Compression – Example

• The algorithm assumes a table of codewords is available at the receiver and this also holds the corresponding ASCII codeword
Text Compression – Lempel-Ziv coding
• The LZ algorithm uses strings of characters instead
of single characters
• For example, for text transfer, a table containing all possible character strings is present in both the encoder and the decoder
• As each word appears instead of sending the ASCII
code, the encoder sends only the index of the word in
the table
• This index value will be used by the decoder to
reconstruct the text into its original form. This
algorithm is also known as a dictionary-based
compression
Text Compression – LZW Compression

• The principle of the Lempel-Ziv-Welch coding algorithm is for the encoder and decoder to build the contents of the dictionary dynamically as the text is being transferred
• Initially the decoder has only the character set – e.g. ASCII. The remaining entries in the dictionary are built dynamically by the encoder and decoder
Text Compression – LZW coding
• Initially the encoder sends the index of the four characters T, H, I, S and then sends the space character, which will be detected as a non-alphanumeric character
• It therefore transmits the character using its index as before but in addition interprets it as terminating the first word, and this word is stored in the next free location in the dictionary
• A similar procedure is followed by both the encoder and decoder
• In applications with an initial character set of 128 characters, the dictionary will start with 8-bit indices and 256 entries: 128 for the individual characters and the remaining 128 for words
Text Compression – LZW Compression
Algorithm

• A key issue in determining the level of compression that is achieved is the number of entries in the dictionary, since this determines the number of bits that are required for the index
DICTIONARY BASED CODING
Lempel-Ziv-Welch (LZW) ALGORITHM is an adaptive dictionary-based compression technique

Uses fixed-length codewords

The LZW encoder and decoder build the same dictionary

LZW proceeds by placing longer and longer repeated entries into the dictionary

If the element is already in the dictionary then it emits the code rather than the string
LZW compression works best for files containing lots of repetitive data.

This is often the case with text and monochrome images.

Files that are compressed but that do not contain any repetitive information at
all can even grow bigger!

LZW compression is fast.

LZW compression can be used in a variety of file formats:
TIFF files
GIF files
•Lempel-Ziv-Welch (LZW) Algorithm
•The LZW algorithm is a very common compression technique.

•Problems (with a fixed, pre-agreed dictionary):
•Too many bits,
•everyone needs a dictionary,
•only works for English text.

•Solution: Find a way to build the dictionary adaptively.


•Original methods due to Ziv and Lempel in 1977 and 1978. Terry Welch
improved the scheme in 1984 (called LZW compression).

•It is used in UNIX compress -- 1D token stream

•It is used in GIF compression -- 2D window tokens


The LZW Compression Algorithm can be summarised as follows:

w = NIL;
while ( read a character k )
{
    if wk exists in the dictionary
        w = wk;
    else
    {
        add wk to the dictionary;
        output the code for w;
        w = k;
    }
}
output the code for w;
Example:
Input string is "^WED^WE^WEE^WEB^WET".

w k output index symbol


-----------------------------------------
NIL ^
^ W ^ 256 ^W
W E W 257 WE
E D E 258 ED
D ^ D 259 D^
^ W
^W E 256 260 ^WE
E ^ E 261 E^
^ W
^W E
^WE E 260 262 ^WEE
E ^
E^ W 261 263 E^W
W E
WE B 257 264 WEB
B ^ B 265 B^
^ W
^W E
^WE T 260 266 ^WET
T EOF T
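For reference, here is a runnable Python sketch of the same compression loop; for simplicity it emits single characters as themselves and assigns codes from 256 upwards to new phrases, matching the trace above:

def lzw_compress(text):
    dictionary = {}            # phrase -> code, for phrases longer than one character
    next_code = 256
    output = []
    w = ""
    for k in text:
        wk = w + k
        if len(wk) == 1 or wk in dictionary:
            w = wk             # keep extending the current phrase
        else:
            dictionary[wk] = next_code      # new phrase enters the dictionary
            next_code += 1
            output.append(dictionary[w] if len(w) > 1 else w)
            w = k
    if w:
        output.append(dictionary[w] if len(w) > 1 else w)
    return output

print(lzw_compress("^WED^WE^WEE^WEB^WET"))
# ['^', 'W', 'E', 'D', 256, 'E', 260, 261, 257, 'B', 260, 'T']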
The LZW Decompression Algorithm is as follows:

read a character k;
output k;
w = k;
while ( read a character k )    /* k could be a character or a code */
{
    entry = dictionary entry for k;
    output entry;
    add w + entry[0] to the dictionary;
    w = entry;
}
Input string is "^WED<256>E<260><261><257>B<260>T".

w k output index symbol

-------------------------------------------
^ ^
^ W W 256 ^W
W E E 257 WE
E D D 258 ED
D <256> ^W 259 D^
<256> E E 260 ^WE
E <260> ^WE 261 E^
<260> <261> E^ 262 ^WEE
<261> <257> WE 263 E^W
<257> B B 264 WEB
B <260> ^WE 265 B^
<260> T T 266 ^WET
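A matching Python decompression sketch; it also handles the case where a received code is not yet in the dictionary (which does not arise in this example but can occur in general):

def lzw_decompress(codes):
    # codes: a mixture of single characters and integer phrase codes, as produced above.
    dictionary = {}                    # code -> phrase
    next_code = 256
    w = codes[0]
    output = [w]
    for k in codes[1:]:
        if isinstance(k, str):
            entry = k
        elif k in dictionary:
            entry = dictionary[k]
        else:
            entry = w + w[0]           # special case: code not yet in the dictionary
        output.append(entry)
        dictionary[next_code] = w + entry[0]
        next_code += 1
        w = entry
    return "".join(output)

print(lzw_decompress(['^', 'W', 'E', 'D', 256, 'E', 260, 261, 257, 'B', 260, 'T']))
# ^WED^WE^WEE^WEB^WET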
Image Compression – GIF compression
Principles

• The Graphics Interchange Format (GIF) is used extensively with the Internet for the representation and compression of graphical images
Image Compression – GIF
• Although colour images comprising 24-bit pixels are supported, GIF reduces the number of possible colours by choosing the 256 entries from the original set of 2^24 colours that most closely match the original image
• Hence instead of sending each pixel as a 24-bit colour value, only the 8-bit index of the table entry that contains the closest match to the original is sent. This results in a 3:1 compression ratio
• The contents of the table are sent in addition to the screen size and aspect ratio information
•The image can also be transferred over the network
using the interlaced mode
Image Compression – GIF Compression –
Dynamic mode using LZW coding

• The LZW algorithm can be used to obtain further levels of compression
Image Compression – GIF interlaced mode

[Figure: the first 1/8 and the next 1/8 of the total compressed image]

• GIF also allows an image to be stored and subsequently transferred over the network in an interlaced mode; this is useful over either low bit rate channels or the Internet, which provides a variable transmission rate
Image Compression – GIF interlaced mode

[Figure: a further 1/4 and the remaining 1/2 of the image]

• The compressed image data is organized so that the decompressed image is built up in a progressive way as the data arrives
Digitized Documents
• Since FAX machines are used with public carrier networks, the ITU-T has produced standards relating to them
• These are T2 (Group 1), T3 (Group 2), T4 (Group 3, for the PSTN), and T6 (Group 4, for the ISDN)
• Both T4 and T6 use data compression, giving compression ratios in the range of 10:1
• The resulting codewords are grouped into a termination-codes table (white or black run-lengths from 0 to 63 pels in steps of 1) and a make-up codes table (run-lengths in multiples of 64 pels)
• Since this scheme uses two sets of codewords it is known as modified Huffman coding
Digitized Documents – ITU-T Group 3 and 4 facsimile conversion codes: termination codes

[Table: termination-codes for white and black run-lengths from 0 to 63 pels]
Digitized Documents – ITU-T Group 3 and 4 facsimile conversion codes: make-up codes

[Table: make-up codes for run-lengths in multiples of 64 pels]
• Each scanned line is terminated with an EOL code. In this way, if the receiver fails to decode a codeword it starts to search for an EOL pattern
• If it fails to decode an EOL after a preset number of lines it aborts the reception process and informs the sending machine
• A single EOL precedes the codewords of each scanned line and six consecutive EOLs indicate the end of each page
• The T4 coding is known as one-dimensional coding
MMR coding (two-dimensional coding)
• Modified-modified relative element address designate (MMR) coding exploits the fact that most scanned lines differ from the previous line by only a few pels
• E.g. if a line contains a black run, then the next line will normally contain the same run plus or minus up to 3 pels
• In MMR the run-lengths associated with a line are identified by comparing the contents of the line, known as the coding line (CL), with those of the immediately preceding line, known as the reference line (RL)
• The run-lengths associated with a coding line are classified into three groups (modes) relative to the reference line
Image Compression – run-length possibilities: pass mode and vertical mode

Pass mode

• This is the case when the run-length in the reference line (b1b2) is to the left of the next run-length in the coding line (a1a2), that is, b2 is to the left of a1
Vertical mode

• This is the case when the run-length in the reference line (b1b2) overlaps the next run-length in the coding line (a1a2) by a maximum of plus or minus 3 pels
Image Compression – run-length possibilities: horizontal mode

• This is the case when the run-length in the reference line (b1b2) overlaps the run-length (a1a2) by more than plus or minus 3 pels
Image Compression – JPEG encoder
schematic

• The Joint Photographic Experts Group (JPEG) standard forms the basis of most video compression algorithms
Image Compression – Image/block
preparation
• The source image is made up of one or more 2-D matrices of values
• A 2-D matrix is required to store the set of 8-bit grey-level values that represent the image
• For a colour image, if a CLUT is used then a single matrix of values is required
• If the image is represented in R, G, B format then three matrices are required
• If the Y, Cr, Cb format is used then the matrix size for the two chrominance components is smaller than that of the Y matrix (a reduced representation)
Image Compression – Image/block
preparation

• Once the image format is selected, the values in each matrix are compressed separately using the DCT
• In order to make the transformation more efficient, a second step known as block preparation is carried out before the DCT
• In block preparation each global matrix is divided into a set of smaller 8x8 submatrices (blocks) which are fed sequentially to the DCT
Image Compression – Image Preparation

• Once the source image format has been selected and prepared (four alternative forms of representation), the set of values in each matrix is compressed separately using the DCT
Image Compression – Image Preparation

• Block preparation is necessary since computing the transformed value for each position in a matrix requires the values in all the locations to be processed
Image Compression – Forward DCT
• Each pixel value is quantized using 8 bits which produces
a value in the range 0 to 255 for the R, G, B or Y and a
value in the range –128 to 127 for the two chrominance
values Cb and Cr
• If the input matrix is P[x,y] and the transformed matrix is
F[i,j] then the DCT for the 8X8 block is computed using
the expression:
F[i,j] = \frac{1}{4} C(i)\,C(j) \sum_{x=0}^{7} \sum_{y=0}^{7} P[x,y] \cos\frac{(2x+1)i\pi}{16} \cos\frac{(2y+1)j\pi}{16}

where C(i), C(j) = 1/\sqrt{2} for i, j = 0 and C(i), C(j) = 1 otherwise
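A direct, unoptimised Python/NumPy sketch that evaluates this expression for an 8x8 block p (real codecs use fast DCT algorithms; this is only to show the formula):

import numpy as np

def forward_dct_8x8(p):
    # Direct O(N^4) evaluation of the 8x8 forward DCT expression above.
    f = np.zeros((8, 8))
    for i in range(8):
        for j in range(8):
            c_i = 1 / np.sqrt(2) if i == 0 else 1.0
            c_j = 1 / np.sqrt(2) if j == 0 else 1.0
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += p[x, y] * np.cos((2 * x + 1) * i * np.pi / 16) \
                                 * np.cos((2 * y + 1) * j * np.pi / 16)
            f[i, j] = 0.25 * c_i * c_j * s
    return f

block = np.full((8, 8), 128.0)                 # a uniform block of pixel values
print(np.round(forward_dct_8x8(block))[0, 0])  # 1024.0: only the DC coefficient is non-zero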
Image Compression – Forward DCT
• All 64 values in the input matrix P[x,y] contribute to each entry in the transformed matrix F[i,j]
• For i = j = 0 the two cosine terms are both equal to 1 (cos 0 = 1) and hence the value in location F[0,0] of the transformed matrix is simply a function of the summation of all the values in the input matrix
• It is proportional to the mean of all 64 values in the matrix and is known as the DC coefficient
• Since the values in all the other locations of the transformed matrix have a frequency coefficient associated with them they are known as AC coefficients
Image Compression – Forward DCT
• for j = 0 only the horizontal frequency coefficients are
present
• for i = 0 only the vertical frequency components are
present
• For all the other locations both the horizontal and vertical
frequency coefficients are present
Image Compression – Image Preparation

• The values are first centred around zero by subtracting 128 from each intensity/luminance value
Image Compression – Quantization
• Using DCT there is very little loss of information during the
DCT phase
• The losses are due to the use of fixed point arithmetic
• The main source of information loss occurs during the
quantization and entropy encoding stages where the
compression takes place
• The human eye responds primarily to the DC coefficient and
the lower frequency coefficients (The higher frequency
coefficients below a certain threshold will not be detected by
the human eye)
• This property is exploited by dropping the spatial frequency
coefficients in the transformed matrix (dropped coefficients
cannot be retrieved during decoding)
Image Compression – Quantization
• In addition to classifying the spatial frequency
components the quantization process aims to reduce the size
of the DC and AC coefficients so that less bandwidth is
required for their transmission (by using a divisor)
• The sensitivity of the eye varies with spatial frequency
and hence the amplitude threshold below which the eye will
detect a particular frequency also varies
• The threshold values vary for each of the 64 DCT
coefficients and these are held in a 2-D matrix known as the
quantization table with the threshold value to be used with
a particular DCT coefficient in the corresponding position
in the matrix
Image Compression – Quantization
• The choice of threshold value is a compromise between
the level of compression that is required and the resulting
amount of information loss that is acceptable
• JPEG standard has two quantization tables for the
luminance and the chrominance coefficients. However,
customized tables are allowed and can be sent with the
compressed image
Image Compression – Example
computation of a set of quantized DCT
coefficients
Image Compression – Quantization
• From the quantization table and the DCT and quantized coefficients, a number of observations can be made:
- The computation of the quantized coefficients involves rounding the quotients to the nearest integer value
- The threshold values used increase in magnitude with increasing spatial frequency
- The DC coefficient in the transformed matrix is the largest
- Many of the higher frequency coefficients are zero
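A hedged sketch of this step, assuming an 8x8 coefficient matrix f (such as the output of the DCT sketch earlier) and a quantization table q of threshold values; the flat table used here is purely illustrative, since real JPEG tables vary with spatial frequency:

import numpy as np

def quantize(f, q):
    # Divide each DCT coefficient by its threshold and round to the nearest integer.
    return np.rint(f / q).astype(int)

def dequantize(fq, q):
    # Decoder side: multiply back; the rounding error is the (irreversible) loss.
    return fq * q

q = np.full((8, 8), 16)                  # illustrative flat quantization table
f = np.zeros((8, 8)); f[0, 0] = 1024.0   # only a DC coefficient, as for a uniform block
print(quantize(f, q)[0, 0])              # 64: the DC coefficient 1024 divided by 16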
Image Compression – Entropy Encoding
• Entropy encoding consists of four stages
Vectoring – The entropy encoding operates on a one-dimensional string of values (vector). However, the output of the quantization stage is a 2-D matrix and hence this has to be represented in a 1-D form. This is known as vectoring
Differential encoding – In this stage only the difference in magnitude of the DC coefficient in a quantized block relative to the value in the preceding block is encoded. This reduces the number of bits required to encode the relatively large DC magnitudes
The difference values are then encoded in the form (SSS, value), where SSS indicates the number of bits needed and value holds the actual bits that represent the value
e.g.: if the sequence of DC coefficients in consecutive quantized blocks was 12, 13, 11, 11, 10, ... the difference values would be 12, 1, -2, 0, -1
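A small sketch of these two sub-steps for the example sequence; the SSS category here is computed as the bit length of the magnitude, and the (SSS, value) pairs simply show the difference value rather than its bit pattern, which is a simplification:

def dc_differences(dc_coefficients):
    # The first DC value is encoded relative to zero; the rest relative to the previous block.
    prev, diffs = 0, []
    for dc in dc_coefficients:
        diffs.append(dc - prev)
        prev = dc
    return diffs

def sss(value):
    # Number of bits needed to represent the magnitude of the difference.
    return abs(value).bit_length()

diffs = dc_differences([12, 13, 11, 11, 10])
print(diffs)                              # [12, 1, -2, 0, -1]
print([(sss(d), d) for d in diffs])       # [(4, 12), (1, 1), (2, -2), (0, 0), (1, -1)]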
Image Compression – Vectoring using Zig-Zag scan

• In order to exploit the presence of the large number of zeros in the quantized matrix, a zig-zag scan of the matrix is used
Image Compression – run length encoding
• The remaining 63 values in the vector are the AC coefficients
• Because of the large number of 0's among the AC coefficients, they are encoded as a string of pairs of values
• Each pair is made up of (skip, value), where skip is the number of zeros in the run and value is the next non-zero coefficient

• For example, a vector that begins 6, 7, 3, 3, 3, 2, 2, 2, 2 and is followed only by zeros would be encoded as
(0,6) (0,7) (0,3) (0,3) (0,3) (0,2) (0,2) (0,2) (0,2) (0,0)
• The final pair (0,0) indicates the end of the string for this block
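A hedged sketch of this pairing step, assuming the 63 AC coefficients are already in zig-zag order in a Python list:

def runlength_pairs(ac_coefficients):
    # Encode zig-zag ordered AC coefficients as (skip, value) pairs, where skip counts
    # the zeros preceding each non-zero value; (0, 0) marks the end of the block.
    pairs, skip = [], 0
    for value in ac_coefficients:
        if value == 0:
            skip += 1
        else:
            pairs.append((skip, value))
            skip = 0
    pairs.append((0, 0))
    return pairs

ac = [6, 7, 3, 3, 3, 2, 2, 2, 2] + [0] * 54
print(runlength_pairs(ac))
# [(0, 6), (0, 7), (0, 3), (0, 3), (0, 3), (0, 2), (0, 2), (0, 2), (0, 2), (0, 0)]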
Image Compression – Huffman encoding
• Significant levels of compression can be obtained by
replacing long strings of binary digits by a string of much
shorter codewords
• The length of each codeword is a function of its relative
frequency of occurrence
• Normally, a table of codewords is used with the set of
codewords precomputed using the Huffman coding
algorithm
Image Compression – Frame Building
• In order for the remote computer to interpret all the
different fields and tables that make up the bitstream it is
necessary to delimit each field and set of table values in a
defined way
• The JPEG standard includes a definition of the structure of
the total bitstream relating to a particular image/picture.
This is known as a frame
• The role of the frame builder is to encapsulate all the
information relating to an encoded image/picture
Image Compression – JPEG encoder
Image Compression – Frame Building
• At the top level the complete frame-plus-header is
encapsulated between a start-of-frame and an end-of-frame
delimiter which allows the receiver to determine the start
and end of all the information relating to a complete image
• The frame header contains a number of fields
- the overall width and height of the image in pixels
- the number and type of components (CLUT, R/G/B,
Y/Cb/Cr)
- the digitization format used (4:2:2, 4:2:0 etc.)
Image Compression – Frame Building
• At the next level a frame consists of a number of
components each of which is known as a scan
The level two header contains fields that include:
- the identity of the components
- the number of bits used to digitize each component
- the quantization table of values that have been used to
encode each component
• Each scan comprises one or more segments each of which
can contain a group of (8X8) blocks preceded by a header
• This contains the set of Huffman codewords for each
block
Image Compression – JPEG decoder

• A JPEG decoder is made up of a number of stages which are simply the corresponding decoder sections of those used in the encoder
JPEG decoding
• The JPEG decoder is made up of a number of stages
which are the corresponding decoder sections of those used
in the encoder
• The frame decoder first identifies the encoded bitstream
and its associated control information and tables within the
various headers
• It then loads the contents of each table into the related
table and passes the control information to the image
builder
• Then the Huffman decoder carries out the decompression
operation using preloaded or the default tables of
codewords
JPEG decoding
• The two decompressed streams containing the DC and AC
coefficients of each block are then passed to the differential
and run-length decoders
• The resulting matrix of values is then dequantized using
either the default or the preloaded values in the quantization
table
• Each resulting block of 8x8 spatial frequency coefficients is passed in turn to the inverse DCT, which transforms the coefficients back into their spatial form
• The image builder then reconstructs the image from these
blocks using the control information passed to it by the
frame decoder
JPEG Summary
• Although complex, using JPEG compression ratios of 20:1 can be obtained while still retaining a good quality image
• This level (20:1) applies to images with few colour transitions
• For more complicated images compression ratios of 10:1 are more common
• Like GIF images it is possible to encode and rebuild the
image in a progressive manner. This can be achieved by
two different modes – progressive mode and hierarchical
mode
JPEG Summary
• Progressive mode – first the DC and low-frequency coefficients of each block are sent and then the high-frequency coefficients
• Hierarchical mode – in this mode, the total image is first sent at a low resolution – e.g. 320 x 240 – and then at a higher resolution, 640 x 480
Ex 3.1
Ex 3.2
Ex 3.3
Ex 3.4
Ex 3.6
Ex 3.7
Ex 3.9
