
DIGITAL IMAGE PROCESSING
Chapter 8 – Image Compression

Instructor:
Dr J. Shanbehzadeh
[email protected]
R. C. Gonzalez, and R. E. Woods, Digital Image Processing, New Jersey: Prentice Hall, 3rd edition, 2008.

Table of Contents

8.1 Fundamentals
8.2 Some Basic Compression Methods
8.3 Digital Image Watermarking
8.1 Fundamentals

8.1.1 Coding Redundancy
8.1.2 Spatial and Temporal Redundancy
8.1.3 Irrelevant Information
8.1.4 Measuring Image Information
8.1.5 Fidelity Criteria
8.1.6 Image Compression Models
8.1.7 Image Formats, Containers, and Compression Standards
Image Compression

Preview

Image compression, the art and science of reducing the amount of data required to represent an image, is one of the most useful and commercially successful technologies in the field of digital image processing. The number of images that are compressed and decompressed daily is staggering, and the compressions and decompressions themselves are virtually invisible to the user. Anyone who owns a digital camera, surfs the web, or watches the latest Hollywood movies on digital video disks (DVDs) benefits from the algorithms and standards discussed in this chapter.

To better understand the need for compact image representations, consider the amount of data required to represent a two-hour standard definition (SD) television movie using 720 × 480 × 24 bit pixel arrays. A digital movie (or video) is a sequence of video frames in which each frame is a full-color still image. Because video players must display the frames sequentially at rates near 30 fps (frames per second), SD digital video data must be accessed at

720 × 480 × 24 bits/image × 30 images/s = 248,832,000 bits/s

and a two-hour movie consists of

248,832,000 bits/s × (60)² s/hr × 2 hrs ≈ 1.79 × 10¹² bits
Image Compression

or 224 GB (gigabytes) of data. Twenty-seven 8.5 GB dual-layer DVDs (assuming conventional 12 cm disks) are needed to store it. To put a two-hour movie on a single DVD, each frame must be compressed, on average, by a factor of 26.3. The compression must be even higher for high definition (HD) television, where image resolutions reach 1920 × 1080 × 24 bits/image.
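As a quick check of these figures, the arithmetic can be reproduced in a few lines (an illustrative Python sketch, not part of the text):

```python
# Storage for a two-hour SD movie at 720 x 480 x 24-bit frames, 30 fps.
bits_per_second = 720 * 480 * 24 * 30          # 248,832,000 bits/s
movie_bits = bits_per_second * 2 * 60 * 60     # about 1.79e12 bits
movie_gb = movie_bits / 8 / 1e9                # about 224 GB
compression_factor = movie_gb / 8.5            # about 26.3 to fit one dual-layer DVD
print(movie_gb, compression_factor)
```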


Web page images and high-resolution digital camera photos also are compressed routinely to save storage space and reduce transmission time. For example, residential Internet connections deliver data at speeds ranging from 56 kbps (kilobits per second) via conventional phone lines to more than 12 Mbps (megabits per second) for broadband. The time required to transmit a small 128 × 128 × 24 bit full-color image over this range of speeds is from 7.0 to 0.03 seconds. Compression can reduce transmission time by a factor of 2 to 10 or more. In the same way, the number of uncompressed full-color images that an 8-megapixel digital camera can store on a 1 GB flash memory card (about forty-one 24 MB (megabyte) images) can be similarly increased. In addition to these applications, image compression plays an important role in many other areas, including televideo conferencing, remote sensing, document and medical imaging, and facsimile transmission (FAX). An increasing number of applications depend on the efficient manipulation, storage, and transmission of binary, gray-scale, and color images.

In this chapter, we introduce the theory and practice of digital image compression. We examine the most frequently used compression techniques and describe the industry standards that make them useful. The material is introductory in nature and applicable to both still-image and video applications. The chapter concludes with an introduction to digital image watermarking, the process of inserting visible and invisible data (like copyright information) into images.
8.1 Fundamentals

The term data compression refers to the process of reducing the amount of data required to represent a given quantity of information. In this definition, data and information are not the same thing; data are the means by which information is conveyed. Because various amounts of data can be used to represent the same amount of information, representations that contain irrelevant or repeated information are said to contain redundant data. If we let b and b′ denote the number of bits (or information-carrying units) in two representations of the same information, the relative data redundancy R of the representation with b bits is

R = 1 − 1/C        (8.1-1)

where C, commonly called the compression ratio, is defined as

C = b / b′        (8.1-2)
8.1 Fundamentals

If C = 10 (sometimes written 10:1), for instance, the larger representation has 10 bits of data for every 1 bit of data in the smaller representation. The corresponding relative data redundancy of the larger representation is 0.9 (R = 0.9), indicating that 90% of its data is redundant.
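In code form, Eqs. (8.1-1) and (8.1-2) are one-liners. The Python sketch below (illustrative only) reproduces the C = 10, R = 0.9 case:

```python
def compression_ratio(b, b_prime):
    """Eq. (8.1-2): bits in the larger representation over bits in the smaller one."""
    return b / b_prime

def relative_redundancy(C):
    """Eq. (8.1-1): fraction of the larger representation that is redundant."""
    return 1 - 1 / C

C = compression_ratio(10, 1)       # 10 bits of data for every 1 bit in the smaller representation
print(C, relative_redundancy(C))   # -> 10.0 0.9
```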


In the context of digital image compression, b in Eq. (8.1-2) usually is the number of bits needed to represent an image as a 2-D array of intensity values. The 2-D intensity arrays introduced in Section 2.4.2 are the preferred formats for human viewing and interpretation, and the standard by which all other representations are judged. When it comes to compact image representation, however, these formats are far from optimal. Two-dimensional intensity arrays suffer from three principal types of data redundancies that can be identified and exploited:

1. Coding redundancy. A code is a system of symbols (letters, numbers, bits, and the like) used to represent a body of information or set of events. Each piece of information or event is assigned a sequence of code symbols, called a code word. The number of symbols in each code word is its length. The 8-bit codes that are used to represent the intensities in most 2-D intensity arrays contain more bits than are needed to represent the intensities.

2. Spatial and temporal redundancy. Because the pixels of most 2-D intensity arrays are correlated spatially (i.e., each pixel is similar to or dependent on neighboring pixels), information is unnecessarily replicated in the representation of the correlated pixels. In a video sequence, temporally correlated pixels (i.e., those similar to or dependent on pixels in nearby frames) also duplicate information.

3. Irrelevant information. Most 2-D intensity arrays contain information that is ignored by the human visual system and/or extraneous to the intended use of the image.

The computer-generated images in Fig. 8.1(a) through (c) exhibit each of these fundamental redundancies. As will be seen in the next three sections, compression is achieved when one or more redundancy is reduced or eliminated.
8.1.1 Coding Redundancy

In Chapter 3, we developed techniques for image enhancement by histogram processing, assuming that the intensity values of an image are random quantities. In this section, we use a similar formulation to introduce optimal information coding. Assume that a discrete random variable rk in the interval [0, L−1] is used to represent the intensities of an M × N image and that each rk occurs with probability pr(rk). As in Section 3.3,

pr(rk) = nk / MN,    k = 0, 1, 2, ..., L−1        (8.1-3)

where L is the number of intensity values, and nk is the number of times that the kth intensity appears in the image. If the number of bits used to represent each value of rk is l(rk), then the average number of bits required to represent each pixel is

Lavg = Σ (k = 0 to L−1) l(rk) pr(rk)        (8.1-4)
8.1.1 Coding Redundancy - 2

That is, the average length of the code words assigned to the various intensity values is found by summing the products of the number of bits used to represent each intensity and the probability that the intensity occurs. The total number of bits required to represent an M × N image is MNLavg. If the intensities are represented using a natural m-bit fixed-length code, the right-hand side of Eq. (8.1-4) reduces to m bits. That is, Lavg = m when m is substituted for l(rk). The constant m can be taken outside the summation, leaving only the sum of the pr(rk) for 0 ≤ k ≤ L−1, which, of course, equals 1.

Example 8.1. The computer-generated image in Fig. 8.1(a) has the intensity distribution shown in the second column of Table 8.1. If a natural 8-bit binary code (denoted as code 1 in Table 8.1) is used to represent its 4 possible intensities, Lavg, the average number of bits for code 1, is 8 bits, because l1(rk) = 8 bits for all rk.
On the other hand, if the scheme designated as code 2 in Table 8.1 is used, the average length of the encoded pixels is, in accordance with Eq. (8.1-4),

Lavg = 0.25(2) + 0.47(1) + 0.25(3) + 0.03(3) = 1.81 bits

The total number of bits needed to represent the entire image is MNLavg = 256 × 256 × 1.81, or 118,621. From Eqs. (8.1-2) and (8.1-1), the resulting compression and corresponding relative redundancy are

C = (256 × 256 × 8) / 118,621 ≈ 4.42

and

R = 1 − 1/4.42 = 0.774
8.1.1 Coding Redundancy - 4

respectively. Thus, 77.4% of the data in the original 8-bit 2-D intensity array is redundant. The compression achieved by code 2 results from assigning fewer bits to the more probable intensity values than to the less probable ones. In the resulting variable-length code, r128 (the image's most probable intensity) is assigned the 1-bit code word 1 [of length l2(r128) = 1], while r255 (its least probable occurring intensity) is assigned the 3-bit code word 001 [of length l2(r255) = 3].

Note that the best fixed-length code that can be assigned to the intensities of the image in Fig. 8.1(a) is the natural 2-bit counting sequence {00, 01, 10, 11}, but the resulting compression is only 8/2, or 4:1, about 10% less than the 4.42:1 compression of the variable-length code.

As the preceding example shows, coding redundancy is present when the codes assigned to a set of events (such as intensity values) do not take full advantage of the probabilities of the events. Coding redundancy is almost always present when the intensities of an image are represented using a natural binary code. The reason is that most images are composed of objects that have a regular and somewhat predictable morphology (shape) and reflectance, and are sampled so that the objects being depicted are much larger than the picture elements. The natural consequence is that, for most images, certain intensities are more probable than others (that is, the histograms of most images are not uniform). A natural binary encoding assigns the same number of bits to both the most and least probable values, failing to minimize Eq. (8.1-4) and resulting in coding redundancy.
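To make the bookkeeping concrete, the following Python sketch (illustrative only, not the text's MATLAB implementation) recomputes Lavg, C, and R for this example, assuming the Table 8.1 probabilities and code-2 word lengths quoted above:

```python
# Average code length, compression ratio, and relative redundancy for the
# 4-intensity image of Fig. 8.1(a), using the Table 8.1 values quoted above.
probs  = [0.25, 0.47, 0.25, 0.03]   # p_r(r_k), assumed from the reconstruction above
len_c2 = [2, 1, 3, 3]               # code-2 word lengths l_2(r_k), in bits

L_avg   = sum(p * l for p, l in zip(probs, len_c2))   # Eq. (8.1-4)
b       = 256 * 256 * 8                               # natural 8-bit representation
b_prime = 256 * 256 * L_avg                           # code-2 representation
C = b / b_prime                                       # Eq. (8.1-2)
R = 1 - 1 / C                                         # Eq. (8.1-1)

print(f"L_avg = {L_avg:.2f} bits/pixel, C = {C:.2f}, R = {R:.3f}")
# -> L_avg = 1.81 bits/pixel, C = 4.42, R = 0.774
```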
8.1.2 Spatial and Temporal Redundancy

Consider the computer-generated collection of constant intensity lines in Fig. 8.1(b). In the corresponding 2-D intensity array:

1. All 256 intensities are equally probable. As Fig. 8.2 shows, the histogram of the image is uniform.

2. Because the intensity of each line was selected randomly, its pixels are independent of one another in the vertical direction.

3. Because the pixels along each line are identical, they are maximally correlated (completely dependent on one another) in the horizontal direction.

The first observation tells us that the image in Fig. 8.1(b), when represented as a conventional 8-bit intensity array, cannot be compressed by variable-length coding alone. Unlike the image in Fig. 8.1(a) (and Example 8.1), whose histogram was not uniform, a fixed-length 8-bit code in this case minimizes Eq. (8.1-4). Observations 2 and 3 reveal a significant spatial redundancy that can be eliminated, for instance, by representing the image in Fig. 8.1(b) as a sequence of run-length pairs, where each run-length pair specifies the start of a new intensity and the number of consecutive pixels that have that intensity. A run-length based representation compresses the original 2-D, 8-bit intensity array by (256 × 256 × 8) / [(256 + 256) × 8], or 128:1. Each 256-pixel line of the original representation is replaced by a single 8-bit intensity value and an 8-bit run length in the run-length representation.
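A rough illustration of such a run-length mapping is sketched below in Python (not the book's code); the (intensity, run length) pair format, 8 bits per value, is an assumption consistent with the 128:1 figure above.

```python
from itertools import groupby

def run_length_encode(row):
    """Return (intensity, run length) pairs for one image row."""
    return [(value, len(list(run))) for value, run in groupby(row)]

# One 256-pixel constant-intensity line, as in Fig. 8.1(b).
row = [173] * 256
pairs = run_length_encode(row)                   # -> [(173, 256)]

original_bits   = 256 * 256 * 8                  # full 8-bit intensity array
compressed_bits = 256 * len(pairs) * 2 * 8       # per line: one (value, length) pair, 8 bits each
print(pairs, original_bits / compressed_bits)    # -> [(173, 256)] 128.0
```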
8.1.2 Spatial and Temporal Redundancy - 2

In most images, pixels are correlated spatially (in both x and y) and in time (when the image is part of a video sequence). Because most pixel intensities can be predicted reasonably well from neighboring intensities, the information carried by a single pixel is small. Much of its visual contribution is redundant, in the sense that it can be inferred from its neighbors. To reduce the redundancy associated with spatially and temporally correlated pixels, a 2-D intensity array must be transformed into a more efficient, but usually "non-visual," representation. For example, run-lengths or the differences between adjacent pixels can be used. Transformations of this type are called mappings. A mapping is said to be reversible if the pixels of the original 2-D intensity array can be reconstructed without error from the transformed data set; otherwise the mapping is said to be irreversible.
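For instance, a reversible difference mapping can be sketched as follows (illustrative Python, not from the text); a cumulative sum recovers the original row exactly.

```python
import numpy as np

row = np.array([52, 52, 53, 55, 55, 54], dtype=np.int16)

# Forward mapping: keep the first value, then the differences between adjacent pixels.
diffs = np.diff(row, prepend=0)          # [52, 0, 1, 2, 0, -1]

# Inverse mapping: the cumulative sum restores the original row exactly.
restored = np.cumsum(diffs)
assert np.array_equal(restored, row)     # reversible, no information lost
```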
8.1.3 Irrelevant Information

One of the simplest ways to compress a set of data is to remove superfluous data from the set. In the context of digital image compression, information that is ignored by the human visual system or is extraneous to the intended use of an image is an obvious candidate for omission. Thus, the computer-generated image in Fig. 8.1(c), because it appears to be a homogeneous field of gray, can be represented by its average intensity alone: a single 8-bit value. The original 256 × 256 × 8 bit intensity array is reduced to a single byte, and the resulting compression is (256 × 256 × 8) / 8, or 65,536:1. Of course, the original 256 × 256 × 8 bit image must be recreated to view and/or analyze it, but there would be little or no perceived decrease in reconstructed image quality.

Figure 8.3(a) shows the histogram of the image in Fig. 8.1(c). Note that there are several intensity values (intensities 125 through 131) actually present; the human visual system averages the intensities that are present in this case. Figure 8.3(b), a histogram-equalized version of the image in Fig. 8.1(c), makes the intensity changes visible and reveals two previously undetected regions of constant intensity, one oriented vertically and the other horizontally. If the image in Fig. 8.1(c) is represented by its average value alone, this "invisible" structure (i.e., the constant intensity regions) and the random intensity variations surrounding them, which are real information, are lost. Whether or not this information should be preserved is application dependent. If the information is important, as it might be in a medical application (like digital X-ray archival), it should not be omitted; otherwise, the information is redundant and can be excluded for the sake of compression performance.
8.1.3 Irrelevant Information - 2

We conclude the section by noting that the redundancy examined here is fundamentally different from the redundancies discussed in Sections 8.1.1 and 8.1.2. Its elimination is possible because the information itself is not essential for normal visual processing and/or the intended use of the image. Because its omission results in a loss of quantitative information, its removal is commonly referred to as quantization. This terminology is consistent with normal use of the word, which generally means the mapping of a broad range of input values to a limited number of output values (see Section 2.4). Because information is lost, quantization is an irreversible operation.
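To make the idea concrete, here is a small illustrative Python sketch (not from the chapter) in which a near-uniform patch is replaced by its mean intensity; the original variations cannot be recovered afterward, which is exactly what makes quantization irreversible.

```python
import numpy as np

# The near-uniform region of Fig. 8.1(c): intensities 125..131 are present,
# but the eye averages them, so a single value can stand in for all of them.
patch = np.array([125, 126, 127, 128, 129, 130, 131], dtype=np.uint8)

mean_value = int(round(patch.mean()))          # one 8-bit number (128 here)
reconstructed = np.full_like(patch, mean_value)

# Quantitative information is lost: the per-pixel variations cannot be
# recovered from the single stored value.
print(mean_value, np.abs(patch.astype(int) - reconstructed.astype(int)).max())
```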
8.1.4 Measuring Image Information

In the previous section, we introduced several ways to reduce the amount of data used to represent an image. The question that naturally arises is this: how few bits are actually needed to represent the information in an image? That is, is there a minimum amount of data that is sufficient to describe an image without losing information? Information theory provides the mathematical framework to answer this and related questions. Its fundamental premise is that the generation of information can be modeled as a probabilistic process that can be measured in a manner that agrees with intuition. In accordance with this supposition, a random event E with probability P(E) is said to contain

I(E) = log(1 / P(E)) = −log P(E)        (8.1-5)
8.1.4 Measuring Image Information - 2

units of information. If P(E) = 1 (that is, the event always occurs), I(E) = 0 and no information is attributed to it. Because no uncertainty is associated with the event, no information would be transferred by communicating that the event has occurred [it always occurs if P(E) = 1].

The base of the logarithm in Eq. (8.1-5) determines the unit used to measure information. If the base 2 logarithm is used, the unit of information is the bit. Note that if P(E) = 1/2, I(E) = −log2(1/2), or 1 bit. That is, 1 bit is the amount of information conveyed when one of two possible equally likely events occurs. A simple example is flipping a coin and communicating the result.

Given a source of statistically independent random events from a discrete set of possible events {a1, a2, ..., aJ} with associated probabilities {P(a1), P(a2), ..., P(aJ)}, the average information per source output, called the entropy of the source, is

H = −Σ (j = 1 to J) P(aj) log P(aj)        (8.1-6)

The aj in this equation are called source symbols. Because they are statistically independent, the source itself is called a zero-memory source. If an image is considered to be the output of an imaginary zero-memory "intensity source," we can use the histogram of the observed image to estimate the symbol probabilities of the source. Then the intensity source's entropy becomes

H̃ = −Σ (k = 0 to L−1) pr(rk) log2 pr(rk)        (8.1-7)
8.1.4 Measuring Image Information - 3

where the variables L, rk, and pr(rk) are as defined in Sections 8.1.1 and 3.3. Because the base 2 logarithm is used, Eq. (8.1-7) is the average information per intensity output of the imaginary intensity source in bits. It is not possible to code the intensity values of the imaginary source (and thus the sample images) with fewer than H̃ bits/pixel.

Example 8.2. The entropy of the image in Fig. 8.1(a) can be estimated by substituting the intensity probabilities from Table 8.1 into Eq. (8.1-7):

H̃ = −[0.25 log2 0.25 + 0.47 log2 0.47 + 0.25 log2 0.25 + 0.03 log2 0.03] ≈ 1.6614 bits/pixel

In a similar manner, the entropies of the images in Figs. 8.1(b) and (c) can be shown to be 8 bits/pixel and 1.566 bits/pixel, respectively. Note that the image in Fig. 8.1(a) appears to have the most visual information, but has almost the lowest computed entropy, 1.66 bits/pixel. The image in Fig. 8.1(b) has almost five times the entropy of the image in (a), but appears to have about the same (or less) visual information; and the image in Fig. 8.1(c), which seems to have little or no information, has almost the same entropy as the image in (a). The obvious conclusion is that the amount of entropy, and thus information, in an image is far from intuitive.
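A quick way to reproduce such entropy estimates is to build the normalized histogram of an image and apply Eq. (8.1-7). The Python sketch below is illustrative only; the four probabilities are the Table 8.1 values assumed above.

```python
import numpy as np

def entropy_bits_per_pixel(image):
    """Estimate Eq. (8.1-7): first-order entropy of an 8-bit image, in bits/pixel."""
    counts = np.bincount(image.ravel(), minlength=256)
    p = counts / counts.sum()                  # normalized histogram = p_r(r_k)
    p = p[p > 0]                               # 0 * log 0 is taken as 0
    return float(-np.sum(p * np.log2(p)))

# Direct evaluation for the four-intensity distribution quoted above.
probs = np.array([0.25, 0.47, 0.25, 0.03])
print(-np.sum(probs * np.log2(probs)))         # -> about 1.66 bits/pixel
```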
Shannon's First Theorem

Recall that the variable-length code in Example 8.1 was able to represent the intensities of the image in Fig. 8.1(a) using only 1.81 bits/pixel. Although this is higher than the 1.6614 bits/pixel entropy estimate from Example 8.2, Shannon's first theorem, also called the noiseless coding theorem (Shannon [1948]), assures us that the image in Fig. 8.1(a) can be represented with as few as 1.6614 bits/pixel. To prove it in a general way, Shannon looked at representing groups of n consecutive source symbols with a single code word (rather than one code word per source symbol) and showed that

lim (n → ∞) [ Lavg,n / n ] = H        (8.1-8)
Shannon's First Theorem - 2

where Lavg,n is the average number of code symbols required to represent all n-symbol groups. In the proof, he defined the nth extension of a zero-memory source to be the hypothetical source that produces n-symbol blocks using the symbols of the original source, and computed Lavg,n by applying Eq. (8.1-4) to the code words used to represent the n-symbol blocks. Equation (8.1-8) tells us that Lavg,n/n can be made arbitrarily close to H by encoding infinitely long extensions of the single-symbol source. That is, it is possible to represent the output of a zero-memory source with an average of H information units per source symbol.

If we now return to the idea that an image is a "sample" of the intensity source that produced it, a block of n source symbols corresponds to a group of n adjacent pixels. To construct a variable-length code for n-pixel blocks, the relative frequencies of the blocks must be computed. But the nth extension of a hypothetical intensity source with 256 intensity values has 256^n possible n-pixel blocks. Even in the simple case of n = 2, a 65,536-element histogram and up to 65,536 variable-length code words must be generated; for n = 3, as many as 16,777,216 code words are needed. So even for small values of n, computational complexity limits the usefulness of the extension coding approach in practice.

Finally, we note that although Eq. (8.1-7) provides a lower bound on the compression that can be achieved when coding statistically independent pixels directly, it breaks down when the pixels of an image are correlated. Blocks of correlated pixels can be coded with fewer average bits per pixel than the equation predicts. Rather than using source extensions, less correlated descriptors (like intensity run-lengths) are normally selected and coded without extension. This was the approach used to compress Fig. 8.1(b) in Section 8.1.2. When the output of a source of information depends on a finite number of preceding outputs, the source is called a Markov or finite-memory source.
8.1.5 Fidelity Criteria

In Section 8.1.3, it was noted that the removal of "irrelevant visual" information involves a loss of real or quantitative image information. Because information is lost, a means of quantifying the nature of the loss is needed. Two types of criteria can be used for such an assessment: (1) objective fidelity criteria and (2) subjective fidelity criteria.

When information loss can be expressed as a mathematical function of the input and output of a compression process, it is said to be based on an objective fidelity criterion. An example is the root-mean-square (rms) error between two images. Let f(x, y) be an input image and f̂(x, y) be an approximation of f(x, y) that results from compressing and subsequently decompressing the input. For any value of x and y, the error e(x, y) between f(x, y) and f̂(x, y) is

e(x, y) = f̂(x, y) − f(x, y)        (8.1-9)

so that the total error between the two images is

Σ (x = 0 to M−1) Σ (y = 0 to N−1) [ f̂(x, y) − f(x, y) ]

where the images are of size M × N. The root-mean-square error, erms, between f(x, y) and f̂(x, y) is then the square root of the squared error averaged over the M × N array, or

erms = [ (1/MN) Σ (x = 0 to M−1) Σ (y = 0 to N−1) [ f̂(x, y) − f(x, y) ]² ]^(1/2)        (8.1-10)
8.1.5 Fidelity Criteria - 2

If f̂(x, y) is considered [by a simple rearrangement of the terms in Eq. (8.1-9)] to be the sum of the original image f(x, y) and an error or "noise" signal e(x, y), the mean-square signal-to-noise ratio of the output image, denoted SNRms, can be defined as in Section 5.8:

SNRms = Σ Σ f̂(x, y)² / Σ Σ [ f̂(x, y) − f(x, y) ]²        (8.1-11)

where both double sums run over x = 0 to M−1 and y = 0 to N−1. The rms value of the signal-to-noise ratio, denoted SNRrms, is obtained by taking the square root of Eq. (8.1-11).

While objective fidelity criteria offer a simple and convenient way to evaluate information loss, decompressed images are ultimately viewed by humans, so measuring image quality by the subjective evaluations of people is often more appropriate. This can be done by presenting a decompressed image to a cross section of viewers and averaging their evaluations. The evaluations may be made using an absolute rating scale or by means of side-by-side comparisons of f(x, y) and f̂(x, y). Table 8.2 shows one possible absolute rating scale. Side-by-side comparisons can be done with a scale such as {−3, −2, −1, 0, 1, 2, 3} to represent the subjective evaluations {much worse, worse, slightly worse, the same, slightly better, better, much better}, respectively. In either case, the evaluations are based on subjective fidelity criteria.
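For reference, Eqs. (8.1-10) and (8.1-11) translate directly into a few lines of Python (an illustrative sketch; the image arrays used here are random stand-ins, not the figures from the text):

```python
import numpy as np

def rms_error(f, f_hat):
    """Root-mean-square error between an image and its approximation, Eq. (8.1-10)."""
    e = f_hat.astype(np.float64) - f.astype(np.float64)
    return np.sqrt(np.mean(e ** 2))

def snr_ms(f, f_hat):
    """Mean-square signal-to-noise ratio of the decompressed image, Eq. (8.1-11)."""
    f_hat = f_hat.astype(np.float64)
    e = f_hat - f.astype(np.float64)
    return np.sum(f_hat ** 2) / np.sum(e ** 2)

# Toy usage: a random 8-bit image and a slightly perturbed "decompressed" version.
rng = np.random.default_rng(0)
f = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)
f_hat = np.clip(f.astype(np.int16) + rng.integers(-2, 3, size=f.shape), 0, 255).astype(np.uint8)
print(rms_error(f, f_hat), snr_ms(f, f_hat))
```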
8.1.5 Fidelity Criteria - 3

Figure 8.4 shows three different approximations of the image in Fig. 8.1(a). Using Eq. (8.1-10) with Fig. 8.1(a) for f(x, y) and the images in Figs. 8.4(a) through (c) as f̂(x, y), the computed rms errors are 5.17, 15.67, and 14.17 intensity levels, respectively. In terms of rms error, an objective fidelity criterion, the three images in Fig. 8.4 are ranked in order of decreasing quality as {(a), (c), (b)}.
8.1.5 Fidelity Criteria - 4

Figures 8.4(a) and (b) are typical of images that have been compressed and subsequently reconstructed. Both retain the essential information of the original image, like the spatial and intensity characteristics of its objects, and their rms errors correspond roughly to perceived quality. Figure 8.4(a), which is practically as good as the original image, has the lowest rms error, while Fig. 8.4(b) has more error and noticeable degradation at the boundaries between objects. This is exactly as one would expect.

Figure 8.4(c) is an artificially generated image that demonstrates the limitations of objective fidelity criteria. Note that the image is missing large sections of several important lines (i.e., visual information) and has small dark squares (i.e., artifacts) in the upper right quadrant. The visual content of the image is misleading and certainly not as accurate as the image in (b), but it has less rms error: 14.17 versus 15.67 intensity values. A subjective evaluation of the three images using Table 8.2 might yield an excellent rating for (a), a passable or marginal rating for (b), and an inferior or unusable rating for (c). The rms error measure, on the other hand, ranks (c) ahead of (b).
8.1.6 Image Compression Models

As Fig. 8.5 shows, an image compression system is composed of two distinct functional components: an encoder and a decoder. The encoder performs compression, and the decoder performs the complementary operation of decompression. Both operations can be performed in software, as is the case in web browsers and many commercial image editing programs, or in a combination of hardware and firmware, as in commercial DVD players. A codec is a device or program that is capable of both encoding and decoding. Input image f(x, …) is fed into the encoder, which creates a compressed representation of the input. This representation is stored for later use, or transmitted for storage and use at a remote location. When the compressed representation is presented to its complementary decoder, a reconstructed output image f̂(x, …) is generated. In still-image applications, the encoded input and decoder output are f(x, y) and f̂(x, y), respectively; in video applications, they are f(x, y, t) and f̂(x, y, t), where the discrete parameter t specifies time. In general, f̂(x, …) may or may not be an exact replica of f(x, …). If it is, the compression system is called error free, lossless, or information preserving. If not, the reconstructed output image is distorted and the compression system is referred to as lossy.
The Encoding or Compression Process

The encoder of Fig. 8.5 is designed to remove the redundancies described in Sections 8.1.1 through 8.1.3 through a series of three independent operations. In the first stage of the encoding process, a mapper transforms f(x, …) into a (usually non-visual) format designed to reduce spatial and temporal redundancy. This operation generally is reversible and may or may not reduce directly the amount of data required to represent the image. Run-length coding (see Sections 8.1.2 and 8.2.5) is an example of a mapping that normally yields compression in the first step of the encoding process. The mapping of an image into a set of less correlated transform coefficients (see Section 8.2.8) is an example of the opposite case (the coefficients must be further processed to achieve compression). In video applications, the mapper uses previous (and in some cases future) video frames to facilitate the removal of temporal redundancy.

The quantizer in Fig. 8.5 reduces the accuracy of the mapper's output in accordance with a pre-established fidelity criterion. The goal is to keep irrelevant information out of the compressed representation. As noted in Section 8.1.3, this operation is irreversible; it must be omitted when error-free compression is desired. In video applications, the bit rate of the encoded output is often measured (in bits/second) and used to adjust the operation of the quantizer so that a predetermined average output rate is maintained. Thus, the visual quality of the output can vary from frame to frame as a function of image content.

In the third and final stage of the encoding process, the symbol coder of Fig. 8.5 generates a fixed- or variable-length code to represent the quantizer output and maps the output in accordance with the code. In many cases, a variable-length code is used. The shortest code words are assigned to the most frequently occurring quantizer output values, thus minimizing coding redundancy. This operation is reversible. Upon its completion, the input image has been processed for the removal of each of the three redundancies described in Sections 8.1.1 through 8.1.3.
The decoding or decompression process
The decoder of fig .8.5 contains only two components: a
symbol decoder and an inversemapper .they perform ,in
reverse order ,the inverse operation of the encoders symbol
R. C. Gonzalez, and R. E. Woods, Digital Image Processing, New Jersey: Prentice Hall, 3rd edition, 2008.

encoder and mapper .because quantization results in


irreversible information loss,an inverse quantizer block is
not included in the general decoder model.in video
applications,dacoded output frames are maintained in an
internal frame store (not shown) and used to reinsert the
temporal redundancy that was removed at the encoder.

28
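The overall model can be summarized as a composition of the stages just described. The sketch below is only a structural outline in Python; the mapper, quantizer, and coder arguments are hypothetical placeholders, not functions defined in the chapter.

```python
def encode(image, mapper, quantizer, symbol_coder, lossless=False):
    """Three-stage encoder of Fig. 8.5: mapper -> (optional) quantizer -> symbol coder."""
    mapped = mapper(image)                 # reversible; reduces spatial/temporal redundancy
    if not lossless:
        mapped = quantizer(mapped)         # irreversible; discards irrelevant information
    return symbol_coder(mapped)            # reversible; removes coding redundancy

def decode(bitstream, symbol_decoder, inverse_mapper):
    """Two-stage decoder of Fig. 8.5: note that no inverse quantizer exists."""
    return inverse_mapper(symbol_decoder(bitstream))
```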
8.1.7 Image Formats, Containers, and Compression Standards

In the context of digital imaging, an image file format is a standard way to organize and store image data. It defines how the data is arranged and the type of compression, if any, that is used. An image container is similar to a file format but handles multiple types of image data. Image compression standards, on the other hand, define procedures for compressing and decompressing images, that is, for reducing the amount of data needed to represent an image. These standards are the underpinning of the widespread acceptance of image compression technology.

Figure 8.6 lists the most important image compression standards, file formats, and containers in use today, grouped by the type of image handled. The entries in black are international standards sanctioned by the International Standards Organization (ISO), the International Electrotechnical Commission (IEC), and/or the International Telecommunications Union (ITU-T), a United Nations (UN) organization that was once called the Consultative Committee of the International Telephone and Telegraph (CCITT). Two video compression standards, VC-1 by the Society of Motion Picture and Television Engineers (SMPTE) and AVS by the Chinese Ministry of Information Industry (MII), are
also included. Note that they are shown in gray, which is used in Fig. 8.6 to denote entries that are not sanctioned by an international standards organization. Tables 8.3 and 8.4 summarize the standards of Fig. 8.6; their applications are listed and their key compression methods identified. The compression methods themselves are the subject of the next section. In both tables, forward references to the relevant subsections of Section 8.2 are enclosed in square brackets.
Problems
Problem 8.1
Problem 8.2
Problem 8.3
Problem 8.4
Problem 8.5

8.2 Some Basic Compression Methods

8.2.1 Huffman Coding
8.2.2 Golomb Coding
8.2.3 Arithmetic Coding
8.2.4 LZW Coding
8.2.5 Run-Length Coding
8.2.6 Symbol-Based Coding
8.2.7 Bit-Plane Coding
8.2.8 Block Transform Coding
8.2.9 Predictive Coding
8.2.10 Wavelet Coding
8.2 Some Basic Compression Methods

In this section, we describe the principal lossy and error-free compression methods that have proven useful in mainstream binary, continuous-tone still image, and video compression standards. The standards themselves are used to demonstrate the methods presented.
8.2.1 Huffman Coding

One of the most popular techniques for removing coding redundancy is due to Huffman (Huffman [1952]). When coding the symbols of an information source individually, Huffman coding yields the smallest possible number of code symbols per source symbol. In terms of Shannon's first theorem (see Section 8.1.4), the resulting code is optimal for a fixed value of n, subject to the constraint that the source symbols be coded one at a time. In practice, the source symbols may be either the intensities of an image or the output of an intensity mapping operation (pixel differences, run lengths, and so on).

The first step in Huffman's approach is to create a series of source reductions by ordering the probabilities of the symbols under consideration and combining the lowest probability symbols into a single symbol that replaces them in the next source reduction. Figure 8.7 illustrates this process for binary coding (K-ary Huffman codes can also be constructed). At the far left, a hypothetical set of source symbols and their probabilities are ordered from top to bottom in terms of decreasing probability values. To form the first source reduction, the bottom two probabilities, 0.06 and 0.04, are combined to form a "compound symbol" with probability 0.1. This compound symbol and its associated probability are placed in the first source reduction column so that the probabilities of the reduced source also are ordered from the most to the least probable. This process is then repeated until a reduced source with two symbols (at the far right) is reached.
8.2.1 Huffman Coding - 2

The second step in Huffman's procedure is to code each reduced source, starting with the smallest source and working back to the original source. The minimal length binary code for a two-symbol source, of course, consists of the symbols 0 and 1. As Fig. 8.8 shows, these symbols are assigned to the two symbols on the right (the assignment is arbitrary; reversing the order of the 0 and 1 would work just as well). As the reduced source symbol with probability 0.6 was generated by combining two symbols in the reduced source to its left, the 0 used to code it is now assigned to both of these symbols, and a 0 and 1 are arbitrarily appended to each to distinguish them from each other. This operation is then repeated for each reduced source until the original source is reached. The final code appears at the far left in Fig. 8.8. The average length of this code is

Lavg = (0.4)(1) + (0.3)(2) + (0.1)(3) + (0.1)(4) + (0.06)(5) + (0.04)(5) = 2.2 bits/symbol

and the entropy of the source is 2.14 bits/symbol.

Huffman's procedure creates the optimal code for a set of symbols and probabilities subject to the constraint that the symbols be coded one at a time. After the code has been created, coding and/or error-free decoding is accomplished in a simple lookup table manner. The code itself is an instantaneous uniquely decodable block code. It is called a block code because each source symbol is mapped into a fixed sequence of code symbols. It is instantaneous because each code word in a string of code symbols can be decoded without referencing succeeding symbols. It is uniquely decodable because any string of code symbols can be decoded in only one way. Thus, any string of Huffman encoded symbols can be decoded by examining the individual symbols of the string in a left-to-right manner. For the binary code of Fig. 8.8, a left-to-right scan of the encoded string 010100111100 reveals that the first valid code word is 01010, which is the code for symbol a3. The next valid code word is 011, which corresponds to symbol a1. Continuing in this manner reveals the completely decoded message to be a3 a1 a2 a2 a6.
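A compact way to build such a code programmatically is sketched below in Python using the standard heapq module (illustrative only, not the book's MATLAB implementation). The probabilities are those of the Fig. 8.7 example quoted above; the exact code words may differ from Fig. 8.8 because of tie-breaking, but the average length is still 2.2 bits/symbol.

```python
import heapq

def huffman_code(probabilities):
    """Build a binary Huffman code; probabilities maps symbol -> probability."""
    # Each heap entry: (probability, tie-breaker, {symbol: partial code word}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)     # two least probable "compound symbols"
        p2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"a2": 0.4, "a6": 0.3, "a1": 0.1, "a4": 0.1, "a3": 0.06, "a5": 0.04}
code = huffman_code(probs)
L_avg = sum(probs[s] * len(w) for s, w in code.items())
print(code, L_avg)   # -> L_avg = 2.2 bits/symbol (code words may differ from Fig. 8.8 by tie-breaking)
```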
Example 8.4. The 512 × 512 × 8 bit monochrome image in Fig. 8.9(a) has the intensity histogram shown in Fig. 8.9(b). Because the intensities are not equally probable, a MATLAB implementation of Huffman's procedure was used to encode them with 7.428 bits/pixel, including the Huffman code table that is required to reconstruct the original 8-bit image intensities. The compressed representation exceeds the estimated entropy of the image [7.3838 bits/pixel from Eq. (8.1-7)] by 512 × 512 × (7.428 − 7.3838), or 11,587 bits, about 0.6%. The resulting compression ratio and corresponding relative redundancy are C = 8/7.428 = 1.077 and R = 1 − (1/1.077) = 0.0715, respectively. Thus 7.15% of the original 8-bit fixed-length intensity representation was removed as coding redundancy.

When a large number of symbols is to be coded, the construction of an optimal Huffman code is a nontrivial task. For the general case of J source symbols, J symbol probabilities, J − 2 source reductions, and J − 2 code assignments are required. When source symbol probabilities can be estimated in advance, "near optimal" coding can be achieved with pre-computed Huffman codes. Several popular image compression standards, including the JPEG and MPEG standards discussed in Sections 8.2.8 and 8.2.9, specify default Huffman coding tables that have been pre-computed based on experimental data.
8.2.2 Golomb Coding

In this section we consider the coding of nonnegative integer inputs with exponentially decaying probability distributions. Inputs of this type can be optimally encoded (in the sense of Shannon's first theorem) using a family of codes that are computationally simpler than Huffman codes. The codes themselves were first proposed for the representation of nonnegative run lengths (Golomb [1966]). In the discussion that follows, the notation ⌊x⌋ denotes the largest integer less than or equal to x, ⌈x⌉ means the smallest integer greater than or equal to x, and x mod y is the remainder of x divided by y.

Given a nonnegative integer n and a positive integer divisor m > 0, the Golomb code of n with respect to m, denoted Gm(n), combines the unary code of the quotient ⌊n/m⌋ and the binary representation of the remainder n mod m. Gm(n) is constructed as follows:

Step 1. Form the unary code of quotient ⌊n/m⌋. (The unary code of an integer q is defined as q 1s followed by a 0.)

Step 2. Let k = ⌈log2 m⌉, c = 2^k − m, r = n mod m, and compute truncated remainder r′ such that

r′ = r truncated to k − 1 bits,     0 ≤ r < c
r′ = r + c truncated to k bits,     otherwise        (8.2-1)

Step 3. Concatenate the results of steps 1 and 2.

To compute G4(9), for example, begin by determining the unary code of the quotient ⌊9/4⌋ = ⌊2.25⌋ = 2, which is 110 (the result of step 1). Then let k = ⌈log2 4⌉ = 2, c = 2² − 4 = 0, and r = 9 mod 4, which in binary is 1001 mod 0100, or 0001. In accordance with Eq. (8.2-1), r′ is then r (i.e., 0001) truncated to 2 bits, which is 01 (the result of step 2). Finally, concatenate 110 from step 1 and 01 from step 2 to get 11001, which is G4(9).
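The construction lends itself to a few lines of code. The following Python sketch (illustrative, not from the text) implements the three steps and reproduces G4(9) = 11001.

```python
from math import ceil, log2

def unary(q):
    """Unary code of q: q ones followed by a zero."""
    return "1" * q + "0"

def to_bits(value, width):
    """Binary string of 'value' kept to its 'width' least significant bits."""
    return format(value, "b").zfill(width)[-width:] if width > 0 else ""

def golomb(n, m):
    """Golomb code G_m(n) for a nonnegative integer n and divisor m > 0."""
    k = ceil(log2(m))
    c = 2 ** k - m
    r = n % m
    remainder = to_bits(r, k - 1) if r < c else to_bits(r + c, k)   # Eq. (8.2-1)
    return unary(n // m) + remainder

print(golomb(9, 4))                        # -> '11001', the worked example above
print([golomb(n, 1) for n in range(4)])    # G_1 is the unary code: 0, 10, 110, 1110
```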
For the special case of m = 2^k, c = 0 and r′ = r = n mod m truncated to k bits in Eq. (8.2-1) for all n. The divisions required to generate the resulting Golomb codes become binary shift operations, and the computationally simpler codes are called Golomb-Rice or Rice codes (Rice [1975]). Columns 2, 3, and 4 of Table 8.5 list the G1, G2, and G4 codes of the first ten nonnegative integers. Because m is a power of 2 in each case (i.e., 1 = 2^0, 2 = 2^1, and 4 = 2^2), they are the first three Golomb-Rice codes as well. Moreover, G1 is the unary code of the nonnegative integers, because ⌊n/1⌋ = n and n mod 1 = 0 for all n.

Keeping in mind that Golomb codes can only be used to represent nonnegative integers, and that there are many Golomb codes to choose from, a key step in their effective application is the selection of divisor m. When the integers to be represented are geometrically distributed with probability mass function (PMF)

P(n) = (1 − ρ) ρ^n        (8.2-2)

for some 0 < ρ < 1, Golomb codes can be shown to be optimal, in the sense that Gm(n) provides the shortest average code length of all uniquely decipherable codes, when (Gallager and Van Voorhis [1975])

m = ⌈ log2(1 + ρ) / log2(1/ρ) ⌉        (8.2-3)
Figure 8.10(a) plots Eq. (8.2-2) for three values of ρ and illustrates graphically the symbol probabilities that Golomb codes handle well (that is, code efficiently). As is shown in the figure, small integers are much more probable than large ones.

Because the probabilities of the intensities in an image [see, for example, the histogram of Fig. 8.9(b)] are unlikely to match the probabilities specified in Eq. (8.2-2) and shown in Fig. 8.10(a), Golomb codes are seldom used for the coding of intensities. When intensity differences are to be coded, however, the
probabilities of the resulting "difference values" (see Section 8.2.9), with the notable exception of the negative differences, often resemble those of Eq. (8.2-2) and Fig. 8.10(a). To handle negative differences in Golomb coding, which can only represent nonnegative integers, a mapping like

M(n) = 2n          for n ≥ 0
M(n) = 2|n| − 1    for n < 0        (8.2-4)
typically is used. Using this mapping, for example, the two-sided PMF shown in Fig. 8.10(b) can be transformed into the one-sided PMF in Fig. 8.10(c). Its integers are reordered, alternating the negative and positive integers so that the negative integers are mapped into the odd positive integer positions. If P(n) is two-sided and centered at zero, P(M(n)) will be one-sided. The mapped integers, M(n), can then be efficiently encoded using an appropriate Golomb-Rice code (Weinberger et al. [1996]).
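The mapping of Eq. (8.2-4) is a one-liner; the minimal Python sketch below (illustrative only) shows how signed values are folded onto the nonnegative integers, with negatives landing on the odd positions, before a Golomb-Rice code is applied.

```python
def fold(n):
    """Eq. (8.2-4): map a signed integer onto the nonnegative integers."""
    return 2 * n if n >= 0 else 2 * abs(n) - 1

print([fold(d) for d in [0, -1, 1, -2, 2, -3]])   # -> [0, 1, 2, 3, 4, 5]
```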
Consider again the image from Fig. 8.1(c) and note that its histogram [see Fig. 8.3(a)] is similar to the two-sided distribution in Fig. 8.10(b) above. If we let n be some nonnegative integer intensity in the image, where 0 ≤ n ≤ 255, and μ be the mean intensity, P(n − μ) is the two-sided distribution shown in Fig. 8.11(a). This plot was generated by normalizing the histogram in Fig. 8.3(a) by the total number of pixels in the image and shifting the normalized values to the left by 128 (which in effect subtracts the mean intensity from the image). In accordance with Eq. (8.2-4), P(M(n − μ)) is then the one-sided distribution shown in Fig. 8.11(b). If the mapped intensity values are Golomb coded using a MATLAB implementation of code G1 in column 2 of Table 8.5, the encoded representation is 4.5 times smaller than the original image (i.e., C = 4.5). The G1 code realizes 4.5/5.1, or 88%, of the theoretical compression possible with variable-length coding. (Based on the entropy calculated in Example 8.2, the maximum possible compression ratio through variable-length coding is C = 8/1.566 ≈ 5.1.) Moreover, Golomb coding achieves 96% of the compression provided by a MATLAB implementation of Huffman's approach, and does not require the computation of a custom Huffman coding table.

Now consider the image in Fig. 8.9(a). If its intensities are Golomb coded using the same G1 code as above, C = 0.0922; that is, there is data expansion.
This is due to the fact that the probabilities of the intensities of the image in Fig. 8.9(a) are much different from the probabilities defined in Eq. (8.2-2). In a similar manner, Huffman codes can produce data expansion when used to encode symbols whose probabilities are different from those for which the code was computed. In practice, the further you depart from the input probability assumptions for which a code is designed, the greater the risk of poor compression performance and data expansion.
To conclude our coverage of Golomb codes, we note that column 5 of Table 8.5 contains the first 10 codes of the zeroth-order exponential-Golomb code, denoted G⁰exp(n). Exponential-Golomb codes are useful for the encoding of run lengths, because both short and long runs are encoded efficiently. An order-k exponential-Golomb code Gᵏexp(n) is computed as follows:

Step 1. Find an integer i ≥ 0 such that

Σ (j = 0 to i−1) 2^(j+k) ≤ n < Σ (j = 0 to i) 2^(j+k)        (8.2-5)

and form the unary code of i. If k = 0, i = ⌊log2(n + 1)⌋ and the code is also known as the Elias gamma code.

Step 2. Truncate the binary representation of

n − Σ (j = 0 to i−1) 2^(j+k)        (8.2-6)

to k + i least significant bits.

Step 3. Concatenate the results of steps 1 and 2.

To find G⁰exp(8), for example, we let i = ⌊log2 9⌋, or 3, in step 1 because k = 0. Equation (8.2-5) is then satisfied because

Σ (j = 0 to 2) 2^j = 7 ≤ 8 < Σ (j = 0 to 3) 2^j = 15

The unary code of 3 is 1110, and Eq. (8.2-6) of step 2 yields 8 − 7 = 1, or 0001 in binary, which when truncated to its 3 + 0 least significant bits becomes 001. The concatenation of the results from steps 1 and 2 then yields 1110001. Note that this is the entry in column 5 of Table 8.5 for n = 8. Finally, we note that, like the Huffman codes of the last section, the Golomb codes of Table 8.5 are variable-length, instantaneous, uniquely decodable block codes.
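A direct transcription of the three steps into Python (an illustrative sketch, not the book's code) is shown below; for k = 0 and n = 8 it reproduces 1110001.

```python
def exp_golomb(n, k=0):
    """Order-k exponential-Golomb code of a nonnegative integer n."""
    # Step 1: find i with sum_{j=0}^{i-1} 2^(j+k) <= n < sum_{j=0}^{i} 2^(j+k).
    i, low = 0, 0
    while not (low <= n < low + 2 ** (i + k)):
        low += 2 ** (i + k)
        i += 1
    unary = "1" * i + "0"
    # Step 2: binary representation of n - low, kept to its k + i least significant bits.
    width = k + i
    value = format(n - low, "b").zfill(width)[-width:] if width > 0 else ""
    # Step 3: concatenate.
    return unary + value

print(exp_golomb(8))                       # -> '1110001', as in the worked example
print([exp_golomb(n) for n in range(4)])   # zeroth order (Elias gamma): 0, 100, 101, 11000
```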
8.2.3 Arithmetic Coding

Unlike the variable-length codes of the previous two sections, arithmetic coding generates nonblock codes. In arithmetic coding, which can be traced to the work of Elias (see Abramson [1963]), a one-to-one correspondence between source symbols and code words does not exist. Instead, an entire sequence of source symbols (or message) is assigned a single arithmetic code word. The code word itself defines an interval of real numbers between 0 and 1. As the number of symbols in the message increases, the interval used to represent it becomes smaller and the number of information units (say, bits) required to represent the interval becomes larger. Each symbol of the message reduces the size of the interval in accordance with its probability of occurrence. Because the technique does not require that each source symbol translate into an integral number of code symbols (that is, that the symbols be coded one at a time), it achieves (but only in theory) the bound established by Shannon's first theorem of Section 8.1.4.

Figure 8.12 illustrates the basic arithmetic coding process. Here, a five-symbol sequence or message, a1 a2 a3 a3 a4, from a four-symbol source is coded. At the start of the coding process, the message is assumed to occupy the entire half-open interval [0, 1). As Table 8.6 shows, this interval is subdivided initially into four regions based on the probabilities of each source symbol. Symbol a1, for example, is associated with subinterval [0, 0.2). Because it is the first symbol of the message being coded, the message interval is initially narrowed to [0, 0.2). Thus, in Fig. 8.12,
[0, 0.2) is expanded to the full height of the figure and its end points labeled by the values of the narrowed range. The narrowed range is then subdivided in accordance with the original source symbol probabilities, and the process continues with the next message symbol. In this manner, symbol a2 narrows the subinterval to [0.04, 0.08), a3 further narrows it to [0.056, 0.072), and so on. The final message symbol, which must be reserved as a special end-of-message indicator, narrows the range to [0.06752, 0.0688). Of course, any number within this subinterval, for example 0.068, can be used to represent the message. In the arithmetically coded message of Fig. 8.12, three decimal digits are used to represent the five-symbol message. This translates into 0.6 decimal digits per source symbol and compares favorably with the entropy of the source, which, from Eq. (8.1-6), is 0.58 decimal digits per source symbol. As the length of the sequence being coded increases, the resulting arithmetic code approaches the bound established by Shannon's first theorem.

In practice, two factors cause coding performance to fall short of the bound: (1) the addition of the end-of-message indicator that is needed to separate one message from another, and (2) the use of finite precision arithmetic. Practical implementations of arithmetic coding address the latter problem by introducing a scaling strategy and a rounding strategy (Langdon and Rissanen [1981]). The scaling strategy renormalizes each subinterval to the [0, 1) range before subdividing it in accordance with the symbol probabilities. The rounding strategy guarantees that the truncations associated with finite precision arithmetic do not prevent the coding subintervals from being represented accurately.
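The interval-narrowing step is easy to mirror in code. The sketch below (illustrative Python, not a complete arithmetic coder: it omits the end-of-message handling, scaling, and bit output discussed above) reproduces the final interval [0.06752, 0.0688) for the message a1 a2 a3 a3 a4, using per-symbol subranges consistent with the subintervals quoted in the example.

```python
def narrow(interval, symbol, cum):
    """Shrink [low, high) to the sub-range assigned to 'symbol'."""
    low, high = interval
    width = high - low
    sym_low, sym_high = cum[symbol]
    return (low + width * sym_low, low + width * sym_high)

# Cumulative ranges per symbol, consistent with the subintervals quoted above.
cum = {"a1": (0.0, 0.2), "a2": (0.2, 0.4), "a3": (0.4, 0.8), "a4": (0.8, 1.0)}

interval = (0.0, 1.0)
for s in ["a1", "a2", "a3", "a3", "a4"]:
    interval = narrow(interval, s, cum)
    print(s, interval)
# Final interval is approximately [0.06752, 0.0688); any number in it,
# e.g. 0.068, identifies the whole message.
```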
Adaptive, Context-Dependent Probability Estimates

With accurate input symbol probability models, that is, models that provide the true probabilities of the symbols being coded, arithmetic coders are near optimal in the sense of minimizing the average number of code symbols required to represent the symbols being coded. As in both Huffman and Golomb coding, however, inaccurate probability models can lead to non-optimal results. A simple way to improve the accuracy of the probabilities employed is to use an adaptive, context-dependent probability model. Adaptive probability models update symbol probabilities as symbols are coded or become known. Thus, the probabilities adapt to the local statistics of the symbols being coded. Context-dependent models provide probabilities that are based on a predefined neighborhood of pixels, called the context, around the symbols being coded. Normally, a causal context, one limited to symbols that have already been coded, is used. Both the Q-coder (Pennebaker et al. [1988]) and MQ-coder (ISO/IEC [2000]), two well-known arithmetic coding techniques that have been incorporated into the JBIG, JPEG-2000, and other important image compression standards, use probability models that are adaptive and context dependent. The Q-coder dynamically updates symbol probabilities during the interval renormalizations that are part of the arithmetic coding process. Adaptive, context-dependent models also have been used in Golomb coding, for example, in the JPEG-LS compression standard.
Figure 8.13(a) diagrams the steps involved in adaptive, context-dependent arithmetic coding of binary source symbols. Arithmetic coding often is used when binary symbols are to be coded. As each symbol (or bit) begins the coding process, its context is formed in the context determination block of Fig. 8.13(a). Figures 8.13(b) through (d) show three possible contexts that can be used: (1) the immediately preceding symbol, (2) a group of preceding symbols, and (3) some number of preceding symbols plus symbols on the previous scan line. For the three cases shown, the probability estimation block must manage 2^1 (or 2), 2^8 (or 256), and 2^5 (or 32) contexts and their associated probabilities. For instance, if the context in Fig. 8.13(b) is used, conditional probabilities P(0|a=0) (the probability that the symbol being coded is a 0, given that the preceding symbol is a 0), P(1|a=0), P(0|a=1), and P(1|a=1) must be tracked. The appropriate probabilities are then passed to the arithmetic coding block as a function of the current context, and drive the generation of the arithmetically coded output sequence in accordance with the process illustrated in Fig. 8.12. The probabilities associated with the context involved in the current coding step are then updated to reflect the fact that another symbol within that context has been processed.

Finally, we note that a variety of arithmetic coding techniques are protected by United States patents (and may in addition be protected in other jurisdictions). Because of these patents and the possibility of unfavorable monetary judgments for their infringement, most implementations of the JPEG compression standard, which provides for both Huffman and arithmetic coding, typically support Huffman coding alone.
8.2.4 LZW Coding
 The techniques covered in the previous sections are focused on the removal of coding redundancy.In this
section,we consider an error-free compression approach that also addresses spatial redundancies in an
image.The technique,called Lempel-Ziv-Welch (LZW)coding ,assigns fixed-length code words to variable
R. C. Gonzalez, and R. E. Woods, Digital Image Processing, New Jersey: Prentice Hall, 3rd edition, 2008.

length sequences of source symbols.Recall from section 8.1.4 that shannon used the idea of source
symbols.Recall from section 8.1.4 that shannon used the idea of coding sequences of source symbols,rather
than in-dividual source symbols,in the proof of his first theorem.A key feature of LZW coding is that it
requires no a priori knowledge of the probability of occurrence of the symbols to be encoded.Despite the
fact that until recently it was protected under a United states patent,LZW compression has been integrated
into a vari-ety of mainstream imaging file formats ,including GIF,TIFF,and PDF .The PNG format was
created to get around LZW licensing requirements.
 Consider again the 512×512, 8-bit image from Fig. 8.9(a). Using Adobe Photoshop, an uncompressed TIFF version of this image requires 286,740 bytes of disk space: 262,144 bytes for the 512×512 8-bit pixels plus 24,596 bytes of overhead. Using TIFF's LZW compression option, however, the resulting file is 224,420 bytes. The compression ratio is c = 1.28. Recall that for the Huffman encoded representation of Fig. 8.9(a) in Example 8.4, c = 1.077.
 The additional compression realized by the LZW approach is due to the removal of some of the image's spatial redundancy.

 LZW coding is conceptually very simple (Welch [1984]). At the onset of the coding process, a codebook or dictionary containing the source symbols to be coded is constructed. For 8-bit monochrome images, the first 256 words of the dictionary are assigned to intensities 0, 1, 2, ..., 255. As the encoder sequentially examines image pixels, intensity sequences that are not in the dictionary are placed in algorithmically determined (e.g., the next unused) locations. If the first two pixels of the image are white, for instance, sequence "255-255" might be assigned to location 256, the address following the locations reserved for intensity levels 0 through 255. The next time that two consecutive white pixels are encountered, code word 256, the address of the location containing sequence 255-255, is used to represent them. If a 9-bit, 512-word dictionary is employed in the coding process, the original (8 + 8) bits that were used to represent the two pixels are replaced by a single 9-bit code word. Clearly, the size of the dictionary is an important system parameter. If it is too small, the detection of matching intensity-level sequences will be less likely; if it is too large, the size of the code words will adversely affect compression performance.
 Consider the following 4×4, 8-bit image of a vertical edge:

 Table 8.7 details the steps involved in coding its 16 pixels. A 512-word dictionary with the following starting content is assumed: dictionary locations 0 through 255 contain the intensities 0 through 255, and locations 256 through 511 initially are unused.

 The image is encoded by processing its pixels in a left-to-right, top-to-bottom manner. Each successive intensity value is concatenated with a variable (column 1 of Table 8.7) called the "currently recognized sequence." As can be seen, this variable is initially null or empty. The dictionary is searched for each concatenated sequence and, if found (as was the case in the first row of the table), the currently recognized sequence is replaced by the newly concatenated and recognized (i.e., located in the dictionary) sequence. This was done in column 1 of row 2. No output codes are generated, nor is the dictionary altered. If the concatenated sequence is not found, however, the address of the currently recognized sequence is output as the next encoded value, the concatenated but unrecognized sequence is added to the dictionary, and the currently recognized sequence is initialized to the current pixel value. This occurred in row 2 of the table. The last two columns detail the intensity sequences that are added to the dictionary when scanning the entire 4×4 image. Nine additional code words are defined. At the conclusion of coding, the dictionary contains 265 code words, and the LZW algorithm has successfully identified several repeating intensity sequences, leveraging them to reduce the original 128-bit image to 90 bits (i.e., ten 9-bit codes). The encoded output is obtained by reading the third column from top to bottom. The resulting compression ratio is 1.42:1.
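A minimal sketch of the dictionary mechanism just described is given below; the function name, the tuple-keyed dictionary, and the fixed 512-word limit handling are illustrative assumptions, and a practical encoder would also pack the output codes into a bit stream.

```python
def lzw_encode(pixels, dict_size=512):
    """Sketch of LZW encoding for a sequence of 8-bit intensities."""
    dictionary = {(i,): i for i in range(256)}   # locations 0-255 hold the intensities
    next_code = 256
    current = ()          # the "currently recognized sequence", initially empty
    output = []
    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:
            current = candidate              # keep growing the recognized sequence
        else:
            output.append(dictionary[current])
            if next_code < dict_size:        # add the unrecognized sequence
                dictionary[candidate] = next_code
                next_code += 1
            current = (p,)                   # restart with the current pixel
    if current:
        output.append(dictionary[current])
    return output

# two identical rows of a vertical edge produce reusable sequences
codes = lzw_encode([39, 39, 126, 126, 39, 39, 126, 126])
```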

 A unique feature of the LZW coding just demonstrated is that the coding dictionary or code book is created while the data are being encoded. Remarkably, an LZW decoder builds an identical decompression dictionary as it simultaneously decodes the encoded data stream. It is left as an exercise to the reader (see Problem 8.20) to decode the output of the preceding example and reconstruct the code book. Although not needed in this example, most practical applications require a strategy for handling dictionary overflow. A simple solution is to flush or reinitialize the dictionary when it becomes full and continue coding with a new initialized dictionary. A more complex option is to monitor compression performance and flush the dictionary when it becomes poor or unacceptable. Alternatively, the least used dictionary entries can be tracked and replaced when necessary.

8.2.5 Run-Length Coding
 As was noted in Section 8.1.2, images with repeating intensities along their rows (or columns) can often be compressed by representing runs of identical intensities as run-length pairs, where each run-length pair specifies the start of a new intensity and the number of consecutive pixels that have that intensity. The technique, referred to as run-length encoding (RLE), was developed in the 1950s and became, along with its 2-D extensions, the standard compression approach in facsimile (FAX) coding. Compression is achieved by eliminating a simple form of spatial redundancy: groups of identical intensities. When there are few (or no) runs of identical pixels, run-length encoding results in data expansion.
 The BMP file format uses a run-length encoding in which image data are represented in two different modes, encoded and absolute, and either mode can occur anywhere in the image. In encoded mode, a two-byte RLE representation is used. The first byte specifies the number of consecutive pixels that have the color index contained in the second byte. The 8-bit color index selects the run's intensity (color or gray value) from a table of 256 possible intensities.
 In absolute mode, the first byte is 0 and the second byte signals one of four possible conditions, as shown in Table 8.8. When the second byte is 0 or 1, the end of a line or the end of the image has been reached. If it is 2, the next two bytes contain unsigned horizontal and vertical offsets to a new spatial position (and pixel) in the image. If the second byte is between 3 and 255, it specifies the number of uncompressed pixels that follow, with each subsequent byte containing the color index of one pixel. The total number of bytes must be aligned on a 16-bit word boundary.
 An uncompressed BMP file (saved using Photoshop) of the 512×512 8-bit image shown in Fig. 8.9(a) requires 263,244 bytes of memory. Compressed using BMP's RLE option, the file expands to 267,706 bytes, and the compression ratio is c = 0.98. There are not enough equal intensity runs to make run-length compression effective; a small amount of expansion occurs. For the image in Fig. 8.1(c), however, the BMP RLE option results in a compression ratio c = 1.35.

 Run-length encoding is particularly effective when compressing binary images. Because there are only two possible intensities (black and white), adjacent pixels are more likely to be identical. In addition, each image row can be represented by a sequence of lengths only, rather than the length-intensity pairs used in Example 8.8. The basic idea is to code each contiguous group (i.e., run) of 0s or 1s encountered in a left-to-right scan of a row by its length and to establish a convention for determining the value of the run. The most common conventions are (1) to specify the value of the first run of each row, or (2) to assume that each row begins with a white run, whose run length may in fact be zero.
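The row-wise procedure just described might be sketched as follows; the function name and the default choice of convention (2), in which each row is assumed to begin with a white run of possibly zero length, are illustrative assumptions.

```python
def binary_row_to_runs(row, white=1):
    """Encode one binary image row as run lengths only,
    assuming the row starts with a white run (length may be zero)."""
    runs = []
    expected = white              # the value the next run is assumed to have
    i = 0
    while i < len(row):
        length = 0
        while i < len(row) and row[i] == expected:
            length += 1
            i += 1
        runs.append(length)
        expected = 1 - expected   # runs alternate between white and black
    return runs

# a row that starts with black yields a leading zero-length white run
print(binary_row_to_runs([0, 0, 1, 1, 1, 0, 1]))   # [0, 2, 3, 1, 1]
```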
 Although run-length encoding is in itself an effective method of compressing binary images, additional compression can be achieved by variable-length coding the run lengths themselves. The black and white run lengths can be coded separately using variable-length codes that are specifically tailored to their own statistics. For example, letting symbol a_j represent a black run of length j, we can estimate the probability that symbol a_j was emitted by an imaginary black run-length source by dividing the number of black run lengths

 of length j in the entire image by the total number of black runs. An estimate of the entropy of this black run-length source, denoted H_0, follows by substituting these probabilities into Eq. (8.1-6). A similar argument holds for the entropy of the white runs, denoted H_1. The approximate run-length entropy of the image is then

H_RL = (H_0 + H_1) / (L_0 + L_1)     (8.2-7)

where the variables L_0 and L_1 denote the average values of black and white run lengths, respectively. Equation (8.2-7) provides an estimate of the average number of bits per pixel required to code the run lengths in a binary image using a variable-length code.
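As a sketch of how these quantities might be estimated in practice (the function name and row-wise run extraction are assumptions, and the sketch assumes the image contains at least one run of each color):

```python
import math
from collections import Counter

def run_length_entropy(binary_image):
    """Estimate the approximate run-length entropy of Eq. (8.2-7):
    H_RL = (H_0 + H_1) / (L_0 + L_1), for a list of 0/1 rows."""
    black_runs, white_runs = [], []
    for row in binary_image:
        value, length = row[0], 0
        for pixel in row + [None]:          # sentinel flushes the last run
            if pixel == value:
                length += 1
            else:
                (black_runs if value == 0 else white_runs).append(length)
                value, length = pixel, 1

    def entropy_and_mean(runs):
        total = len(runs)
        probs = [c / total for c in Counter(runs).values()]
        H = -sum(p * math.log2(p) for p in probs)   # run-length source entropy
        return H, sum(runs) / total                 # and average run length

    H0, L0 = entropy_and_mean(black_runs)
    H1, L1 = entropy_and_mean(white_runs)
    return (H0 + H1) / (L0 + L1)                    # bits per pixel estimate
```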
 Two of the oldest and most widely used image compression standards are the CCITT Group 3 and 4 standards for binary image compression. Although they have been used in a variety of computer applications, they were originally designed as facsimile (FAX) coding methods for transmitting documents over telephone networks. The Group 3 standard uses a 1-D run-length coding technique in which the last k-1 lines of each group of k lines (for k = 2 or 4) can be optionally coded in a 2-D manner. The Group 4 standard is a simplified or streamlined version of the Group 3 standard in which only 2-D coding is allowed. Both standards use the same 2-D coding approach, which is two-dimensional in the sense that information from the previous line is used to encode the current line. Both 1-D and 2-D coding are discussed next.

One-dimensional CCITT compression
 In the 1-D CCITT Group 3 compression standard, each line of an image is encoded as a series of variable-length Huffman code words that represent the run lengths of alternating white and black runs in a left-to-right scan of the line. The compression method employed is commonly referred to as Modified Huffman (MH) coding. The code words themselves are of two types, which the standard refers to as terminating codes and makeup codes.
 If run length r is less than 63, a terminating code is used to represent it; the standard specifies different terminating codes for black and white runs. If r > 63, two codes are used: a makeup code for the quotient ⌊r/64⌋ and a terminating code for the remainder r mod 64. Makeup codes are listed in Table A.2 and may or may not depend on the intensity (black or white) of the run being coded. If r < 1792, separate black and white run makeup codes are specified; otherwise, makeup codes are independent of run intensity. The standard requires that each line begin with a white run-length code word, which may in fact be 00110101, the code for a white run of length zero.
 Finally, a unique end-of-line (EOL) code word 000000000001 is used to terminate each line, as well as to signal the first line of each new image. The end of a sequence of images is indicated by six consecutive EOLs.
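The split of a long run into makeup and terminating code words can be sketched as below; the code strings in the two tables are placeholders (except for the white run-length-zero code 00110101 quoted above), so the sketch illustrates only the quotient/remainder mechanics, not the actual CCITT code tables.

```python
# Sketch of Modified Huffman run splitting (placeholder code strings only;
# real terminating/makeup code words come from CCITT Tables A.1 and A.2).
TERMINATING_WHITE = {0: "00110101", 2: "placeholder_t2", 30: "placeholder_t30"}
MAKEUP_WHITE = {64: "placeholder_m64", 128: "placeholder_m128"}

def mh_code_white_run(r):
    """Split a white run of length r into makeup + terminating code words."""
    parts = []
    if r >= 64:
        makeup_length = (r // 64) * 64          # quotient part, coded by a makeup code
        parts.append(MAKEUP_WHITE[makeup_length])
        r = r % 64                              # remainder, coded by a terminating code
    parts.append(TERMINATING_WHITE[r])
    return parts

print(mh_code_white_run(130))   # makeup word for 128 followed by terminating word for 2
```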

Two-dimensional CCITT compression
 The 2-D compression approach adopted for both the CCITT Group 3 and 4 standards is a line-by-line method in which the position of each black-to-white or white-to-black run transition is coded with respect to the position of a reference element a0 that is situated on the current coding line. The previously coded line is called the reference line; the reference line for the first line of each new image is an imaginary white line. The 2-D coding technique that is used is called Relative Element Address Designate (READ) coding. In the Group 3 standard, one or three READ coded lines are allowed between successive MH coded lines, and the technique is called Modified READ (MR) coding. In the Group 4 standard, a greater number of READ coded lines are allowed, and the method is called Modified Modified READ (MMR) coding. As was previously noted, the coding is two-dimensional in the sense that information from the previous line is used to encode the current line. Two-dimensional transforms are not involved.
 Figure 8.14 shows the basic 2-D coding process for a single scan line. Note that the initial steps of the procedure are directed at locating several key changing elements: a0, a1, a2, b1, and b2. A changing element is defined by the standard as a pixel whose value is different from that of the previous pixel on the same line. The most important changing element is a0 (the reference element), which is either set to the location of an imaginary white changing element to the left of the first pixel of each new coding line or determined from the previous coding mode. Coding modes are discussed in the following paragraph. After a0 is located, a1 is identified as the location of the next changing element to the right of a0 on the coding line, b1 as the changing element of value opposite to a0 and to the right of a0 on the reference (or previous) line, and b2 as the next changing element to the right of b1 on the reference line. If any of these changing elements are not detected, they are set to the location of an imaginary pixel to the right of the last pixel on the appropriate line. Figure 8.15 provides two illustrations of the general relationships between the various changing elements.

Two-dimensional CCITT compression - 2
 After identification of the current reference element and associated changing elements, two simple tests are performed to select one of three possible coding modes: pass mode, vertical mode, or horizontal mode. The initial test, which corresponds to the first branch point in the flowchart in Fig. 8.14, compares the location of b2 to that of a1. The second test, which corresponds to the second branch point in Fig. 8.14, computes the distance (in pixels) between the locations of a1 and b1 and compares it against 3. Depending on the outcome of these tests, one of the three outlined coding blocks of Fig. 8.14 is entered and the appropriate coding procedure is executed. A new reference element is then established, as per the flowchart, in preparation for the next coding iteration.
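A sketch of these two tests is given below, under the assumption that changing-element positions are passed in as column indices (with undetected elements set past the end of the line); the function name and the use of |a1 - b1| <= 3 for the vertical-mode test follow the comparison against 3 described above.

```python
def select_coding_mode(a1, b1, b2):
    """Sketch of the CCITT 2-D mode decision for one coding step.
    a1: next changing element on the coding line;
    b1, b2: first and second qualifying changing elements on the reference line."""
    if b2 < a1:                 # first test: b2 lies to the left of a1 -> pass mode
        return "pass"
    if abs(a1 - b1) <= 3:       # second test: a1 and b1 are close -> vertical mode
        return "vertical"
    return "horizontal"

# example: b2 to the right of a1 and a1 two pixels left of b1 selects vertical mode,
# matching the situation described for Fig. 8.15(b)
print(select_coding_mode(a1=7, b1=9, b2=12))   # "vertical"
```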
 Table 8.9 defines the specific codes utilized for each of the three possible coding modes. In pass mode, which specifically excludes the case in which b2 is directly above a1, only the pass mode code word 0001 is needed. As Fig. 8.15(a) shows, this mode identifies white or black reference line runs that do not overlap the current white or black coding line runs.
 In horizontal coding mode, the distances from a0 to a1 and from a1 to a2 must be coded in accordance with the termination and makeup codes of Tables A.1 and A.2 of Appendix A and then appended to the horizontal mode code word 001. This is indicated in Table 8.9 by the notation 001 + M(a0a1) + M(a1a2), where a0a1 and a1a2 denote the distances from a0 to a1 and from a1 to a2, respectively. Finally, in vertical coding mode, one of six special variable-length codes is assigned to the distance between a1 and b1. Figure 8.15(b) illustrates the parameters involved in both horizontal and vertical mode coding. The extension mode code word at the bottom of Table 8.9 is used to enter an optional facsimile coding mode. For example, the 0000001111 code is used to initiate an uncompressed mode of transmission.
 Although Fig. 8.15(b) is annotated with the parameters for both horizontal and vertical mode coding (to facilitate the discussion above), the depicted pattern of black and white pixels is a case for vertical mode coding. That is, because b2 is to the right of a1, the first (or pass mode) test in Fig. 8.14 fails. The second test, which determines whether the vertical or horizontal coding mode is entered, indicates that vertical mode coding should be used, because the distance from a1 to b1 is less than 3. In accordance with Table 8.9, the appropriate code word is 000010, implying that a1 is two pixels to the left of b1. In preparation for the next coding iteration, a0 is moved to the location of a1.

 Figure 8.16(a) is a 300 dpi scan of a 7 × 9.25 inch book page displayed at about 1/3 scale. Note that about half of the page contains text, around 9% is occupied by a halftone image, and the rest is white space. A section of the page is enlarged in Fig. 8.16(b). Keep in mind that we are dealing with a binary image; the illusion of gray tones is created, as was described in Section 4.5.4, by the halftoning process used in printing. If the binary pixels of the image in Fig. 8.16(a) are stored in groups of 8 pixels per byte, the 1952 × 2697 bit scanned image, commonly called a document, requires 658,068 bytes. An uncompressed PDF file of the document (created in Photoshop) requires 663,445 bytes. CCITT Group 3 compression reduces the file to 123,497 bytes, resulting in a compression ratio c = 5.37; CCITT Group 4 compression reduces the file to 110,456 bytes, increasing the compression ratio to about 6.
8.2.6 Symbol-Based Coding
 In symbol- or token-based coding, an image is represented as a collection of frequently occurring sub-images, called symbols. Each such symbol is stored in a symbol dictionary, and the image is coded as a set of triplets {(x1, y1, t1), (x2, y2, t2), ...}, where each (xi, yi) pair specifies the location of a symbol in the image and token ti is the address of the symbol or sub-image in the dictionary. That is, each triplet represents an instance of a dictionary symbol in the image. Storing repeated symbols only once can compress images significantly, particularly in document storage and retrieval applications, where the symbols are often character bitmaps that are repeated many times.
8.2.6 Symbol-Based Coding -2
Consider the simple bilevel image in Fig. 8.17(a). It contains the single word, banana, which is composed of three unique symbols: a b, three a's, and two n's. Assuming that the b is the first symbol identified in the coding process, its 9×7 bitmap is stored in location 0 of the symbol dictionary. As Fig. 8.17(b) shows, the token identifying the b bitmap is 0. Thus, the first triplet in the encoded image's representation [see Fig. 8.17(c)] is (0, 2, 0), indicating that the upper-left corner (an arbitrary convention) of the rectangular bitmap representing the b symbol is to be placed at location (0, 2) in the decoded image. After the bitmaps for the a and n symbols have been identified and added to the dictionary, the remainder of the image can be encoded with five additional triplets. As long as the six triplets required to locate the symbols in the image, together with the three bitmaps required to define them, are smaller than the original image, compression occurs. In this case, the starting image has 9 × 51 × 1 or 459 bits and, assuming that each triplet is composed of 3 bytes, the compressed representation has (6 × 3 × 8) + [(9 × 7) + (6 × 7) + (6 × 6)] or 285 bits; the resulting compression ratio is c = 1.61. To decode the symbol-based representation in Fig. 8.17(c), you simply read the bitmaps of the symbols specified in the triplets from the symbol dictionary and place them at the spatial coordinates specified in each triplet.
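Decoding the triplet representation amounts to pasting dictionary bitmaps at the listed coordinates. The sketch below assumes NumPy arrays, that tokens index a Python dictionary of bitmaps, and that each (x, y) pair gives the upper-left corner as row and column.

```python
import numpy as np

def decode_symbol_based(triplets, dictionary, image_shape):
    """Place each referenced symbol bitmap at its (x, y) position.
    triplets: iterable of (x, y, token); dictionary: token -> 2-D 0/1 array."""
    image = np.zeros(image_shape, dtype=np.uint8)
    for x, y, token in triplets:
        bitmap = dictionary[token]
        h, w = bitmap.shape
        image[x:x + h, y:y + w] |= bitmap   # paste the symbol's bitmap
    return image

# hypothetical miniature example: one 2x2 "symbol" placed twice on a 4x8 canvas
symbol = np.array([[1, 0], [1, 1]], dtype=np.uint8)
decoded = decode_symbol_based([(0, 2, 0), (2, 5, 0)], {0: symbol}, (4, 8))
```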
8.2.6 Symbol-Based Coding -3
 Symbol-based compression was proposed in the early 1970s (Ascher and Nagy [1974]), but has become practical only recently. Advances in symbol matching algorithms (see Chapter 12) and increased CPU processing speeds have made it possible both to select dictionary symbols and to find where they occur in an image in a timely manner. And like many other compression methods, symbol-based decoding is significantly faster than encoding. Finally, we note that both the symbol bitmaps that are stored in the dictionary and the triplets used to reference them can themselves be encoded to further improve compression performance. If, as in Fig. 8.17, only exact symbol matches are allowed, the resulting compression is lossless; if small differences are permitted, some level of reconstruction error will be present.

JBIG2 compression
 JBIG2 is an international standard for bilevel image compression. By segmenting an image into overlapping and/or non-overlapping regions of text, halftone, and generic content, compression techniques that are specifically optimized for each type of content are employed:

 Text regions are composed of characters that are ideally suited for a symbol-based coding approach. Typically, each symbol will correspond to a character bitmap, a subimage representing a character of text. There is normally only one character bitmap (or subimage) in the symbol dictionary for each upper- and lowercase character of the font being used. For example, there would be one "a" bitmap in the dictionary, one "A" bitmap, one "b" bitmap, and so on.
 In lossy JBIG2 compression, often called perceptually lossless or visually lossless, we neglect differences between dictionary bitmaps (i.e., the reference character bitmaps or character templates) and specific instances of the corresponding characters in the image. In lossless compression, the differences are stored and used in conjunction with the triplets encoding each character (by the decoder) to produce the actual image bitmaps. All bitmaps are encoded either arithmetically or using MMR (see Section 8.2.5); the triplets used to access dictionary entries are either arithmetically or Huffman encoded.
 Halftone regions are similar to text regions in that they are composed of patterns arranged in a regular grid. The symbols that are stored in the dictionary, however, are not character bitmaps but periodic patterns that represent intensities (e.g., of a photograph) that have been dithered to produce bilevel images for printing.
 Generic regions contain non-text, non-halftone information, like line art and noise, and are compressed using either arithmetic or MMR coding.


JBIG2 compression - 2
 As is true of many image compression standards, JBIG2 defines decoder behavior. It does not explicitly define a standard encoder, but is flexible enough to allow various encoder designs. Although the design of the encoder is left unspecified, it is nevertheless important, because it determines the level of compression that is achieved. After all, the encoder must segment the image into regions, choose the text and halftone symbols that are stored in the dictionaries, and decide when those symbols are essentially the same as, or different from, potential instances of the symbols in the image. The decoder simply uses that information to recreate the original image.
 Consider again the bilevel image in Fig. 8.16(a). Figure 8.18(a) shows a reconstructed section of the image after lossless JBIG2 encoding (by a commercially available document compression application). It is an exact replica of the original image. Note that the ds in the reconstructed text vary slightly, despite the fact that they were generated from the same d entry in the dictionary. The differences between that d and the ds in the image were used to refine the output of the dictionary. The standard defines an algorithm for accomplishing this during the decoding of the encoded dictionary bitmaps. For the purposes of our discussion, you can think of it as adding the difference between a dictionary bitmap and a specific instance of the corresponding character in the image to the bitmap read from the dictionary.

JBIG2 compression - 3
 Figure 8.18(b) is another reconstruction of the area in (a) after perceptually lossless JBIG2 compression. Note that the ds in this figure are identical. They have been copied directly from the symbol dictionary. The reconstruction is called perceptually lossless because the text is readable and the font is even the same. The small differences, shown in Fig. 8.18(c), between the ds in the original image and the d in the dictionary are considered unimportant because they do not affect readability. Remember that we are dealing with bilevel images, so there are only three intensities in Fig. 8.18(c). Intensity 128 indicates areas where there is no difference between the corresponding pixels of the images in Figs. 8.18(a) and (b); intensities 0 (black) and 255 (white) indicate pixels of opposite intensities in the two images, for example, a black pixel in one image that is white in the other, and vice versa.
 The lossless JBIG2 compression that was used to generate Fig. 8.18(a) reduces the original 663,445 byte uncompressed PDF image to 32,705 bytes; the compression ratio is c = 20.3. Perceptually lossless JBIG2 compression reduces the image to 23,913 bytes, increasing the compression ratio to about 27.7. These compressions are 4 to 5 times greater than the CCITT Group 3 and 4 results from Example 8.10.
8.2.7 Bit-Plane Coding
 The run-length and symbol-based techniques of the previous sections can be applied to images with more than two intensities by processing their bit planes individually. The technique, called bit-plane coding, is based on the concept of decomposing a multilevel (monochrome or color) image into a series of binary images (see Section 3.2.4) and compressing each binary image via one of several well-known binary compression methods. In this section, we describe the two most popular decomposition approaches.
 The intensities of an m-bit monochrome image can be represented in the form of the base-2 polynomial

a_{m-1} 2^{m-1} + a_{m-2} 2^{m-2} + ... + a_1 2^1 + a_0 2^0     (8.2-8)
 Based on this property, a simple method of decomposing the image into a collection of binary images is to separate the m coefficients of the polynomial into m 1-bit bit planes. As noted in Section 3.2.4, the lowest order bit plane (the plane corresponding to the least significant bit) is generated by collecting the a_0 bits of each pixel, while the highest order bit plane contains the a_{m-1} bits or coefficients. In general, each bit plane is constructed by setting its pixels equal to the values of the appropriate bits or polynomial coefficients from each pixel in the original image. The inherent disadvantage of this decomposition approach is that small changes in intensity can have a significant impact on the complexity of the bit planes. If a pixel of intensity 127 (01111111) is adjacent to a pixel of intensity 128 (10000000), for instance, every bit plane will contain a corresponding 0 to 1 (or 1 to 0) transition. For example, because the most significant bits of the binary codes for 127 and 128 are different, the highest bit plane will contain a zero-valued pixel next to a pixel of value 1, creating a 0 to 1 (or 1 to 0) transition at that point.
8.2.7 Bit-Plane Coding - 2
 An alternative decomposition approach (which reduces the effect of small intensity variations) is to first represent the image by an m-bit Gray code. The m-bit Gray code g_{m-1} ... g_2 g_1 g_0 that corresponds to the polynomial in Eq. (8.2-8) can be computed from

g_i = a_i ⊕ a_{i+1},   0 ≤ i ≤ m-2
g_{m-1} = a_{m-1}     (8.2-9)

 Here, ⊕ denotes the exclusive OR operation. This code has the unique property that successive code words differ in only one bit position. Thus, small changes in intensity are less likely to affect all m bit planes. For instance, when intensity levels 127 and 128 are adjacent, only the highest order bit plane will contain a 0 to 1 transition, because the Gray codes that correspond to 127 and 128 are 01000000 and 11000000, respectively.
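Because Eq. (8.2-9) reduces to an exclusive OR of the intensity with a one-bit right shift of itself, Gray coding and bit-plane extraction are easy to sketch; the function names below are illustrative.

```python
def to_gray(a):
    """m-bit Gray code of intensity a: g = a XOR (a >> 1), per Eq. (8.2-9)."""
    return a ^ (a >> 1)

def bit_plane(value, i):
    """Value (0 or 1) of a pixel in the i-th bit plane."""
    return (value >> i) & 1

# intensities 127 and 128 differ in every binary bit plane,
# but their Gray codes 01000000 and 11000000 differ only in the highest plane
for v in (127, 128):
    print(format(v, "08b"), format(to_gray(v), "08b"))
```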
 Figures 8.19 and 8.20 show the eight binary and Gray-coded bit planes of the 8-bit monochrome image of the child in Fig. 8.19(a). Note that the high-order bit planes are far less complex than their low-order counterparts; that is, they contain large uniform areas of significantly less detail, busyness, or randomness. In addition, the Gray-coded bit planes are less complex than the corresponding binary bit planes. Both observations are reflected in the JBIG2 coding results of Table 8.10.
8.2.7 Bit-Plane Coding - 3
 Note, for instance, that the a5 and g5 results are significantly larger than the a6 and g6 compressions, and that both g5 and g6 are smaller than their a5 and a6 counterparts. This trend continues throughout the table, with a single exception. Gray coding provides a compression advantage of about 1.06:1 on average. Combined together, the Gray-coded files compress the original monochrome image by 678,676/475,964 or 1.43:1; the non-Gray-coded files compress the image by 678,676/503,916 or 1.35:1.
 Finally, we note that the two least significant bit planes in Fig. 8.20 have little apparent structure. Because this is typical of most 8-bit monochrome images, bit-plane coding is usually restricted to images of 6 bits/pixel or less. JBIG1, the predecessor to JBIG2, imposes such a limit.

8.2.8 Block Transform Coding
 In this section, we consider a compression technique that divides an image into small non-overlapping blocks of equal size (e.g., 8×8) and processes the blocks independently using a 2-D transform. In block transform coding, a reversible, linear transform (such as the Fourier transform) is used to map each block or subimage into a set of transform coefficients, which are then quantized and coded. For most images, a significant number of the coefficients have small magnitudes and can be coarsely quantized (or discarded entirely) with little image distortion. A variety of transformations, including the discrete Fourier transform (DFT) of Chapter 4, can be used to transform the image data.
 Figure 8.21 shows a typical block transform coding system. The decoder implements the inverse sequence of steps (with the exception of the quantization function) of the encoder, which performs four relatively straightforward operations: subimage decomposition, transformation, quantization, and coding. An M×N input image is subdivided first into subimages of size n×n, which are then transformed to generate MN/n² subimage transform arrays, each of size n×n. The goal of the transformation process is to decorrelate the pixels of each subimage, or to pack as much information as possible into the smallest number of transform coefficients. The quantization stage then selectively eliminates or more coarsely quantizes the coefficients that carry the least amount of information in a predefined sense (several methods are discussed later in the section). These coefficients have the smallest impact on reconstructed subimage quality. The encoding process terminates by coding (normally using a variable-length code) the quantized coefficients. Any or all of the transform encoding steps can be adapted to local image content, called adaptive transform coding, or fixed for all subimages, called nonadaptive transform coding.

Transform selection
 Block transform coding systems based on a variety of discrete 2-D transforms have been constructed and/or studied extensively. The choice of a particular transform in a given application depends on the amount of reconstruction error that can be tolerated and the computational resources available. Compression is achieved during the quantization of the transformed coefficients (not during the transformation step).
 With reference to the discussion in Section 2.6.7, consider a subimage g(x,y) of size n×n whose forward, discrete transform, T(u,v), can be expressed in terms of the general relation

T(u,v) = Σ_{x=0}^{n-1} Σ_{y=0}^{n-1} g(x,y) r(x,y,u,v)     (8.2-10)

for u, v = 0, 1, 2, ..., n-1. Given T(u,v), g(x,y) similarly can be obtained using the generalized inverse discrete transform

g(x,y) = Σ_{u=0}^{n-1} Σ_{v=0}^{n-1} T(u,v) s(x,y,u,v)     (8.2-11)

for x, y = 0, 1, 2, ..., n-1. In these equations, r(x,y,u,v) and s(x,y,u,v) are called the forward and inverse transformation kernels, respectively. For reasons that will become clear later in the section, they also are referred to as basis functions or basis images. The T(u,v) for u, v = 0, 1, 2, ..., n-1 in Eq. (8.2-10) are called transform coefficients; they can be viewed as the expansion coefficients (see Section 7.2.1) of a series expansion of g(x,y) with respect to basis functions s(x,y,u,v).
As explained in Section 2.6.7, the kernel in Eq. (8.2-10) is separable if

r(x,y,u,v) = r1(x,u) r2(y,v)     (8.2-12)
Transform selection - 2
In addition, the kernel is symmetric if r1 is functionally equal to r2. In this case, Eq. (8.2-12) can be expressed in the form

r(x,y,u,v) = r1(x,u) r1(y,v)     (8.2-13)

 Identical comments apply to the inverse kernel if r(x,y,u,v) is replaced by s(x,y,u,v) in Eqs. (8.2-12) and (8.2-13). It is not difficult to show that a 2-D transform with a separable kernel can be computed using row-column or column-row passes of the corresponding 1-D transform, in the manner explained in Section 4.11.1.
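For a symmetric, separable kernel this row-column computation can be written as two matrix products, T = A G A^T, where row u of A holds the 1-D kernel values r1(x,u). The sketch below assumes the kernel matrix is orthonormal, so that its transpose is also its inverse; the function names are illustrative.

```python
import numpy as np

def transform_2d(block, A):
    """Row-column computation of a separable, symmetric 2-D transform:
    T = A @ G @ A.T, where A[u, x] holds the 1-D kernel r1(x, u)."""
    return A @ block @ A.T

def inverse_transform_2d(coeffs, A):
    """Inverse transform, assuming A is orthonormal (A inverse = A transpose)."""
    return A.T @ coeffs @ A

# assumed usage: A could be an n x n DCT or Walsh-Hadamard kernel matrix
```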
 The forward and inverse transformation kernels in Eqs. (8.2-10) and (8.2-11) determine the type of transform that is computed and the overall computational complexity and reconstruction error of the block transform coding system in which they are employed. The best known transformation kernel pair is

r(x,y,u,v) = e^{-j2π(ux+vy)/n}     (8.2-14)

and

s(x,y,u,v) = (1/n²) e^{j2π(ux+vy)/n}     (8.2-15)

 where j = √(-1). These are the transformation kernels defined in Eqs. (2.6-34) and (2.6-35) of Chapter 2 with M = N = n. Substituting these kernels into Eqs. (8.2-10) and (8.2-11) yields a simplified version of the discrete Fourier transform pair introduced in Section 4.5.5.
 A computationally simpler transformation that is also useful in transform coding, called the Walsh-Hadamard transform (WHT), is derived from the functionally identical kernels

r(x,y,u,v) = s(x,y,u,v) = (1/n) (-1)^{Σ_{i=0}^{m-1} [b_i(x) p_i(u) + b_i(y) p_i(v)]}     (8.2-16)
Transform selection - 3
 where n = 2^m. The summation in the exponent of this expression is performed in modulo 2 arithmetic, and b_k(z) is the kth bit (from right to left) in the binary representation of z. If m = 3 and z = 6 (110 in binary), for example, b_0(z) = 0, b_1(z) = 1, and b_2(z) = 1. The p_i(u) in Eq. (8.2-16) are computed using

p_0(u) = b_{m-1}(u)
p_1(u) = b_{m-1}(u) + b_{m-2}(u)
p_2(u) = b_{m-2}(u) + b_{m-3}(u)
...
p_{m-1}(u) = b_1(u) + b_0(u)

 where the sums, as noted previously, are performed in modulo 2 arithmetic.
 Similar expressions apply to p_i(v).
 Unlike the kernels of the DFT, which are sums of sines and cosines [see Eqs. (8.2-14) and (8.2-15)], the Walsh-Hadamard kernels consist of alternating plus and minus 1s arranged in a checkerboard pattern. Figure 8.22 shows the kernel for n = 4. Each block consists of 4 × 4 = 16 elements (subsquares).

Transform selection - 4
 White denotes +1 and black denotes -1. To obtain the top left block, we let u = v = 0 and plot values of r(x,y,0,0) for x, y = 0, 1, 2, 3. All values in this case are +1. The second block on the top row is a plot of values of r(x,y,0,1) for x, y = 0, 1, 2, 3, and so on. As already noted, the importance of the Walsh-Hadamard transform is its simplicity of implementation: all kernel values are +1 or -1.

 One of the transformations used most frequently for image compression is the discrete cosine transform (DCT). It is obtained by substituting the following (equal) kernels into Eqs. (8.2-10) and (8.2-11):

r(x,y,u,v) = s(x,y,u,v) = α(u) α(v) cos[(2x+1)uπ / 2n] cos[(2y+1)vπ / 2n]

 where

α(u) = √(1/n) for u = 0, and α(u) = √(2/n) for u = 1, 2, ..., n-1

 and similarly for α(v). Figure 8.23 shows r(x,y,u,v) for the case n = 4. The computation follows the same format as explained for Fig. 8.22, with the difference that the values of r are not integers. In Fig. 8.23, the lighter intensity values correspond to larger values of r.
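The DCT kernel translates directly into an n×n transform matrix. The following sketch (the function name is an assumption) builds that matrix with NumPy and applies it to one 8×8 block; because the matrix is orthonormal, its transpose inverts the transform.

```python
import numpy as np

def dct_matrix(n=8):
    """Build the n x n DCT kernel matrix C with C[u, x] = a(u) cos[(2x+1)u*pi/(2n)]."""
    C = np.zeros((n, n))
    for u in range(n):
        alpha = np.sqrt(1.0 / n) if u == 0 else np.sqrt(2.0 / n)
        for x in range(n):
            C[u, x] = alpha * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    return C

C = dct_matrix(8)
block = np.random.randint(0, 256, (8, 8)).astype(float)
T = C @ block @ C.T          # forward 2-D DCT of the subimage
restored = C.T @ T @ C       # inverse transform recovers the block
```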

Transform selection - 5
 Figures 8.24(a) through (c) show three approximations of the 512×512 monochrome image in Fig. 8.9(a). These pictures were obtained by dividing the original image into subimages of size 8×8, representing each subimage using one of the transforms just described (i.e., the DFT, WHT, or DCT), truncating 50% of the resulting coefficients, and taking the inverse transform of the truncated coefficient arrays.

Transform selection - 6
In each case, the 32 retained coefficients were selected on the basis of maximum magnitude. Note that in all cases, the 32 discarded coefficients had little visual impact on the quality of the reconstructed image. Their elimination, however, was accompanied by some mean-square error, which can be seen in the scaled error images of Figs. 8.24(d) through (f). The actual rms errors were 2.32, 1.78, and 1.13 intensities, respectively.

Transform selection - 7
The small differences in mean-square reconstruction error noted in the preceding example are related directly to the energy or information packing properties of the transforms employed. In accordance with Eq. (8.2-11), an n×n subimage g(x,y) can be expressed as a function of its 2-D transform T(u,v):

g(x,y) = Σ_{u=0}^{n-1} Σ_{v=0}^{n-1} T(u,v) s(x,y,u,v)     (8.2-20)

for x, y = 0, 1, 2, ..., n-1. Because the inverse kernel s(x,y,u,v) in Eq. (8.2-20) depends only on the indices x, y, u, v, and not on the values of g(x,y) or T(u,v), it can be viewed as defining a set of basis functions or basis images for the series defined by Eq. (8.2-20). This interpretation becomes clearer if the notation used in Eq. (8.2-20) is modified to obtain

G = Σ_{u=0}^{n-1} Σ_{v=0}^{n-1} T(u,v) S_uv     (8.2-21)

 where G is an n×n matrix containing the pixels of g(x,y) and S_uv is the n×n matrix whose (x,y) element is s(x,y,u,v), for x, y = 0, 1, 2, ..., n-1.     (8.2-22)
Transform selection - 8
 Then G, the matrix containing the pixels of the input subimage, is explicitly defined as a linear combination of n×n matrices; that is, the S_uv for u, v = 0, 1, 2, ..., n-1 in Eq. (8.2-22). These matrices in fact are the basis images (or functions) of the series expansion in Eq. (8.2-20); the associated T(u,v) are the expansion coefficients. Figures 8.22 and 8.23 illustrate graphically the WHT and DCT basis images for the case n = 4.
 If we now define a transform coefficient masking function

χ(u,v) = 0 if T(u,v) satisfies a specified truncation criterion, and χ(u,v) = 1 otherwise     (8.2-23)

 for u, v = 0, 1, 2, ..., n-1, an approximation of G can be obtained from the truncated expansion

Ĝ = Σ_{u=0}^{n-1} Σ_{v=0}^{n-1} χ(u,v) T(u,v) S_uv     (8.2-24)

where χ(u,v) is constructed to eliminate the basis images that make the smallest contribution to the total sum in Eq. (8.2-21). The mean-square error between subimage G and approximation Ĝ then is

e_ms = E{ ||G - Ĝ||² } = E{ || Σ_{u=0}^{n-1} Σ_{v=0}^{n-1} T(u,v) S_uv [1 - χ(u,v)] ||² } = Σ_{u=0}^{n-1} Σ_{v=0}^{n-1} σ²_{T(u,v)} [1 - χ(u,v)]     (8.2-25)
Transform selection - 9
 where ||G - Ĝ|| is the norm of matrix (G - Ĝ) and σ²_{T(u,v)} is the variance of the coefficient at transform location (u,v). The final simplification is based on the orthonormal nature of the basis images and the assumption that the pixels of G are generated by a random process with zero mean and known covariance. The total mean-square approximation error thus is the sum of the variances of the discarded transform coefficients; that is, the coefficients for which χ(u,v) = 0, so that [1 - χ(u,v)] in Eq. (8.2-25) is 1. Transformations that redistribute or pack the most information into the fewest coefficients provide the best subimage approximations and, consequently, the smallest reconstruction errors. Finally, under the assumptions that led to Eq. (8.2-25), the mean-square errors of the MN/n² subimages of an M×N image are identical. Thus the mean-square error (being a measure of average error) of the M×N image equals the mean-square error of a single subimage.
 The earlier example showed that the information packing ability of the DCT is superior to that of the DFT and WHT. Although this condition usually holds for most images, the Karhunen-Loève transform (see Chapter 11), not the DCT, is the optimal transform in an information packing sense. This is due to the fact that the KLT minimizes the mean-square error in Eq. (8.2-25) for any input image and any number of retained coefficients (Kramer and Mathews [1956]). However, because the KLT is data dependent, obtaining the KLT basis images for each subimage, in general, is a nontrivial computational task.
 For this reason, the KLT is used infrequently in practice for image compression.
 Instead, a transform, such as the DFT, WHT, or DCT, whose basis images are fixed (input independent), normally is used. Of the possible input independent transforms, the nonsinusoidal transforms (such as the WHT) are the simplest to implement. The sinusoidal transforms (such as the DFT or DCT) more closely approximate the information packing ability of the optimal KLT.

Transform selection - 10
Hence, most transform coding systems are based on the DCT, which provides a good compromise between information packing ability and computational complexity. In fact, the properties of the DCT have proved to be of such practical value that the DCT has become an international standard for transform coding systems. Compared to the other input independent transforms, it has the advantages of having been implemented in a single integrated circuit, packing the most information into the fewest coefficients (for most images), and minimizing the block-like appearance, called blocking artifact, that results when the boundaries between subimages become visible. This last property is particularly important in comparisons with the other sinusoidal transforms. As Fig. 8.25(a) shows, the implicit n-point periodicity (see Section 4.6.3) of the DFT gives rise to boundary discontinuities that result in substantial high-frequency transform content. When the DFT transform coefficients are truncated or quantized, the Gibbs phenomenon causes the boundary points to take on erroneous values, which appear in an image as blocking artifact. That is, the boundaries between adjacent subimages become visible because the boundary pixels of the subimages assume the mean values of discontinuities formed at the boundary points [see Fig. 8.25(a)]. The DCT of Fig. 8.25(b) reduces this effect, because its implicit 2n-point periodicity does not inherently produce boundary discontinuities.
Subimage size Selection
 Another significant factor affecting transform coding error and computational complexity is subimage size. In most applications, images are subdivided so that the correlation (redundancy) between adjacent subimages is reduced to some acceptable level and so that n is an integer power of 2, where, as before, n is the subimage dimension. The latter condition simplifies the computation of the subimage transforms (see the base-2 successive doubling method discussed in Section 4.11.3). In general, both the level of compression and computational complexity increase as the subimage size increases. The most popular subimage sizes are 8×8 and 16×16.
 Figure 8.26 illustrates graphically the impact of subimage size on transform coding reconstruction error. The data plotted were obtained by dividing the monochrome image of Fig. 8.9(a) into subimages of size n×n for n = 2, 4, 8, 16, ..., 256, 512, computing the transform of each subimage, truncating 75% of the resulting coefficients, and taking the inverse transform of the truncated arrays. Note that the Hadamard and cosine curves flatten as the size of the subimage becomes greater than 8×8, whereas the Fourier reconstruction
Subimage size Selection - 2
 error continues to decrease in this region. As n further increases, the Fourier reconstruction error crosses the Walsh-Hadamard curve and approaches the cosine result. This result is consistent with the theoretical and experimental findings reported by Netravali and Limb [1980] and by Pratt [1991] for a 2-D Markov image source.
 All three curves intersect when 2×2 subimages are used. In this case, only one of the four coefficients (25%) of each transformed array was retained. The coefficient in all cases was the dc component, so the inverse transform simply replaced the four subimage pixels by their average value [see Eq. (4.6-21)]. This condition is evident in Fig. 8.27(b), which shows a zoomed portion of the 2×2 DCT result. Note that the blocking artifact that is prevalent in this result decreases as the subimage size increases to 4×4 and 8×8 in Figs. 8.27(c) and (d). Figure 8.27(a) shows a zoomed portion of the original image for reference.
Bit Allocation
 The reconstruction error associated with the truncated series expansion of Eq. (8.2-24) is a function of the number and relative importance of the transform coefficients that are discarded, as well as the precision that is used to represent the retained coefficients. In most transform coding systems, the retained coefficients are selected [that is, the masking function of Eq. (8.2-23) is constructed] on the basis of maximum variance, called zonal coding, or on the basis of maximum magnitude, called threshold coding. The overall process of truncating, quantizing, and coding the coefficients of a transformed subimage is commonly called bit allocation.

Bit Allocation - 2
Figures 8.28(a) and (c) show two approximations of Fig. 8.9(a) in which 87.5% of the DCT coefficients of each 8×8 subimage were discarded. The first result was obtained via threshold coding by keeping the eight largest transform coefficients, and the second image was generated by using a zonal coding approach. In the latter case, each DCT coefficient was considered a random variable whose distribution could be computed over the ensemble of all transformed subimages. The 8 distributions of largest variance (12.5% of the 64 coefficients in the transformed 8×8 subimage) were located and used to determine the coordinates, u and v, of the coefficients, T(u,v), that were retained for all subimages. Note that the threshold coding difference image of Fig. 8.28(b) contains less error than the zonal coding result in Fig. 8.28(d). Both images have been scaled to make the errors more visible. The corresponding rms errors are 4.5 and 6.5 intensities, respectively.

Bit Allocation - 3
 Zonal coding implementation. Zonal coding is based on the information theory concept of viewing information as uncertainty. Therefore, the transform coefficients of maximum variance carry the most image information and should be retained in the coding process. The variances themselves can be calculated directly from the ensemble of MN/n² transformed subimage arrays, as in the preceding example, or based on an assumed image model (say, a Markov autocorrelation function). In either case, the zonal sampling process can be viewed, in accordance with Eq. (8.2-24), as multiplying each T(u,v) by the corresponding element in a zonal mask, which is constructed by placing a 1 in the locations of maximum variance and a 0 in all other locations. Coefficients of maximum variance usually are located around the origin of an image transform, resulting in the typical zonal mask shown in Fig. 8.29(a).
 The coefficients retained during the zonal sampling process must be quantized and coded, so zonal masks are sometimes depicted showing the number of bits used to code each coefficient [Fig. 8.29(b)]. In most cases, the coefficients are allocated the same number of bits, or some fixed number of bits is distributed among them unequally. In the first case, the coefficients generally are normalized by their standard deviations and uniformly quantized. In the second case, a quantizer, such as an optimal Lloyd-Max quantizer (see optimal quantizers in Section 8.2.9), is designed for each coefficient. To construct the required quantizers, the zeroth or dc coefficient normally is modeled by a Rayleigh density function, whereas the remaining coefficients are modeled by a Laplacian or Gaussian density. The number of quantization levels (and thus the number of bits) allotted to each quantizer is made proportional to log_2 σ²_{T(u,v)}. Thus the retained coefficients in Eq. (8.2-24), which (in the context of the current discussion) are selected on the basis of maximum variance, are assigned bits in proportion to the coefficient variances.
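A sketch of how a zonal mask might be derived from an ensemble of transformed subimages follows; the function name, the stacking of blocks into a single array, and the fixed retention fraction (12.5%, as in the example above) are assumptions, and ties at the variance threshold may retain a few extra coefficients.

```python
import numpy as np

def zonal_mask(transformed_blocks, keep_fraction=0.125):
    """Build a zonal mask from coefficient variances over an ensemble of
    n x n transformed subimages: 1 at the highest-variance locations, 0 elsewhere."""
    stack = np.stack(transformed_blocks)            # shape (num_blocks, n, n)
    variances = stack.var(axis=0)                   # variance of each T(u, v)
    n_keep = int(round(keep_fraction * variances.size))
    threshold = np.sort(variances.ravel())[-n_keep] # n_keep-th largest variance
    return (variances >= threshold).astype(np.uint8)
```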


Bit Allocation - 4
[Figure 8.29: typical zonal mask, zonal bit allocation, threshold mask, and zigzag coefficient ordering sequence]
Threshold Coding implementation
Threshold coding implementation. Zonal coding usually is implemented by using a single fixed mask for all subimages. Threshold coding, however, is inherently adaptive in the sense that the locations of the transform coefficients retained for each subimage vary from one subimage to another. In fact, threshold coding is the adaptive transform coding approach most often used in practice because of its computational simplicity. The underlying concept is that, for any subimage, the transform coefficients of largest magnitude make the most significant contribution to reconstructed subimage quality, as demonstrated in the last example. Because the locations of the maximum coefficients vary from one subimage to another, the elements of χ(u,v)T(u,v) normally are reordered (in a predefined manner) to form a 1-D, run-length coded sequence. Figure 8.29(c) shows a typical threshold mask for one subimage of a hypothetical image. This mask provides a convenient way to visualize the threshold coding process for the corresponding subimage, as well as to mathematically describe the process using Eq. (8.2-24). When the mask is applied [via Eq. (8.2-24)] to the subimage for which it was derived, and the resulting n×n array is reordered to form an n²-element coefficient sequence in accordance with the zigzag ordering pattern of Fig. 8.29(d), the reordered 1-D sequence contains several long runs of 0s [the zigzag pattern becomes evident by starting at 0 in Fig. 8.29(d) and following the numbers in sequence]. These runs normally are run-length coded. The nonzero or retained coefficients, corresponding to the mask locations that contain a 1, are represented using a variable-length code.
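The zigzag ordering of Fig. 8.29(d) can be generated rather than stored. The sketch below reproduces the usual pattern by sorting coordinates along anti-diagonals and alternating the traversal direction; this is a common construction, not something specified in the text, and the function name is an assumption.

```python
def zigzag_order(n=8):
    """Return the (u, v) coordinates of an n x n block in zigzag scan order."""
    coords = [(u, v) for u in range(n) for v in range(n)]
    # sort by anti-diagonal u+v; within a diagonal, alternate the traversal direction
    return sorted(coords, key=lambda uv: (uv[0] + uv[1],
                                          uv[0] if (uv[0] + uv[1]) % 2 else uv[1]))

order = zigzag_order(8)
# reorder a block's coefficients into the 1-D sequence that is run-length coded:
# sequence = [T[u][v] for u, v in order]
```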
Threshold Coding implementation - 2
There are three basic ways to threshold a transformed subimage or, stated differently, to create a subimage threshold masking function of the form given in Eq. (8.2-23): (1) a single global threshold can be applied to all subimages; (2) a different threshold can be used for each subimage; or (3) the threshold can be varied as a function of the location of each coefficient within the subimage. In the first approach, the level of compression differs from image to image, depending on the number of coefficients that exceed the global threshold. In the second, called N-largest coding, the same number of coefficients is discarded for each subimage. As a result, the code rate is constant and known in advance. The third technique, like the first, results in a variable code rate, but offers the advantage that thresholding and quantization can be combined by replacing χ(u,v)T(u,v) in Eq. (8.2-24) with

T̂(u,v) = round[ T(u,v) / Z(u,v) ]     (8.2-26)
Threshold Coding implementation - 3
where T̂(u,v) is a thresholded and quantized approximation of T(u,v), and Z(u,v) is an element of the transform normalization array

Z = [z(u,v)], an n×n array of normalization values z(u,v)     (8.2-27)

Before a normalized (thresholded and quantized) subimage transform, T̂(u,v), can be inverse transformed to obtain an approximation of subimage g(x,y), it must be multiplied by Z(u,v). The resulting denormalized array, denoted T'(u,v), is an approximation of T(u,v):

T'(u,v) = T̂(u,v) Z(u,v)     (8.2-28)

 The inverse transform of T'(u,v) yields the decompressed subimage approximation.
 Figure 8.30(a) depicts Eq. (8.2-26) graphically for the case in which Z(u,v) is assigned a particular value c. Note that T̂(u,v) assumes integer value k if and only if

kc - c/2 ≤ T(u,v) < kc + c/2
Threshold Coding implementation - 4
If Z(u,v) > 2T(u,v), then T̂(u,v) = 0 and the transform coefficient is completely truncated or discarded. When T̂(u,v) is represented with a variable-length code that increases in length as the magnitude of k increases, the number of bits used to represent T(u,v) is controlled by the value of c. Thus the elements of Z can be scaled to achieve a variety of compression levels. Figure 8.30(b) shows a typical normalization array. This array, which has been used extensively in the JPEG standardization efforts (see the next section), weighs each coefficient of a transformed subimage according to heuristically determined perceptual or psychovisual importance.
 Figures 8.31(a) through (f) show six threshold-coded approximations of the monochrome image in Fig. 8.9(a). All images were generated using an 8×8 DCT and the normalization array of Fig. 8.30(b). The first result, which provides a compression ratio of about 12 to 1 (i.e., c = 12), was obtained by direct application of that normalization array. The remaining results, which compress the original image by 19, 30, 49, 85, and 182 to 1, were generated after multiplying (scaling) the normalization array by increasingly larger factors.
JPEG
 One of the most popular and comprehensive continuous tone, still frame compression standards is the JPEG standard. It defines three different coding systems: (1) a lossy baseline coding system, which is based on the DCT and is adequate for most compression applications; (2) an extended coding system for
JPEG - 2
 greater compression, higher precision, or progressive reconstruction applications; and (3) a lossless independent coding system for reversible compression. To be JPEG compatible, a product or system must include support for the baseline system. No particular file format, spatial resolution, or color space model is specified.
 In the baseline system, often called the sequential baseline system, the input and output data precision is limited to 8 bits, whereas the quantized DCT values are restricted to 11 bits. The compression itself is performed in three sequential steps: DCT computation, quantization, and variable-length code assignment. The image is first subdivided into pixel blocks of size 8×8, which are processed left to right, top to bottom. As each 8×8 block or subimage is encountered, its 64 pixels are level-shifted by subtracting the quantity 2^(k-1), where 2^k is the maximum number of intensity levels. The 2-D discrete cosine transform of the block is then computed, quantized in accordance with Eq. (8.2-26), and reordered, using the zigzag pattern of Fig. 8.29(d), to form a 1-D sequence of quantized coefficients.
 Because the one-dimensionally reordered array generated under the zigzag pattern of Fig. 8.29(d) is arranged qualitatively according to increasing spatial frequency, the JPEG coding procedure is designed to take advantage of the long runs of zeros that normally result from the reordering. In particular, the nonzero AC coefficients are coded using a variable-length code that defines each coefficient's value and the number of preceding zeros.
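Pulling the steps above together, a rough sketch of how one 8×8 block might be prepared for entropy coding in a baseline-JPEG-like encoder is shown below; the function name is an assumption, the normalization array used here is a constant placeholder rather than the JPEG table of Fig. 8.30(b), and the zigzag reordering and variable-length coding stages are omitted.

```python
import numpy as np

def encode_block(block, Z):
    """Sketch of baseline-JPEG-style processing of one 8x8 block:
    level shift, 2-D DCT, and normalization/quantization per Eq. (8.2-26)."""
    n = 8
    # orthonormal DCT matrix: C[u, x] = a(u) cos[(2x+1)u*pi/(2n)]
    idx = np.arange(n)
    C = np.cos((2 * idx[None, :] + 1) * idx[:, None] * np.pi / (2 * n))
    C[0, :] *= np.sqrt(1.0 / n)
    C[1:, :] *= np.sqrt(2.0 / n)
    shifted = block.astype(float) - 128          # level shift by 2**(k-1) for k = 8
    T = C @ shifted @ C.T                        # 2-D DCT of the block
    T_hat = np.round(T / Z).astype(int)          # Eq. (8.2-26)
    return T_hat                                 # next: zigzag reorder and entropy code

# placeholder normalization array (a constant 16, not the JPEG table of Fig. 8.30(b))
Z = np.full((8, 8), 16)
block = np.random.randint(0, 256, (8, 8))
quantized = encode_block(block, Z)
```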
