
Module-3

Some Basic Compression Methods


Huffman Coding
• One of the most popular techniques for removing coding redundancy is
due to Huffman (Huffman [1952]).
• When coding the symbols of an information source individually,
Huffman coding yields the smallest possible number of code symbols
per source symbol.
• In terms of Shannon’s first theorem, the resulting code is optimal for a fixed value of n, subject to the constraint that the source symbols be coded one at a time.
• In practice, the source symbols may be either the intensities of an
image or the output of an intensity mapping operation (pixel
differences, run lengths, and so on).
• The first step in Huffman’s approach is to create a series of source reductions by ordering the probabilities of the symbols under consideration and combining the lowest probability symbols into a single symbol that replaces them in the next source reduction.
• Figure illustrates this process for binary coding (K-ary Huffman codes can also be
constructed).
• At the far left, a hypothetical set of source symbols and their probabilities are ordered
from top to bottom in terms of decreasing probability values.
• To form the first source reduction, the bottom two probabilities, 0.06 and 0.04, are
combined to form a “compound symbol” with probability 0.1.
• This compound symbol and its associated probability are placed in the first source
reduction column so that the probabilities of the reduced source also are ordered from
the most to the least probable.
• This process is then repeated until a reduced source with two symbols (at the far right)
is reached.
• The second step in Huffman’s procedure is to code each reduced source, starting
with the smallest source and working back to the original source.
• The minimal length binary code for a two-symbol source, of course, consists of the symbols 0 and 1.
• As the figure shows, these symbols are assigned to the two symbols on the right (the assignment is arbitrary; reversing the order of the 0 and 1 would work just as well).
• As the reduced source symbol with probability 0.6 was generated by combining
two symbols in the reduced source to its left, the 0 used to code it
is now assigned to both of these symbols, and a 0 and 1 are arbitrarily appended
to each to distinguish them from each other.
This operation is then repeated for each reduced source until the original source is reached. The final code appears at the far left in the figure. The average length of this code is
Lavg = (0.4)(1) + (0.3)(2) + (0.1)(3) + (0.1)(4) + (0.06)(5) + (0.04)(5) = 2.2 bits/symbol.
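The two steps above (source reduction, then code assignment) can be sketched in a few lines of Python. The symbol names and probabilities below are assumed from the example being discussed; because the tie-breaking between equal probabilities is arbitrary, the resulting code words may differ from those in the figure, but the 2.2 bits/symbol average length is the same.

```python
import heapq

def huffman_code(probabilities):
    """Build a binary Huffman code by repeated source reduction.

    probabilities: dict mapping symbol -> probability.
    Returns a dict mapping symbol -> binary code string.
    """
    # Each heap entry is (probability, tie_breaker, {symbol: partial_code}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)

    while len(heap) > 1:
        # Combine the two least probable symbols into a compound symbol,
        # as in the source-reduction step described above.
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        # Prepend a distinguishing bit to every symbol in each group
        # (this is the code-assignment step, working back toward the source).
        merged = {s: "0" + c for s, c in group1.items()}
        merged.update({s: "1" + c for s, c in group2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1

    return heap[0][2]

# Hypothetical six-symbol source consistent with the probabilities discussed above.
probs = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
code = huffman_code(probs)
avg_len = sum(probs[s] * len(code[s]) for s in probs)
print(code, avg_len)  # average length is 2.2 bits/symbol for this source
```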
• Huffman’s procedure creates the optimal code for a set of symbols and
probabilities subject to the constraint that the symbols be coded one at a time.
• After the code has been created, coding and/or error-free decoding is
accomplished in a simple lookup table manner. The code itself is an instantaneous
uniquely decodable block code.
• It is called a block code because each source symbol is mapped into a fixed
sequence of code symbols. It is instantaneous because each code word in a string
of code symbols can be decoded without referencing succeeding symbols. It is
uniquely decodable because any string of code symbols can be decoded in only
one way.
• Thus, any string of Huffman encoded symbols can be decoded by examining the individual symbols of the string in a left-to-right manner. For the binary code of the figure, a left-to-right scan of the encoded string 010100111100 reveals that the first valid code word is 01010, which is the code for symbol a3. The next valid code word is 011, which corresponds to symbol a1. Continuing in this manner reveals the completely decoded message to be a3 a1 a2 a2 a6.
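A left-to-right decoding scan can be sketched as follows. Only the code words for a3 (01010) and a1 (011) are given explicitly above; the remaining entries of the codebook below are assumptions chosen to be consistent with the decoded message in the text.

```python
def huffman_decode(bitstring, codebook):
    """Decode an instantaneous (prefix-free) code by a left-to-right scan."""
    inverse = {code: symbol for symbol, code in codebook.items()}
    decoded, buffer = [], ""
    for bit in bitstring:
        buffer += bit
        if buffer in inverse:          # first valid code word found
            decoded.append(inverse[buffer])
            buffer = ""                # start scanning the next code word
    return decoded

# Assumed codebook: a3 and a1 are stated in the text; the rest are filled in
# so that the example string decodes to the message given above.
codebook = {"a2": "1", "a6": "00", "a1": "011", "a4": "0100",
            "a3": "01010", "a5": "01011"}
print(huffman_decode("010100111100", codebook))  # ['a3', 'a1', 'a2', 'a2', 'a6']
```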
Arithmetic Coding
• Unlike the variable-length codes of the previous two sections, arithmetic coding generates
nonblock codes.

• In arithmetic coding, which can be traced to the work of Elias (see Abramson [1963]), a one-
to-one correspondence between source symbols and code words does not exist. Instead, an
entire sequence of source symbols (or message) is assigned a single arithmetic code word.

• The code word itself defines an interval of real numbers between 0 and 1. As the number
of symbols in the message increases, the interval used to represent it becomes smaller and
the number of information units (say, bits) required to represent the interval becomes larger.

• Each symbol of the message reduces the size of the interval in accordance with its probability of occurrence. Because the technique does not require, as does Huffman’s approach, that each source symbol translate into an integral number of code symbols (that is, that the symbols be coded one at a time), it achieves (but only in theory) the bound established by Shannon’s first theorem.
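A minimal sketch of this interval-narrowing process is shown below. The four-symbol source and its probabilities are hypothetical; a practical arithmetic coder would use scaled integer arithmetic and emit bits incrementally instead of returning the final interval directly.

```python
def arithmetic_encode(message, prob):
    """Shrink [0, 1) once per symbol; the final interval identifies the message.

    prob: dict mapping each symbol to its (fixed) probability.
    Returns (low, high), the final subinterval of [0, 1).
    """
    # Cumulative distribution: each symbol owns a fixed slice of [0, 1).
    cum, c = {}, 0.0
    for s, p in prob.items():
        cum[s] = (c, c + p)
        c += p

    low, high = 0.0, 1.0
    for s in message:
        span = high - low
        s_low, s_high = cum[s]
        # The current interval is narrowed in proportion to P(s).
        high = low + span * s_high
        low = low + span * s_low
    return low, high

# Hypothetical source; any number inside the returned interval codes the message.
prob = {"a1": 0.2, "a2": 0.2, "a3": 0.4, "a4": 0.2}
print(arithmetic_encode(["a1", "a2", "a3", "a3", "a4"], prob))
```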
Adaptive Context-Dependent Probability Estimates
• With accurate input symbol probability models, that is, models that provide the true
probabilities of the symbols being coded, arithmetic coders are near optimal in the
sense of minimizing the average number of code symbols required to represent the
symbols being coded.
• As in both Huffman and Golomb coding, however, inaccurate probability models can lead to non-optimal results.
• A simple way to improve the accuracy of the probabilities employed is to use an
adaptive, context dependent probability model.
• Adaptive probability models update symbol probabilities as symbols are coded or
become known. Thus, the probabilities adapt to the local statistics of the symbols
being coded.
• Context dependent models provide probabilities that are based on a predefined
neighborhood of pixels—called the context—around the symbols being coded.
• Figure (a) diagrams the steps involved in adaptive, context-dependent arithmetic coding of binary source symbols. Arithmetic coding often is used when binary symbols are to be coded.
• As each symbol (or bit) begins the coding process, its context is formed in the
Context determination block of (a).
• Figures (b) through (d) show three possible contexts that can be used:
• (1) the immediately preceding symbol,
• (2) a group of preceding symbols, and
• (3) some number of preceding symbols plus symbols on the previous scan line.
• For the three cases shown, the Probability estimation block must manage 2^1 (or 2), 2^8 (or 256), and 2^5 (or 32) contexts and their associated probabilities, respectively.
• For instance, if the context in Fig. (b) is used, the conditional probabilities P(0 | a = 0) (the probability that the symbol being coded is a 0 given that the preceding symbol a is 0), P(1 | a = 0), P(0 | a = 1), and P(1 | a = 1) must be tracked.
• The appropriate probabilities are then passed to the Arithmetic coding block as a function of the current context, and they drive the generation of the arithmetically coded output sequence in accordance with the process illustrated in the figure.
• The probabilities associated with the context involved in the current coding step are then updated to reflect the fact that another symbol within that context has been processed.
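A minimal sketch of such an adaptive model, using the single preceding symbol as the context as in case (b), might look like the following; the class and variable names are illustrative only.

```python
class AdaptiveBinaryModel:
    """Adaptive, context-dependent probability estimates for binary symbols.

    The context here is just the immediately preceding symbol (two contexts);
    larger contexts would simply use more count tables.
    """
    def __init__(self):
        # counts[context][symbol]; start at 1 so no probability is ever zero.
        self.counts = {0: [1, 1], 1: [1, 1]}

    def probability(self, context, symbol):
        total = sum(self.counts[context])
        return self.counts[context][symbol] / total

    def update(self, context, symbol):
        # Called after the symbol has been coded, so later probabilities
        # reflect the local statistics of the symbols seen so far.
        self.counts[context][symbol] += 1

model = AdaptiveBinaryModel()
bits = [0, 0, 1, 0, 0, 0, 1, 1]
prev = 0                                   # assumed initial context
for b in bits:
    p = model.probability(prev, b)         # passed to the arithmetic coder
    model.update(prev, b)
    prev = b
print(model.counts)
```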
LZW Coding
• The techniques covered in the previous sections are focused on the removal of coding redundancy. Consider now an error-free compression approach that also addresses spatial redundancies in an image.
• The technique, called Lempel-Ziv-Welch (LZW) coding, assigns fixed-length code words to variable-length sequences of source symbols. Recall that Shannon used the idea of coding sequences of source symbols, rather than individual source symbols, in the proof of his first theorem.
• A key feature of LZW coding is that it requires no a priori knowledge of the probability of
occurrence of the symbols to be encoded. Despite the fact that until recently it was
protected under a United States patent, LZW compression has been integrated into a
variety of mainstream imaging file formats, including GIF, TIFF, and PDF.
• The PNG format was created to get around LZW licensing requirements.
• Consider again the 8-bit image from Fig. 8.9(a). Using Adobe Photoshop, an uncompressed TIFF version of this image requires 286,740 bytes of disk space (262,144 bytes for the 8-bit pixels plus 24,596 bytes of overhead).
• Using TIFF’s LZW compression option, however, the resulting file is 224,420 bytes, for a compression ratio of 286,740/224,420 ≈ 1.28. Recall that the Huffman encoded representation of Fig. 8.9(a) in Example 8.4 achieved a smaller compression ratio.
• The additional compression realized by the LZW approach is due to the removal of some of the image’s spatial redundancy.
• LZW coding is conceptually very simple (Welch [1984]). At the onset of the coding process, a codebook or dictionary containing the source symbols to be coded is constructed.
• For 8-bit monochrome images, the first 256 words of the dictionary are assigned to intensities 0, 1, 2, ..., 255. As the encoder sequentially examines image pixels, intensity sequences that are not in the dictionary are placed in algorithmically determined (e.g., the next unused) locations.
• If the first two pixels of the image are white, for instance, sequence “255–255” might be assigned to location 256, the address following the locations reserved for intensity levels 0 through 255.
• The next time two consecutive white pixels are encountered, code word 256, the address of the location containing sequence 255–255, is used to represent them. If a 9-bit, 512-word dictionary is employed in the coding process, the original 16 bits that were used to represent the two pixels are replaced by a single 9-bit code word.
• Clearly, the size of the dictionary is an important system parameter. If it is too small, the detection of matching intensity-level sequences will be less likely; if it is too large, the size of the code words will adversely affect compression performance.
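The dictionary-building loop just described can be sketched as follows. The pixel values and the 512-word dictionary limit are illustrative; production LZW codecs (as used in GIF and TIFF) also handle variable code widths and clear codes, which are omitted here.

```python
def lzw_encode(pixels, max_dict_size=512):
    """LZW coding sketch for 8-bit intensities with a 9-bit, 512-word dictionary.

    Dictionary locations 0-255 hold the individual intensities; sequences
    not yet in the dictionary are added at the next unused location.
    """
    dictionary = {(i,): i for i in range(256)}
    next_code = 256
    current = ()
    output = []

    for p in pixels:
        candidate = current + (p,)
        if candidate in dictionary:
            current = candidate            # keep growing the recognized sequence
        else:
            output.append(dictionary[current])
            if next_code < max_dict_size:  # e.g., 512 words for 9-bit code words
                dictionary[candidate] = next_code
                next_code += 1
            current = (p,)
    if current:
        output.append(dictionary[current])
    return output

# Two runs of white pixels: the second "255 255" pair is emitted as code word 256.
print(lzw_encode([255, 255, 100, 255, 255, 100]))  # [255, 255, 100, 256, 100]
```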
Bit-Plane Coding
Module-3: Part B
An Introduction to Mathematical Tools in Digital Image Processing
Set and Logical Operations: Basic Set Operations
Logical Operations
Vector and Matrix Operations
• Multispectral image processing is a typical area in which vector and matrix operations are used routinely.
• Each pixel of an RGB image has three components, which can be organized in the form of a column vector.

z = [z1, z2, z3]^T

where z1 is the intensity of the pixel in the red image, and the other two elements (z2 and z3) are the corresponding pixel intensities in the green and blue images, respectively. Thus an RGB color image of size M×N can be represented by three component images of this size, or by a total of MN 3-D vectors.
• Once pixels have been represented as vectors we have at our disposal
the tools of vector-matrix theory.
• For example, the Euclidean distance, D, between a pixel vector z and an arbitrary point a in n-dimensional space is defined as the vector product
D(z, a) = [(z - a)^T (z - a)]^(1/2) = [(z1 - a1)^2 + (z2 - a2)^2 + ... + (zn - an)^2]^(1/2)
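Assuming NumPy is available, this distance can be computed for every pixel of an RGB image at once; the image and reference point below are hypothetical.

```python
import numpy as np

# Treat each RGB pixel as a 3-D vector z and compute its Euclidean distance
# from an arbitrary reference point a.
rgb = np.random.randint(0, 256, size=(4, 4, 3)).astype(float)  # hypothetical M x N x 3 image
a = np.array([128.0, 128.0, 128.0])                            # reference color vector

# Vectorized form of D(z, a) = [(z - a)^T (z - a)]^(1/2) for every pixel z.
D = np.sqrt(np.sum((rgb - a) ** 2, axis=2))
print(D.shape)  # (4, 4): one distance per pixel
```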
