• An excellent resource for data compression compiled by Mark Nelson that includes
libraries, documentation, and source code for Huffman Coding, Adaptive Huffman
Coding, LZW, Arithmetic Coding, and so on.
• The Theory of Data Compression web page, which introduces basic theories behind
both lossless and lossy data compression. Shannon's original 1948 paper on information
theory can be downloaded from this site as well.
• The FAQ for the comp.compression and comp.compression.research newsgroups.
This FAQ answers most of the commonly asked questions about data compression
in general.
• A set of applets for lossless compression that effectively show interactive demonstrations
of Adaptive Huffman, LZW, and so on. (Impressively, this web page is the fruit
of a student's final project in a third-year undergraduate multimedia course based on
the material in this text.)
7.9 EXERCISES
1. Suppose eight characters have a distribution A:(1), B:(1), C:(1), D:(2), E:(3), F:(5),
G:(5), H:(10). Draw a Huffman tree for this distribution. (Because the algorithm may
group subtrees with equal probability in a different order, your answer is not strictly
unique.)
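As a starting point for Exercise 1, here is a minimal sketch (in Python; the heapq-based construction and the name huffman_codes are my own, not the book's) that builds a Huffman tree from the given counts and reads off a code for each character:

    import heapq
    from itertools import count

    def huffman_codes(freqs):
        """Return a dict mapping each symbol to its Huffman code string."""
        tie = count()  # unique tie-breaker so nodes with equal weight never get compared
        heap = [(w, next(tie), sym) for sym, w in freqs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            w1, _, left = heapq.heappop(heap)   # the two lowest-weight subtrees
            w2, _, right = heapq.heappop(heap)
            heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))
        codes = {}
        def walk(node, prefix):
            if isinstance(node, tuple):         # internal node: descend into both children
                walk(node[0], prefix + "0")
                walk(node[1], prefix + "1")
            else:                               # leaf: record the code for this symbol
                codes[node] = prefix or "0"
        walk(heap[0][2], "")
        return codes

    dist = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 3, "F": 5, "G": 5, "H": 10}
    print(huffman_codes(dist))

Because subtrees of equal weight can be merged in different orders, the printed codes may differ from yours, but the average code length should agree.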
2. (a) What is the entropy (η) of the image below, where numbers (0, 20, 50, 99)
denote the gray-level intensities?
99 99 99 99 99 99 99 99
20 20 20 20 20 20 20 20
 0  0  0  0  0  0  0  0
 0  0 50 50 50 50  0  0
 0  0 50 50 50 50  0  0
 0  0 50 50 50 50  0  0
 0  0 50 50 50 50  0  0
 0  0  0  0  0  0  0  0
(b) Show step by step how to construct the Huffman tree to encode the above four
intensity values in this image. Show the resulting code for each intensity value.
(c) What is the average number of bits needed for each pixel, using your Huffman
code? How does it compare to η?
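For part (a) of Exercise 2, a minimal sketch (plain Python; the image list simply mirrors the grid above) that counts the four gray levels and evaluates the entropy:

    from collections import Counter
    from math import log2

    image = [
        [99] * 8,
        [20] * 8,
        [0] * 8,
        [0, 0, 50, 50, 50, 50, 0, 0],
        [0, 0, 50, 50, 50, 50, 0, 0],
        [0, 0, 50, 50, 50, 50, 0, 0],
        [0, 0, 50, 50, 50, 50, 0, 0],
        [0] * 8,
    ]

    counts = Counter(v for row in image for v in row)   # occurrences of 0, 20, 50, 99
    total = sum(counts.values())                        # 64 pixels in all
    entropy = -sum((c / total) * log2(c / total) for c in counts.values())
    print(counts)
    print(f"entropy = {entropy:.3f} bits per pixel")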
3. Consider an alphabet with two symbols A, B, with probability p(A) = x and
p(B) = 1 - x.
(a) Plot the entropy as a function of x. You might want to use log2(3) ≈ 1.6 and
log2(7) ≈ 2.8.
(b) Discuss why it must be the case that if the probability of the two symbols is
1/2 + ε and 1/2 - ε, with small ε, the entropy is less than the maximum.
(c) Generalize the above result by showing that, for a source generating N symbols,
the entropy is maximum when the symbols are all equiprobable.
(d) As a small programming project, write code to verify the conclusions above.
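Part (d) of Exercise 3 explicitly asks for code; one possible sketch (plain Python, my own framing) tabulates the binary entropy and then checks that no randomly drawn distribution over N symbols exceeds the uniform one:

    from math import log2
    import random

    def entropy(probs):
        return -sum(p * log2(p) for p in probs if p > 0)

    # (a)/(b): the binary entropy peaks at x = 0.5 and drops for x = 0.5 +/- eps
    for x in (0.1, 0.3, 0.45, 0.5, 0.55, 0.9):
        print(f"x = {x:.2f}   H = {entropy([x, 1 - x]):.4f} bits")

    # (c): random N-symbol distributions never beat the uniform entropy log2(N)
    N = 8
    uniform = entropy([1.0 / N] * N)        # log2(8) = 3 bits
    for _ in range(1000):
        weights = [random.random() for _ in range(N)]
        probs = [w / sum(weights) for w in weights]
        assert entropy(probs) <= uniform + 1e-9
    print(f"uniform entropy for N = {N}: {uniform:.4f} bits (never exceeded)")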
4. Extended Huffman Coding assigns one codeword to each group of k symbols. Why is
l̄ (the average number of bits for each symbol) still no less than the entropy η, as
indicated in equation (7.7)?
5. (a) What are the advantages and disadvantages of Arithmetic Coding as compared
to Huffman Coding?
(b) Suppose the alphabet is {A, B, C}, and the known probability distribution is
P_A = 0.5, P_B = 0.4, P_C = 0.1. For simplicity, let's also assume that both
encoder and decoder know that the length of the messages is always 3, so there
is no need for a terminator.
i. How many bits are needed to encode the message BBB by Huffman
coding?
ii. How many bits are needed to encode the message BBB by arithmetic
coding?
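For Exercise 5(b)ii, a minimal sketch (plain Python; the function name and the rough ceil(-log2 width) bit estimate are my own, not the book's exact procedure) of the interval-narrowing step of arithmetic coding applied to the fixed-length message BBB:

    from math import ceil, log2

    probs = {"A": 0.5, "B": 0.4, "C": 0.1}   # distribution given in the exercise

    def encode_interval(message, probs):
        """Return the final [low, high) interval after encoding the whole message."""
        low, high = 0.0, 1.0
        for sym in message:
            span = high - low
            cum = 0.0
            for s, p in probs.items():        # locate this symbol's cumulative range
                if s == sym:
                    high = low + span * (cum + p)
                    low = low + span * cum
                    break
                cum += p
        return low, high

    low, high = encode_interval("BBB", probs)
    width = high - low                        # equals P(B) ** 3 = 0.064
    print(f"interval = [{low:.4f}, {high:.4f}), width = {width:.3f}")
    print(f"roughly ceil(-log2(width)) = {ceil(-log2(width))} bits pin down a point inside it")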
6. (a) What are the advantages of Adaptive Huffman Coding compared to the original
Huffman Coding algorithm?
[Figure 7.11: Adaptive Huffman tree after sending letters aabb; referenced in part (b) below.]
(b) Assume that Adaptive Huffman Coding is used to code an information source S
with a vocabulary of four letters (a, b, c, d). Before any transmission, the initial
coding is a = 00, b = 01, c = 10, d = 11. As in the example illustrated in Figure
7.7, a special symbol NEW will be sent before any letter if it is being sent for the
first time.
Figure 7.11 is the Adaptive Huffman tree after sending letters aabb. After
that, the additional bitstream received by the decoder for the next few letters is
01010010101. Decode this bitstream: what are the additional letters received?
7. Compare the rate of adaptation of adaptive Huffman coding and adaptive arithmetic
coding (see the textbook web site for the latter). What prevents each method from
adapting to quick changes in source statistics?
8. Consider the dictionary-based LZW compression algorithm. Suppose the alphabet is
the set of symbols {O, 1}. Show the dictionary (symbol sets plus associated codes)
and output for LZW compression of the input
0 1 1 0 0 1 1
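For Exercise 8, a minimal LZW encoder sketch (plain Python; variable names are my own) over the alphabet {0, 1}; running it on the input above prints the emitted codes and the dictionary built along the way:

    def lzw_encode(data, alphabet=("0", "1")):
        dictionary = {sym: i for i, sym in enumerate(alphabet)}   # initial codes 0 and 1
        s = ""
        output = []
        for c in data:
            if s + c in dictionary:
                s += c                                  # keep extending the current match
            else:
                output.append(dictionary[s])            # emit the code for the longest match
                dictionary[s + c] = len(dictionary)     # add the new string to the dictionary
                s = c
        if s:
            output.append(dictionary[s])                # flush the final match
        return output, dictionary

    codes, dictionary = lzw_encode("0110011")
    print("output codes:", codes)
    print("dictionary  :", dictionary)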
9. Implement Huffman coding, adaptive Huffman, arithmetic coding, and the LZW
coding algorithms using your favorite programming language. Generate at least three
types of statistically different artificial data sources to test your implementation of
these algorithms. Compare and comment on each algorithm's performance in terms
of compression ratio for each type of data source.
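For Exercise 9, one possible way (my own framing, sketched in Python) to generate three statistically different test sources is a uniform byte source, a heavily skewed source, and a highly repetitive one:

    import random

    def uniform_source(n):
        """Every byte value equally likely: little redundancy for any coder to exploit."""
        return bytes(random.randrange(256) for _ in range(n))

    def skewed_source(n):
        """Roughly 90% of the symbols are 'a': strongly favors entropy coders."""
        pool = b"a" * 90 + b"bcdefghij"
        return bytes(random.choice(pool) for _ in range(n))

    def repetitive_source(n, period=16):
        """A short pattern repeated over and over: favors dictionary coders such as LZW."""
        pattern = bytes(random.randrange(256) for _ in range(period))
        return (pattern * (n // period + 1))[:n]

    for name, data in (("uniform", uniform_source(4096)),
                       ("skewed", skewed_source(4096)),
                       ("repetitive", repetitive_source(4096))):
        print(name, len(data), "bytes,", len(set(data)), "distinct symbols")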
7.10 REFERENCES
1 M. Nelson, The Data Compression Book, 2nd ed., New York: M&T Books, 1995.
2 K. Sayood, Introduction to Data Compression, 2nd ed., San Francisco: Morgan Kaufmann,
2000.
3 C.E. Shannon, "A Mathematical Theory of Communication," Bell System Technical Journal,
27: 379-423 and 623-656, 1948.
4 C.E. Shannon and W. Weaver, The Mathematical Theory of Communication, Champaign, IL:
University of Illinois Press, 1949.
5 R.C. Gonzalez and R.E. Woods, Digital Image Processing, 2nd ed., Upper Saddle River, NJ:
Prentice Hall, 2002.
6 R. Fano, Transmission of Information, Cambridge, MA: MIT Press, 1961.
7 D.A. Huffman, "A Method for the Construction of Minimum-Redundancy Codes," Proceedings
of the IRE [Institute of Radio Engineers, now the IEEE], 40(9): 1098-1101, 1952.
8 T.H. Cormen, C.E. Leiserson, and R.L. Rivest, Introduction to Algorithms, Cambridge, MA:
MIT Press, 1992.