Fast Lempel-ZIV (LZ78) Algorithm Using Codebook Hashing
Figure 1: The Lempel Ziv Algorithm Family [6].
A. Test Case #1
Test Case #1 uses a string of 100,000 characters as input to the
LZ78 encoder. The following sub-sections illustrate the result of
Test Case #1 using LZ78 with and without hashing. The GUI used was
built with Visual Studio 2012.
A. Principle
LZ78 is a dictionary-based compression algorithm that
maintains an explicit dictionary. The codewords output by the
algorithm consist of two elements: an index referring to the
longest matching dictionary entry and the first non-matching
symbol. In addition to outputting the codeword for
storage/transmission, the algorithm also adds the index and
symbol pair to the dictionary. When a symbol that is not yet in
the dictionary is encountered, the codeword has the index
value 0 and the symbol is added to the dictionary as well. With this
method, the algorithm gradually builds up a dictionary [10].
This simplified pseudo-code version of the algorithm does not
prevent the dictionary from growing forever. There are
various solutions to limit dictionary size, the easiest being to
stop adding entries and continue like a static dictionary coder
or to throw the dictionary away and start from scratch after a
certain number of entries has been reached.
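The encoding principle described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation evaluated in this paper: the function names `lz78_encode` and `lz78_decode` and the `max_entries` parameter are assumptions made for the example. The dictionary is held in a Python `dict`, which is itself a hash table, so the lookup of the longest matching entry is hash-based here; the paper's own codebook hashing scheme may differ. Setting `max_entries` models the simplest size limit mentioned above, where no further entries are added once the limit is reached.

```python
def lz78_encode(text, max_entries=None):
    """Return a list of (index, symbol) codewords for `text`.

    Index 0 means "no matching dictionary entry". `max_entries` is a
    hypothetical knob: once reached, the dictionary is frozen and the
    coder continues like a static dictionary coder.
    """
    dictionary = {}   # phrase -> 1-based index; a dict is a hash table
    codewords = []
    phrase = ""       # longest match found so far
    for symbol in text:
        candidate = phrase + symbol
        if candidate in dictionary:
            phrase = candidate          # keep extending the match
            continue
        # Emit the index of the longest match plus the first non-matching symbol.
        codewords.append((dictionary.get(phrase, 0), symbol))
        if max_entries is None or len(dictionary) < max_entries:
            dictionary[candidate] = len(dictionary) + 1
        phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit it with an empty symbol.
        codewords.append((dictionary[phrase], ""))
    return codewords


def lz78_decode(codewords, max_entries=None):
    """Rebuild the text by mirroring the encoder's dictionary updates."""
    phrases = [""]    # index 0 is the empty phrase
    out = []
    for index, symbol in codewords:
        phrase = phrases[index] + symbol
        out.append(phrase)
        if max_entries is None or len(phrases) - 1 < max_entries:
            phrases.append(phrase)
    return "".join(out)


codes = lz78_encode("ABABABA")
print(codes)  # [(0, 'A'), (0, 'B'), (1, 'B'), (3, 'A')]
print(lz78_decode(codes) == "ABABABA")  # True
```

The decoder must be given the same `max_entries` value as the encoder, since both sides have to stop growing the dictionary at the same codeword to stay in step.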
Fig. 2: Test Case #1 LZ78 Coding (Without Hashing)
The Encode button encodes the entered text using LZ78 with
hashing (because the Use Hashing checkbox is checked). The rest of
the encoding and decoding process is the same as explained in the
previous section for LZ78 without hashing.
B. Test Case #2
Unlike the previous test case, Test Case #2 uses a string of
200,000 characters as input to the LZ78 encoder and performs
encoding with and without hashing. The following two figures
show the test results.
Fig. 4: Test Case #2 LZ78 Coding (Without Hashing)
Table 1.1: Compression ratio comparison for different message lengths.
Input Message Length   Binary      Code Book Entries   Compression Ratio
801,096                611,609     39,833              76.35 %
1,602,200              1,170,977   72,337              73.09 %
2,402,448              1.562 seconds   319.818 seconds
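The compression ratios in the first two rows of Table 1.1 can be checked with a short calculation. The sketch below assumes that the Binary column is the encoded output length and that the ratio is the encoded length expressed as a percentage of the input length; the helper name `compression_ratio` is hypothetical.

```python
def compression_ratio(input_length, output_length):
    """Encoded length as a percentage of the input length (assumed definition)."""
    return 100.0 * output_length / input_length


# (input message length, binary length) pairs taken from Table 1.1
rows = [(801_096, 611_609), (1_602_200, 1_170_977)]
ratios = [round(compression_ratio(i, o), 2) for i, o in rows]
print(ratios)  # [76.35, 73.09]
```

Under this assumed definition the computed values match the tabulated 76.35 % and 73.09 %, so a lower percentage means stronger compression.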
Fig. 5: Test Case #2 LZ78 Coding (With Hashing)
In this paper we presented a source coding scheme that we call
Hashed Lempel-Ziv coding, as an extension for the LZ78
coding scheme, without sacrificing the coding efficiency and
the compression ratio attained by the original LZ78
