Assignment11 HuffmanCode
Coding Algorithm
This assignment is not easy. It requires that you understand the related data structures, tree
traversal algorithms, the Huffman coding algorithm, bitwise operations, and file operations. Read
through the whole assignment first to get a basic idea of the requirements, then work
through it in the recommended order.
Objectives:
The main objectives of this assignment are:
o Implement a binary tree.
o Use the Heap Priority Queue from Assignment 3.
o Compress and decode a data file.
The Assignment:
The Concept:
This project uses several data structures to implement the Huffman code
compression and expansion algorithms. It has three parts: implementing the linked binary
tree, compressing a data file, and decoding (expanding) the compressed data file.
Your Job:
1. First, implement the LinkedBinaryTree.java data structure. The class file
LinkedBinaryTree.java has been provided for you. You should NOT change
the class names, interfaces, or packages in the provided class files.
2. Second, implement the compress function in Huffman.java to
compress a given input data file. The input data file is an ASCII text file in which
each character takes one byte. Your program will read the data file,
generate the Huffman code tree, and then create the compressed data file, which
contains the bit representation of the prefix coding tree, followed by the size of the
original file, followed by the Huffman code for the original input data. See the
example of a compressed file in the FAQ.
3. Third, implement the decode function in Huffman.java to decode a
given compressed data file. The beginning of the compressed file represents
the prefix coding tree.
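The compress step in item 2 starts from character counts, repeatedly merges the two lightest subtrees until one tree remains, and then reads each character's code off its path from the root. The following is a minimal sketch of that idea only, not the required solution. The class HuffmanSketch, its Node type, and the methods buildTree, fillCodes, and codeTable are hypothetical names. java.util.PriorityQueue and HashMap are stand-ins for the Heap PQ from Assignment 3 and the provided LinkedBinaryTree, which your submission must use instead. Codes are kept as strings of '0' and '1' for readability; a real implementation would pack them into bits.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Illustrative only: all names here are hypothetical, and the java.util
// collections stand in for the course's Heap PQ and LinkedBinaryTree.
class HuffmanSketch {
    static final class Node {
        final char symbol;      // meaningful only for leaves
        final int weight;       // occurrences of all characters under this node
        final Node left, right; // both null for a leaf

        Node(char symbol, int weight, Node left, Node right) {
            this.symbol = symbol;
            this.weight = weight;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    // Count each byte-sized character, then repeatedly merge the two lightest trees.
    static Node buildTree(String text) {
        int[] freq = new int[256];
        for (char c : text.toCharArray()) {
            freq[c & 0xFF]++;
        }
        PriorityQueue<Node> pq =
                new PriorityQueue<>((x, y) -> Integer.compare(x.weight, y.weight));
        for (int c = 0; c < 256; c++) {
            if (freq[c] > 0) {
                pq.add(new Node((char) c, freq[c], null, null));
            }
        }
        while (pq.size() > 1) {
            Node first = pq.poll();
            Node second = pq.poll();
            pq.add(new Node('\0', first.weight + second.weight, first, second));
        }
        return pq.poll(); // null when the input is empty
    }

    // Walk the tree: append 0 for a left branch, 1 for a right branch.
    // Note: a one-character input yields an empty code in this sketch.
    static void fillCodes(Node node, String path, Map<Character, String> codes) {
        if (node == null) {
            return;
        }
        if (node.isLeaf()) {
            codes.put(node.symbol, path);
            return;
        }
        fillCodes(node.left, path + "0", codes);
        fillCodes(node.right, path + "1", codes);
    }

    static Map<Character, String> codeTable(String text) {
        Map<Character, String> codes = new HashMap<>();
        fillCodes(buildTree(text), "", codes);
        return codes;
    }

    public static void main(String[] args) {
        System.out.println(codeTable("aaaabbc"));
    }
}
```

When two subtrees have equal weight, either may end up on the left, which is why the FAQ shows two valid trees for ab.txt.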
Notes
A few data files and their pre-compressed versions have been packaged in the
project. You may use them as test cases. Please see the main function for details.
Getting Started:
1. Download and import the following zip file: Assignment11.zip. For Eclipse:
   1. Open Eclipse
   2. Choose "File → Import"
   3. Select the "General" category
   4. Pick "Existing Projects into Workspace"
   5. Click on "Next"
   6. Select the button for "Select Archive File" and browse to/select the zip file.
   7. Click on "Finish"
2. Copy the Heap PQ implementation from Assignment 3
Submission:
First, look at the following checklist:
1. Does the program compile on CS machines? (Programs that don't compile for us
will not be graded)
2. Do you ever import from java.util? If so, be sure you import only allowed
components (like Iterator, Exceptions, etc.). Unless the assignment specifically
says it is permissible, you should never use any of Java's native data
structures.
3. Does the program meet all required interfaces?
4. Is the indentation easily readable? You can have Eclipse correct indentation by
highlighting all code and selecting "Source → Correct Indentation".
5. Have you removed any of the "cs2321" comments (/*#) that you may have
accidentally copy/pasted?
6. Are comments well organized and concise?
7. Do you want the bonus points? Then submit the report pdf file. See Your Job #4
for description.
If you have reviewed your code and all the criteria on the checklist are acceptable, follow
the submission procedure.
Grading Criteria:
LinkedBinaryTree correctly implements all methods and interfaces: 30
The Compress Function works correctly: 40
The Decode Function works correctly: 20
Clear concise code with good commenting: 10
Bonus points: Compression rate report: 20 points.
FAQ
1. What does the compressed file look like?
Example: file ab.txt has two characters in it: “ab”
The Huffman tree will be either
    *
   / \
  a   b
or
    *
   / \
  b   a
code table:
a: 1
b: 0
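The compressed file will have 7 bytes (56 bits). The first 19 bits represent the prefix tree, the next 32
bits represent the length, the next 2 bits are the data, and the last 3 bits are padding.
In the dump below:
o The first bit (1) marks the internal node. Each leaf is a 0 bit followed by the 8-bit character (b, then a).
o The data bits are 10: the Huffman code for a (1) followed by the Huffman code for b (0).
o The padding bits may be 0 or 1.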
$ xxd -b ab.txt.compressed
00000000: 10011000 10001100 00100000 00000000 00000000 00000000 .. ...
00000006: 01010000
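The layout described above can be checked by decoding the dump in code. The following is a minimal sketch under assumptions read off this one example: a 1 bit marks an internal node (followed by its left and then right subtree), a 0 bit marks a leaf followed by the 8-bit character, the length is a 32-bit big-endian integer, and in the data a 0 bit goes left and a 1 bit goes right. The class CompressedFileSketch and its Node, BitReader, readTree, and decode members are hypothetical names, not part of the provided Huffman.java.

```java
// Illustrative only: decodes the ab.txt.compressed bytes from the FAQ.
class CompressedFileSketch {
    static final class Node {
        char symbol;      // set only for leaves
        Node left, right; // both null for a leaf
    }

    // Reads bits from most significant to least significant within each byte.
    static final class BitReader {
        private final byte[] data;
        private int pos; // index of the next bit, counted from the start

        BitReader(byte[] data) {
            this.data = data;
        }

        int readBit() {
            int bit = (data[pos >> 3] >> (7 - (pos & 7))) & 1;
            pos++;
            return bit;
        }

        int readBits(int count) {
            int value = 0;
            for (int i = 0; i < count; i++) {
                value = (value << 1) | readBit();
            }
            return value;
        }
    }

    // Preorder tree: 1 = internal node, 0 = leaf followed by an 8-bit character.
    static Node readTree(BitReader in) {
        Node node = new Node();
        if (in.readBit() == 1) {
            node.left = readTree(in);
            node.right = readTree(in);
        } else {
            node.symbol = (char) in.readBits(8);
        }
        return node;
    }

    static String decode(byte[] file) {
        BitReader in = new BitReader(file);
        Node root = readTree(in);
        int length = in.readBits(32); // number of characters in the original file
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < length; i++) {
            Node node = root;
            while (node.left != null) { // descend until a leaf is reached
                node = (in.readBit() == 0) ? node.left : node.right;
            }
            out.append(node.symbol);
        }
        return out.toString(); // any remaining bits are padding and are ignored
    }

    public static void main(String[] args) {
        // The 7 bytes shown by xxd: 10011000 10001100 00100000 00000000 x3 01010000
        byte[] ab = {(byte) 0x98, (byte) 0x8C, 0x20, 0, 0, 0, 0x50};
        System.out.println(decode(ab)); // prints "ab"
    }
}
```

Reading the length before the data is what lets the decoder stop after two characters and ignore the three padding bits.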