Programming Assignment #5 (Extra credit)
5 points
If you do the extra credit option, you are
still required to complete the standard HuffmanTree assignment (namely homework
#5). So if you work on this, do so only
after you have completed and submitted the standard assignment.
To keep things clear, for this part of the assignment you should create
a class called HuffmanTree2. You can copy your HuffmanTree class and modify
it appropriately to get the initial version of this class.
The main goal of this variation is to eliminate
the code file. When you use a utility like
zip, you don’t expect it to produce two output files (a code file and a binary
file). You expect it to produce one file.
That’s what we’ll do in this variation.
To do so, we’ll have to be able to include information in the binary file
about the tree and its structure.
In the original version we had three programs:
MakeCode, Encode and Decode. For this version
there are two main programs: Encode2
and Decode2.
In all, you will have to include the following three new methods
in your class along with the other methods we had in HuffmanTree:
Method | Description |
HuffmanTree2(BitInputStream input) | Constructs a Huffman tree from the given input stream. Assumes that the standard bit representation has been used for the tree. |
void assign(String[] codes) | Assigns codes for each character of the tree. Assumes the array has null values before the method is called. Fills in a String for each character in the tree indicating its code. |
void writeHeader(BitOutputStream output) | Writes the current tree to the output stream using the standard bit representation. |
In the original HuffmanTree we had a method
called write that would write the codes to an output file. Here the Encode2 program does the actual encoding.
It first reads the file and computes the frequencies.
Then it calls your constructor to create an appropriate HuffmanTree. It has to have some way to find out what codes
your tree has come up with so that it can encode the characters of the file.
It does so by calling the assign method in your class passing it an array
of Strings that are all null. Your method will replace the null’s with codes
for the characters included in the tree.
The Encode2 program also calls the method
writeHeader in your class. The idea is
to write to the bit stream a representation of the tree that can be used to reconstruct
it later. We can print the tree using a
preorder traversal. For a branch node, we write a 1 indicating that
it is a branch. We don’t need to write
anything more, because the branch nodes contain no data. For a leaf node, we will write a 0. Then we need to write the ASCII value of the
character stored at this leaf. There are
many ways to do this. We basically need
to write some bits that can be read later to reconstruct the character value.
The value will require up to 9 bits to write (it would be 8 if it weren’t
for our pseudo-eof character).
We need to decide on a convention for writing
an integer in 9 bits that we can reverse later when we read it back in.
Below are the two methods we would like you to use to do this.
They are inverses of each other in that read9 will recreate what write9
writes to the stream:
// pre : 0 <= n < 512
// post: writes a 9-bit representation of n to the
given output stream
private void write9(BitOutputStream output, int n)
{
for (int
i = 0; i < 9; i++) {
output.writeBit(n
% 2);
n /=
2;
}
}
// pre : an integer n has been encoded using write9
or its equivalent
// post: reads 9 bits to reconstruct the original integer
private int read9(BitInputStream input) {
int multiplier
= 1;
int sum =
0;
for (int
i = 0; i < 9; i++) {
sum +=
multiplier * input.readBit();
multiplier
*= 2;
}
return sum;
}
Encode2 produces a binary file that first
has a header with information about the tree and then has the individual codes
for the characters of the file. The Decode2
program has to use this information to reconstruct the original file.
It begins by calling the constructor listed in the table above, asking
your class to read the header information and reconstruct the tree.
Once the tree has been reconstructed, the program calls your decode method
from the original assignment to reproduce the original file.
To check the correctness of your program,
here are the two encoded files that you should get when you run your Encode program
on the two text files short and hamlet: short.bonus
and hamlet.bonus. Encode2 should produce exactly the same output
when used with your version of HuffmanTree2. Use the separate turn-in for the extra credit.
Turn in all of the java files that you wrote namely HuffmanNode.java (if not included in HuffmanTree2) and
HuffmanTree2.java.