CSC 310 Fall 2007 — Project

Due: November 26rd. See course webpage for up-to-date information on due date.
Worth: 20% of your course grade.


Part 1

Using either C, C++, Java or Matlab, write an implementation of Arithmetic Coding using integer-based operations (not floating-point). Many of the details needed for this implementation can be found in lecture notes 8-10. Here are some of the assumptions and constraints you may work with:

Using your program, compress the following files: [ file 1 ], [ file 2 ], [ file 3 ], [ file 4 ], [ file 5 ]. Decompress them and make sure the output of your decompressor matches the original file.

For each of these files, report the size of the original file in bytes, the size of the encoding in bits, the average number of bits per symbol and the compression ratio achieved (for this, make sure you use the same unit, bits or bytes).

Moreover, using a separate program, compute the counts for each input symbol in each input file. Compute the entropy of the set of probabilities you observe. Note, for this step, there is no need to adjust the zero counts as in the encoding part. Compare this entropy with the average number of bits per symbol your encoder achieves.

Repeat this analysis with one more input file: the encoding of any one of the files above. That is, try to compress the same thing twice and see how it goes.

What to hand in:


Part 2

Using either C, C++, Java or Matlab, write an implementation for a coder and a decoder for the [15,11] Hamming code given by the matrices below. For this part of the project, you might find Matlab more useful, as it contains built-in operations for dealing matrices. Details: Along with your channel encoder and decoder, use the following program to simulate a BSC with a given flip probability: [ bsc.c ]. Compile it and use it as follows:
bsc 0.2 <file.in >file.out
Furthermore, use the following program to count the number of blocks you get right and wrong: [ block-diff.c ]. Compile it and use it as follows:
block-diff file-before-encoding file-after-decoding

Use your channel encoder on the following 2 channel input files: [ file 5 ] (also used before), and its arithmetic coding (that is, pass it through your arithmetic coder, and apply the channel coder to the file you get). If you did not finish your Arithmetic coder (and only then), use [ file 4 ] as your second example for channel coding. At this point, you should have 2 channel encoding files.

Take the 2 channel encodings and simulate their transmission through the BSC once with flip probability 0.05 and once with 0.2. At this point you should have 4 channel transmission files. Clarification: Do not apply the BSC twice in a row! Apply it to the first file with flip probability 0.05, save the result, then apply it again to the first file with flip probability 0.2, and save the result. Repeat this for the second channel encoding file.

Pass each of the 4 channel transmission files through your channel decoder. At this point, you should have 4 channel decoding files.

Compare each of your channel decodings with their respective channel input file, using the provided block-based comparison program.

What to hand in: