University of Toronto - Spring 2001
Department of Computer Science

Assignment 1 Announcements

Thursday 18 January: Encoding '|'

Question: How does the encoding file represent an encoding for '|'?
Answer: Using the same rules as for any other character. Here is an example that gives the encoding for '|' as 001.

111|a1|P01||001|x000
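
One way to read the example above is as a sequence of entries, each a single character followed by its code of 0/1 digits, with '|' separating entries. That layout is an interpretation of the example, not a quotation from the handout. Under that assumption, the first character of every entry is taken literally, so a '|' in that position is just another character. The names CodeTable, parse_encoding, and MAX_CODE_LEN below are hypothetical, and this is only a minimal sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CODE_LEN 32   /* assumed upper bound on code length */

/* codes[c] holds the code for character c, or "" if c has no code. */
typedef struct {
    char codes[256][MAX_CODE_LEN + 1];
} CodeTable;

/* Parse entries of the assumed form <char><bits>, separated by '|'.
 * Returns the number of entries read, or -1 on malformed input. */
int parse_encoding(const char *text, CodeTable *table)
{
    assert(text != NULL && table != NULL);
    memset(table, 0, sizeof *table);

    int count = 0;
    size_t i = 0;
    while (text[i] != '\0') {
        /* Character position: taken literally, even if it is '|'. */
        unsigned char ch = (unsigned char)text[i++];

        /* Code position: collect 0/1 digits. */
        size_t len = 0;
        while (text[i] == '0' || text[i] == '1') {
            if (len == MAX_CODE_LEN) return -1;
            table->codes[ch][len++] = text[i++];
        }
        table->codes[ch][len] = '\0';
        if (len == 0) return -1;   /* every character needs at least one bit */
        count++;

        if (text[i] == '|') {
            i++;                   /* separator: another entry follows */
        } else {
            break;                 /* anything else ends the table */
        }
    }
    return count;
}
```

Calling parse_encoding on the example line stores "001" in codes['|'], because the second of the two adjacent '|' characters falls in the character position of an entry.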


Thursday 18 January: New-line Characters at the end of encoding.txt

Question: It seems that there are new-line characters (\n) at the end of both encoding.txt and encoding2.txt. These aren't mentioned in the handout. Are they supposed to be there? How do I handle this?
Answer: When you create a text file with a typical text editor, the editor automatically puts a new-line character at the end of the last line. So when you create your own test files, they will have this \n character after the last line as well. test1.txt doesn't have this final new-line character, but that is because it wasn't created using a text editor.

This means that you must account for this character when you are reading in the encoding. Notice that this new-line occurs where you are expecting a "|", a "0", or a "1" character. If you read a new-line character at this point, you know you have reached the end of the table.
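
The rule above can be sketched as a small reader that distinguishes the three things that may follow a code: another digit, a "|", or the final new-line. The sketch assumes each entry is one character followed by its 0/1 digits; read_entry and its return values are hypothetical names, not part of the handout.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Read one entry: a character followed by its code of '0'/'1' digits.
 * Returns  1 if the entry ended with '|' (more entries follow),
 *          0 if it ended with '\n' or end of file (end of the table),
 *         -1 if the input is malformed or the code does not fit in cap. */
int read_entry(FILE *in, int *ch, char *code, size_t cap)
{
    assert(in != NULL && ch != NULL && code != NULL);
    if (cap == 0) return -1;

    int c = fgetc(in);
    if (c == EOF) return -1;
    *ch = c;                       /* character position: may even be '\n' */

    size_t len = 0;
    for (;;) {
        c = fgetc(in);
        if (c != '0' && c != '1') break;
        if (len + 1 >= cap) return -1;
        code[len++] = (char)c;
    }
    code[len] = '\0';
    if (len == 0) return -1;       /* a code needs at least one digit */

    if (c == '|') return 1;        /* separator */
    if (c == '\n' || c == EOF) return 0;   /* end of the table */
    return -1;
}
```

Because the new-line is only treated as the end of the table when it appears after a code, a \n in the character position can still be given a code of its own.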


Thursday 18 January: Safe Delimiters

Clarification: This concerns the announcement posted yesterday about ending the table with a delimiter. A good choice of delimiter is \0, which is the character C uses to terminate strings. The binary representation of \0 is 00000000.
Yes, we realize that if you assume that your input file doesn't contain \0 (or other non-printable characters), then you are certain not to need all 256 entries in your array of character/code pairs if a char on your system is 8 bits. It's OK if your array is not an efficient use of storage. That isn't the point of this assignment, and we don't want you to spend time thinking about this particular issue.

Wednesday 17 January: Ending the Table

Question: In my compressed file, how should I mark where the encoding table ends and my compressed file begins?
Answer: There are a number of ways to do this. You should be able to think of at least two based on the lectures and text readings. You are free to implement this any way you like, but the easiest way is to pick something to use as a delimiter. Since you were told that the files to be compressed are text files, pick your delimiter so that it can't occur in the original file. Use a char whose ASCII code doesn't have a meaningful representation in a text file.
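
As an illustration of the delimiter approach, the sketch below writes the table text followed by a single \0 byte and later skips past it when reading. It assumes \0 is the chosen delimiter and that the table text itself contains no \0; write_table, skip_table, and TABLE_END are hypothetical names, and other designs are equally acceptable.

```c
#include <assert.h>
#include <stdio.h>

#define TABLE_END '\0'   /* assumed delimiter; must not occur in the table text */

/* Write the table text followed by the delimiter. Returns 0 on success. */
int write_table(FILE *out, const char *table_text)
{
    assert(out != NULL && table_text != NULL);
    if (fputs(table_text, out) == EOF) return -1;
    if (fputc(TABLE_END, out) == EOF) return -1;
    return 0;
}

/* Consume bytes up to and including the delimiter.
 * Returns the number of table bytes read, or -1 if the delimiter is missing.
 * On success the stream is positioned at the first byte of compressed data. */
long skip_table(FILE *in)
{
    assert(in != NULL);
    long n = 0;
    int c;
    while ((c = fgetc(in)) != EOF) {
        if (c == TABLE_END) return n;
        n++;
    }
    return -1;
}
```

Only the first delimiter matters to the reader, so it does not cause a problem if the compressed data that follows happens to contain zero bytes.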

Monday 15 January: New Versions of Files

Announcement: On the weekend I posted erroneous versions of test1.txt and encoding2.txt. Please make sure that you have the current versions, which were posted on Monday morning at 11:50.
The problem with the older versions was that the encoding files did not provide codes for all the characters found in the test files. Specifically, both test files contained \n characters, which are typically understood to represent a carriage return or a new line. Notice that when you look at the new encoding2.txt file, it appears to be missing one of the characters and takes 2 lines. This is because one of the characters to be encoded is \n. If you are confused, try looking at encoding2.txt using od -bc.

Saturday 13 January: Testing

Question: Are those two test files you provided on the web-site sufficient testing?
Answer: Absolutely not. Please read the section in the course guide about testing and showing your testing in your report.

Saturday 13 January: How many codes?

Question: How big does my table of code/character pairs have to be? Do I have to be efficient or be able to hold any number of pairs?
Answer: You can assume that there will be no more than 256 different characters in the file you need to encode. It is fine to use an array to hold the table even if many of the array elements are unused for the sample test cases we provided.
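
A fixed-size array with one slot per possible character value is enough for this. The sketch below is one possible layout, not a required one; Codes, codes_init, codes_set, and codes_get are hypothetical names. The cast to unsigned char matters on systems where plain char is signed, since a negative value must not be used as an index.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TABLE_SIZE (UCHAR_MAX + 1)   /* 256 when a char is 8 bits */

/* One slot per possible character value; NULL marks an unused slot.
 * The table stores pointers only, so the caller keeps the strings alive. */
typedef struct {
    const char *code[TABLE_SIZE];
} Codes;

/* Mark every slot as unused. */
void codes_init(Codes *t)
{
    assert(t != NULL);
    for (int i = 0; i < TABLE_SIZE; i++) {
        t->code[i] = NULL;
    }
}

/* Record the code for ch, replacing any earlier code. */
void codes_set(Codes *t, char ch, const char *bits)
{
    assert(t != NULL);
    t->code[(unsigned char)ch] = bits;
}

/* Return the code for ch, or NULL if ch has no code. */
const char *codes_get(const Codes *t, char ch)
{
    assert(t != NULL);
    return t->code[(unsigned char)ch];
}
```

Most slots stay NULL for the sample test cases, which is fine: lookups take constant time and the unused storage is small.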

Tuesday 9 January:

Question: Do we have to follow the advice from advice.ps?
Answer: No. You must meet all the specifications in the assignment handout but the advice handout is just suggestions. You are not required to follow this advice but we think it might help.