Assignment 4: Bonus
Note: No part of this is required for the project, but it is a fun way to earn bonus points. And you might be able to get bonus marks by simply submitting your existing work! See below...
Leaderboard URL
Details
Remember the binary you were given for the main portion of this project? We've taken the eight dictionaries and put them into a standardized binary with one particular order of mys01, mys02, ..., mys08 flags.
You still need to solve your own binary, but you can also play with our binary through looking at the leaderboards. You just submit a directory of C files using svn, located at group_XXXX/a4/bonus/, and we'll run your programs for you and process its output. (Note you'll have to create the bonus directory in a4.)
Note that your program must behave exactly like this: It takes a single command-line argument which is a positive integer. It then generates precisely that many lines of output, each of which is a valid input to the dictionary executable (one of the "I", "R", or "F" commands). (Sadly, the server can only process C files -- no Python!)
Then we tabulate the results, do some curve analysis, generate charts, and tell you which group has identified the most number of dictionaries, and display this on the leaderboard.
Assignment Specifications
Unlike the required part of the assignment where any type of abstract reasoning is okay, this portion of the assignment is marked by a computer. As such, all logic for this assignment is in the design of your C programs.
And instead of identifying dictionaries, you are writing programs specifically designed to differentiate them. For example, what program could you write to generate inputs that would differentiate a binary search tree from an AVL tree?
Furthermore, the runtimes on these processed inputs need to correspond one of the curves n, n2, or n lg n. (Note: if each operation takes constant time, the total runtime of n operations will be O(n). Similarly, if each operation takes time proportional to the number of operations already performed, the total runtime of n operations will be O(n2). If you don't understand these sentences, get someone to help, because they matter in an exam-y sort of sense.) Whether or not you have differentiated two dictionaries depends on how tightly your generator input's runtimes match two different curves for two different dictionaries.
For example, say you write a program that generates n input commands to a dictionary. When run on dictionary mys01 with different n values, it very closely matches an n curve with respect to processing time. When run on mys02, it very closely matches an n lg n curve. When run on mys01, it doesn't match any curve very well. We can say that your program successfully differentiates mys01 and mys02, but doesn't say anything in particular about mys03.
Submission Constraints
- Only files ending in ".c" will be processed, and those files must compile and run under gcc on a Linux machine. If you are not sure this is the case (for example, you are developing on a Windows machine), feel free to copy your .c file to the ECF servers and try compiling it.
- You can only submit up to 10 programs, but you can submit as many times as you like.
- When submitting, please be sure to wait for the leaderboard to update. It will take about five minutes at the longest to run.
Dora's Happy Hour
If your average confidence of dictionary differentiation (see the last column in the user rankings) is in these ranges, we'll give this many bonus points:
- [70%, 85%): 1 bonus point
- [85%, 95%): 2 bonus points
- [95%, 100%]: 4 bonus points
Despite the fact that you need to design your C files slightly differently from the required portion of this programming project, you can still submit the files you have already written and possibly already get some bonus points. Try it out!
Summary
- The mapping of dictionaries mys01, mys02, ..., mys08 is different from your mapping in the required part of the assignment.
- You need to design C files that generate input commands to the dictionaries, such that their curves closely match one of n (actually, any linear function of n), n2 (or any quadratic polynomial in n), or n lg n (or anything O(n). So if you wanted to match an n curve, when your input generator is run with n commands, it should write commands so that it'll take (for example) 2n seconds to process.
- Warning: What we're looking for can be tricky to produce. For example, say you wanted to try to match an n2 curve for mys03. We might give you the number 800000 as input, and your program must produce a single 800000-line output that, when run through the dictionary executable, gives you data points matching an n2 curve. It is not just the overall runtime on the 800000 lines of input that matters, but the shape of the intermediate points. In some cases, we're not even sure whether it's possible to differentiate the dictionaries in this way. Surprise us!