Course Projects

Due: April 15, 2008 by 5 pm

Please email (.pdf or .ps only, no MS-Word) to csc2535prof@cs

Worth: 40%

General Guidelines

The idea of the final project is to give you some experience trying to do a piece of original research in machine learning, or studying a specific algorithm in depth, and writing up your results in a paper-style format. What we expect to see is a simple but original idea/task that you describe clearly, relate to existing work, implement, and test on a small-scale problem. To do this you will need to write code, run it on some data, make some figures, read a few background papers, collect some references, and write a few pages describing your task, the algorithm(s) you used, and the results you obtained. As a rough rule of thumb, spend about as much time doing the work as you would have spent studying for an exam, and then a few hours writing it up (instead of actually writing an exam). Projects can be done individually or in pairs. Of course, the expectations will be higher for pair projects.

Specific Requirements

  • Your project must implement one or more machine learning algorithms and apply them to some data. It may be a comparison of several existing algorithms, or it may propose a new algorithm, in which case it must still compare that algorithm to at least one other approach.
  • Your submission must include at least two figures which graphically illustrate quantitative aspects of your results, such as training/testing error curves, learned parameters, algorithm outputs, input data sorted by results in some way, etc.
  • Your submission must include at least 5 references to previously published papers or book sections.
  • Your submission should follow the generally accepted style of paper writing: include an introduction section to motivate your problem and algorithm, a section describing your approach and how it compares to previous work, a section outlining the experiments you ran and the results you obtained, and a short conclusions section to sum up what you discovered.
  • Your submission must be prepared in the NIPS 2006 paper style (using LaTeX is encouraged but not required), and must be no longer than 5 pages in length (10 pages for pair projects), including all figures, tables, references, etc.
  • Do not hand in any code of any kind.

**Note: You must turn in a brief project proposal (1-2 paragraphs) in class on March 5th. Your project proposal should describe the idea behind what you plan to pursue. You should also briefly describe software you will need to write, and papers (1-2) you plan to read. Please also say if you will have a partner, and if so, who it will be.**

Marking Scheme

The projects will be marked out of 40, with each point being worth 1% of your grade. The following criteria will be taken into account when marking:

  1. Clarity/Relevance of problem statement and description of approach.
  2. Discussion of relationship to previous work and references.
  3. Design and execution of experiments.
  4. Figures/Tables/Writing: easily readable, properly labeled, informative.

Friendly Advice

  • Be selective! Don't choose a project that has nothing to do with machine learning. Don't investigate an algorithm that is clearly doomed to failure or un-implementable. Don't attack a problem that is irrelevant, ill-defined or unsolvable.
  • Be honest! You are not being marked on how good the results are. It doesn't matter one bit if your method is better or worse than the ones you compare to. What matters is that you try something sensible, clearly describe the problem, your method, what you did, and what the results were.
  • Be modest! Don't pick a project that is way too hard. Usually, if you select the simplest thing you can think of to try, and do it carefully, it will take much longer than you think.
  • Be careful! Don't do foolish things like test on your training data, set parameters by cheating, compare unfairly against other methods, include plots with unlabeled axes, use undefined symbols in equations, etc. Do sensible cross-checks like running your algorithms several times, leaving out small parts of your data, adding a few noisy points, etc. to make sure everything still works reasonably well. Make lots of pictures along the way.
  • Learn! The point of the project is to give you a chance to "test drive" the process of writing a paper, which many of you have never done, in a low-stress setting, away from the pressures of your thesis and conference deadlines. Consider this an opportunity to learn how to write code to run large experiments, make nice figures, lay out readable equations, describe your work concisely to a smart but uninitiated reader, etc.
  • Have fun! If you pick something you think is cool, that will make getting it to work less painful and writing up your results less boring.
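
To make the "Be careful!" advice concrete, here is a minimal sketch of what a sensible cross-check might look like: the data is split into training and test sets, the model is fit on the training set only, and the whole experiment is repeated over several random seeds so that the spread of the test error is visible. Everything in it is an assumption chosen for illustration (the synthetic two-class data, the nearest-mean classifier, the function names, the 25% test fraction); it is not a required method or part of the marking scheme.

```python
import random
import statistics


def make_data(n, rng):
    """Synthetic balanced two-class 1-D data (illustrative assumption):
    class 0 ~ N(-1, 1), class 1 ~ N(+1, 1)."""
    data = []
    for i in range(n):
        label = i % 2
        x = rng.gauss(1.0 if label == 1 else -1.0, 1.0)
        data.append((x, label))
    return data


def fit_nearest_mean(train):
    """Estimate one mean per class, using the training points only."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        means[label] = sum(xs) / len(xs)
    return means


def error_rate(means, data):
    """Fraction of points whose nearest class mean is the wrong class."""
    wrong = 0
    for x, y in data:
        pred = min(means, key=lambda c: abs(x - means[c]))
        wrong += pred != y
    return wrong / len(data)


def run_once(seed, n=200, test_fraction=0.25):
    """One experiment: generate data, split, fit on train, score on both."""
    rng = random.Random(seed)
    data = make_data(n, rng)
    rng.shuffle(data)
    n_test = int(n * test_fraction)
    test, train = data[:n_test], data[n_test:]  # test set is never used for fitting
    means = fit_nearest_mean(train)
    return error_rate(means, train), error_rate(means, test)


def repeated_runs(seeds):
    """Repeat the experiment over several seeds and summarize the errors."""
    results = [run_once(s) for s in seeds]
    train_errs = [r[0] for r in results]
    test_errs = [r[1] for r in results]
    return {
        "train_mean": statistics.mean(train_errs),
        "test_mean": statistics.mean(test_errs),
        "test_std": statistics.stdev(test_errs),
    }


if __name__ == "__main__":
    print(repeated_runs(range(10)))
```

Reporting the mean and standard deviation over seeds, rather than a single run, is one simple way to check that a result is not an accident of one particular split.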