60629 -- Project

Study plan Due Date: October 28, 2022
Final report Due Date: December 15, 2022 (by the end of the day)
Handing in: Through ZoneCours


This project will be worth 30% of your final grade. You must work in teams of two or three.

Grading Scheme:

Clarity/Relevance of problem statement and description of approach. (10)
Discussion of relationship to previous work and references. (5)
Design and execution of experiments. (10)
Figures/Tables/Writing: easily readable, properly labeled, informative. (5)

Total (30)


To do: The aim of this project is for you to learn about machine learning by using it to solve a task.

First, select a question that can be answered using machine learning. I expect that your question will be about a model/algorithm or about an application. Then design a study that will try to answer your question. Your study must have an element of novelty. For example, the novelty could be an extension or variation of an existing algorithm, or results of an existing method on a new dataset.

Your study should involve reading and understanding some background material. Your study must involve running some experiments. You are free to use (or not) any of the tools or models we have seen in class.

Alternatively: You could decide to participate in an open challenge, the ML Reproducibility Challenge 2022. Let me know as soon as possible if you are interested in this.

Study plan: Please send me a one-page summary of your proposed research question and study. I will meet with each group to discuss study plans during the lecture of October 31 (English) or November 2 (French). I will send you a schedule the day before. We will probably only have about 15 minutes, so please make sure that your study plan is clear and precise. At the end of the document, you may also include questions that you would like us to discuss.

The report: Your report must contain a description of the question you are trying to answer, a clear description of the model/algorithm you are studying, a survey of related work with proper references, an empirical section that reports your results, and a conclusion that summarizes your findings and (if pertinent) highlights possible future directions of investigation. Your report should be no longer than 10 pages (plus references) for pairs or 13 pages (plus references) for teams of three.

Some advice (mostly taken from csc2515 at UofT):

  • Be selective! Don't choose a project that has nothing to do with machine learning. Don't investigate an algorithm that has a high chance of failing or being un-implementable. Don't attack a problem that is irrelevant, ill-defined or unsolvable. Spend most of your time doing machine learning and not related things such as pre-processing your data.
  • Be honest! You are not being marked on how good the results are. It doesn't matter if your method is worse than the ones you compare to provided you implemented it properly. What matters is that you try something sensible and clearly describe the problem, your method, what you did, and what the results were.
  • Be modest! Don't pick a project that is way too hard. Usually, if you select the simplest thing you can think of to try, and do it carefully, it will take much longer than you think.
  • Be careful! Don't do foolish things like test on your training data, set parameters by cheating, compare unfairly against other methods, include plots with unlabeled axes, use undefined symbols in equations, etc. Do sensible cross-checks like running your algorithms several times, leaving out small parts of your data, adding a few noisy points, etc. to make sure everything still works reasonably well. Make lots of pictures along the way.
  • Learn! The point of the project is to give you a chance to "test drive" the process of doing machine learning. Consider this an opportunity to learn how to write code to run large experiments, make nice figures, lay out readable equations, describe your work concisely to a smart but uninitiated reader, etc.
  • Have fun! If you pick something you think is cool, that will make getting it to work less painful and writing up your results less boring.
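
To make the "Be careful!" advice concrete, here is a minimal sketch of two of the cross-checks mentioned above: evaluating only on held-out data, and repeating the experiment over several random seeds so you can report a spread rather than a single number. Everything in it is an illustrative assumption and not part of the assignment: the synthetic one-dimensional data, the midpoint-threshold classifier, and the helper names make_data, fit_threshold, accuracy, run_once and run_many are all hypothetical stand-ins for your own data and model.

```python
import random
import statistics


def make_data(n, rng):
    """Hypothetical synthetic data: class 0 ~ N(0, 1), class 1 ~ N(2, 1)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(2.0 * label, 1.0)
        data.append((x, label))
    return data


def fit_threshold(train):
    """Fit a toy classifier using the training split only:
    the decision threshold is the midpoint between the two class means."""
    mean0 = statistics.mean(x for x, y in train if y == 0)
    mean1 = statistics.mean(x for x, y in train if y == 1)
    return (mean0 + mean1) / 2.0


def accuracy(threshold, data):
    """Fraction of points whose predicted class (x > threshold) is correct."""
    correct = sum((x > threshold) == (y == 1) for x, y in data)
    return correct / len(data)


def run_once(seed, n=400, test_fraction=0.25):
    """One full experiment: generate, shuffle, split, fit on train, score on test."""
    rng = random.Random(seed)  # seeded so each run is reproducible
    data = make_data(n, rng)
    rng.shuffle(data)
    n_test = int(n * test_fraction)
    test, train = data[:n_test], data[n_test:]  # disjoint splits
    threshold = fit_threshold(train)            # the test split is never seen here
    return accuracy(threshold, test)


def run_many(seeds):
    """Repeat the experiment over several seeds; return mean and std of test accuracy."""
    scores = [run_once(seed) for seed in seeds]
    return statistics.mean(scores), statistics.stdev(scores)


if __name__ == "__main__":
    mean_acc, std_acc = run_many(range(10))
    print(f"test accuracy: {mean_acc:.3f} +/- {std_acc:.3f} over 10 seeds")
```

Reporting the mean and standard deviation over seeds, rather than the best single run, is one simple way to compare fairly against other methods; the same loop can be reused for each method you compare.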