Machine learning is a set of techniques that allow machines to learn from data and experience, rather than requiring humans to specify the desired behavior by hand. Over the past two decades, machine learning techniques have become increasingly central both in AI as an academic field and in the technology industry. This course provides a broad introduction to some of the most commonly used ML algorithms.
The first half of the course focuses on supervised learning. We begin with nearest neighbours, decision trees, and ensembles. Then we introduce parametric models, including linear regression, logistic and softmax regression, and neural networks. We then move on to unsupervised learning, focusing in particular on probabilistic models, but also principal components analysis and K-means. Finally, we cover the basics of reinforcement learning.
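To give a flavour of the material, here is a minimal nearest-neighbour classifier in NumPy (an illustrative sketch, not course-provided code):

    import numpy as np

    def nearest_neighbour_predict(X_train, y_train, x):
        # Predict the label of x as the label of the closest
        # training point under Euclidean distance.
        dists = np.sum((X_train - x) ** 2, axis=1)
        return y_train[np.argmin(dists)]

    X_train = np.array([[0.0, 0.0], [1.0, 1.0]])  # two toy training points
    y_train = np.array([0, 1])                    # their labels
    print(nearest_neighbour_predict(X_train, y_train, np.array([0.9, 0.8])))  # prints 1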
There are two sections of the course. Tutorials will be held in the main lecture room.
Section | Lecture Time | Tutorial Time | Lecture/Tutorial Room | Start | End |
Section 1 | Wednesday 10am-noon | Wednesday noon-1pm | Bahen 1190 | Sept. 11 | Nov. 27 |
Section 2 | Thursday 2-4pm | Thursday 4-5pm | Bahen 1180 | Sept. 12 | Nov. 28 |
Prerequisites (an undergrad course in each is sufficient):
Marking Scheme.
Collaboration policy. You are expected to work on the homeworks by yourself. You should not discuss them with anyone except the TAs or the instructor.
Academic Integrity. By this point in your studies, you've heard this lots of times, so we'll keep it brief: avoid academic offenses (i.e. cheating). All graded work in this course is individual work.
Lateness. Homeworks will be accepted up to 3 days late, with 10% deducted for each day late, rounded up to the nearest day. For example, a homework submitted 2.5 days late counts as 3 days late and receives a 30% deduction.
Remarks. Remark requests for homeworks should be made through MarkUs, and will be considered by the same TA who marked the assignment. The deadline for requesting a remark is typically one week after the marked assignments are returned. Remark requests for exams will be handled by the instructor; details to be announced later.
Exceptions. Exceptions to the course policies, such as late homeworks or missed tests, require the permission of the instructor. For medical excuses, you must obtain an official Student Medical Certificate.
Auditing. If you are a U of T student, you may audit the course (i.e. sit in on the lectures) only if there are empty seats available after everyone enrolled in the course has been seated. Non-students are not permitted to audit; this is University policy. No University resources will be committed to auditors, i.e. we will not mark their homeworks or exams.
Most homeworks will be due on Thursdays at 11:59pm. You will submit through MarkUs; directions are given in the assignment handouts.
Homework | Out | Due | Materials | TA Office Hours |
Homework 1 | 9/13 | 9/26 | [Handout] [clean_real.txt] [clean_fake.txt] [clean_script.py] | Fri 9/20, 12-1pm, in BA3201; Mon 9/23, 11am-noon, in BA3201; Wed 9/25, 2-4pm, in BA3201; Thu 9/26, 11am-noon, in BA3201 |
Homework 2 | 9/25 | 10/10 | [Handout] [q2.py] | Fri 10/4, 12-1pm, in BA3201; Mon 10/7, 11am-noon, in BA3201; Wed 10/9, 2-4pm, in BA3201; Thu 10/10, 11am-noon, in BA3201 |
Homework 3 | 10/11 | | [Handout] | Fri 10/18, 12-1pm, in BA3201; Mon 10/21, 11am-noon, in BA3201; Wed 10/23, 2-4pm, in BA3289; Thu 10/24, 11am-noon, in BA3289 |
Homework 4 | 11/1 | 11/14 | [Handout] [Code and Data] | Fri 11/8, 12-1pm, in BA3201; Fri 11/8, 6-7pm, in BA3201; Mon 11/11, 11am-noon, in BA3201; Mon 11/11, 2-4pm, in BA3201; Thu 11/14, 11am-noon, in BA3201 |
Homework 5 | 11/15 | 11/28 | [Handout] | Wed 11/20, 2-3pm, in BA3201; Mon 11/25, 11am-noon, in BA3201; Wed 11/27, 3-4pm, in BA3201; Wed 11/27, 6-7pm, in BA3201; Thu 11/28, 11am-noon, in BA3201 |
The course will have a midterm and a final exam.
The midterm will be held from 4:10pm to 5:40pm on Wednesday, Oct. 30, in the Health Sciences building, room 610. See Lecture 6 slides for more information. You might find the following practice exams helpful:
Office hours for the midterm will be held according to the following schedule: Fri 10/25, 12-1pm, in BA3201; Fri 10/25, 6-7pm, in BA3201; Mon 10/28, 11am-noon, in BA3201; Tue 10/29, 2-4pm, in BA3201; Wed 10/30, noon-1pm, in BA1190 (lecture room).
Here are the midterm questions and solutions.
The final exam will be held from 3pm to 6pm on Tuesday, Dec. 17, in the Banting Institute, room 131.
Office hours for the final exam will be held according to the following schedule: Mon 12/2, 12-1pm, in BA5287; Tue 12/3, 12-1pm, in BA3289; Wed 12/4, 11am-noon, in BA3201; Thu 12/5, 5-7pm, in BA3201; Fri 12/6, 12-1pm, in BA3201. There will be no office hours the week before the exam (Dec. 9-13), as the TAs will be away attending the NeurIPS conference. Please ask your questions on Piazza instead.
Here is the Fall 2018 final, for practice. (This one was a bit on the hard side. Question 4 was not covered this year.)
Here is a tentative schedule, which will likely change as the course goes on.
Suggested readings are just that: resources we recommend to help you understand the course material. They are not required, i.e. you are only responsible for the material covered in lecture.
ESL = The Elements of Statistical Learning, by Hastie, Tibshirani, and Friedman.
MacKay = Information Theory, Inference, and Learning Algorithms, by David MacKay.
Barber = Bayesian Reasoning and Machine Learning, by David Barber.
Bishop = Pattern Recognition and Machine Learning, by Chris Bishop.
Sutton and Barto = Reinforcement Learning: An Introduction, by Sutton and Barto.
Goodfellow = Deep Learning, by Goodfellow, Bengio, and Courville.
Lecture | Topic(s) | Dates | Slides | Suggested Readings |
Lecture 1 | Introduction; Nearest Neighbours | 9/11, 9/12 | [Slides] | ESL: Chapters 1, 2.1-2.3, and 2.5 |
Lecture 2 | Decision Trees; Ensembles | 9/18, 9/19 | [Slides] | ESL: 9.2, 2.9, 8.7, 15 |
Lecture 3 | Linear Regression; Linear Classifiers | 9/25, 9/26 | [Slides] | Bishop: 3.1, 4.1, 4.3 |
Lecture 4 | Softmax Regression; SVMs; Boosting | 10/2, 10/3 | [Slides] | Bishop: 7.1, 14.3 |
Lecture 5 | Neural Networks | 10/9, 10/10 | [Slides] | Bishop: 5.1-5.3 |
Lecture 6 | Convolutional Networks | 10/16, 10/17 | [Slides] | Course Notes: conv nets, image classification |
Lecture 7 | PCA; K-Means; Maximum Likelihood | 10/23, 10/24 | [Slides] | Bishop: 12.1, 9.1 |
Lecture 8 | Probabilistic Models | 10/30, 10/31 | [Slides] | Bishop: 2.1-2.3, 4.2 |
Lecture 9 | Expectation-Maximization | 11/6, 11/7 | [Slides] | Bishop: 9.2-9.4 |
Lecture 10 | Reinforcement Learning | 11/13, 11/14 | [Slides] | Sutton and Barto: 3, 4.1, 4.4, 6.1-6.5 |
Lecture 11 | Differential Privacy | 11/20, 11/21 | [Slides] | Dwork and Roth, 2014. The Algorithmic Foundations of Differential Privacy. Chapters 2, 3.1-3.5. |
Lecture 12 | Algorithmic Fairness | 11/27, 11/28 | [Slides] | Barocas, Hardt, and Narayanan. Fairness and Machine Learning. Chapters 1 and 2; Zemel et al., 2013. Learning fair representations; Louizos et al., 2015. The variational fair autoencoder; Hardt et al., 2016. Equality of opportunity in supervised learning. |
Tutorial | Topic | Dates | Materials |
Tutorial 1 | NumPy Review; K-Nearest Neighbours | 9/11, 9/12 | [ipynb] Reviews: [Linear algebra slides], [NumPy basics], [ipynb ex1], [SVD slides], [ipynb SVD], [ipynb ex2] |
Tutorial 2 | Eigendecompositions, SVD, Basic Information Theory | 9/18, 9/19 | [Slides] |
Tutorial 3 | Gradient Descent | 9/25, 9/26 | [Slides] [Lecture ipynb] [Worksheet ipynb] [Convexity] |
Tutorial 4 | Random Forests and XGBoost | 10/2, 10/3 | [Slides] |
Tutorial 5 | Autograd + PyTorch | 10/9, 10/10 | [Autograd ipynb] [PyTorch ipynb] |
Tutorial 6 | Convnets | 10/16, 10/17 | [CNNs ipynb] |
Tutorial 7 | Midterm Review | 10/23, 10/24 | [Slides] |
- | Midterm, No Tutorials | 10/30, 10/31 | |
Tutorial 8 | Probabilistic Models, Bayesian Inference, Pyro | 11/6, 11/7 | [ipynb] |
Tutorial 9 | Reinforcement Learning | 11/13, 11/14 | [Slides] |
Tutorial 10 | Reinforcement Learning 2 | 11/20, 11/21 | [Slides] [ipynb] |
Tutorial 11 | Final Exam Review | 11/27, 11/28 | [Slides] |
The easiest option is probably to install everything yourself on your own machine.
If you don't already have Python 3, install it. We recommend some version of Anaconda (Miniconda, a lightweight conda distribution, is probably your best bet). You can also install Python directly if you know how.
Optionally, create a virtual environment for this class and step into it. If you have a conda distribution, run the following commands:

    conda create --name csc2515
    source activate csc2515

(On newer versions of conda, the second command is conda activate csc2515.)
Use pip to install the required packages:

    pip install scipy numpy autograd matplotlib jupyter scikit-learn

(The scikit-learn package is imported in Python as sklearn.)
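To check that everything installed correctly, you can run a quick sanity check along these lines (an illustrative snippet, not part of any assignment); it should complete without errors:

    # All of these imports should succeed if installation worked.
    import numpy as np
    import scipy, autograd, matplotlib, sklearn

    # A trivial NumPy computation to confirm things work end to end.
    x = np.arange(5)
    print("NumPy OK:", x.mean())  # should print: NumPy OK: 2.0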
All the required packages are already installed on the Teaching Labs machines.