CSC 311 Fall 2022: Introduction to Machine Learning

Overview

Machine learning (ML) is a set of techniques that allow computers to learn from data and experience rather than requiring humans to specify the desired behaviour by hand. ML has become increasingly central both in AI as an academic field and in industry. This course provides a broad introduction to some of the most commonly used ML algorithms. It also introduces vital algorithmic principles that will serve as a foundation for more advanced courses, such as CSC412/2506 (Probabilistic Learning and Reasoning) and CSC413/2516 (Neural Networks and Deep Learning).

We start with nearest neighbors, the canonical nonparametric model. We then turn to parametric models: linear regression, logistic regression, softmax regression, and neural networks. We then move on to unsupervised learning, focusing in particular on probabilistic models, but also principal components analysis and K-means. Finally, we cover the basics of reinforcement learning.

Instructors

Rahul Krishnan and Alice Gao

Lecture and Tutorial Times

As of Sep 2022, we plan to hold lectures, tutorials, office hours, the midterm, and the final exam in person in the Fall 2022 term. Given the COVID-19 situation, the university may change this plan.

Please attend your assigned lecture section. Tutorials are optional, but we strongly encourage you to attend them. Auditing is not allowed this term without express written permission from the instructor.

| Section | Lecture | Tutorial |
| --- | --- | --- |
| LEC0101, LEC2001 | Wednesday 12 - 2pm at MC 254 by Rahul G. Krishnan | Friday 12 - 1pm at MC 254. Sept. 23: 12 - 1pm at Bahen 1170 |
| LEC0201 | Tuesday 3 - 5pm at NL 6 by Alice Gao | Thursday 3 - 4pm at AH 400 |

Prerequisites

Communication

There are many ways to get in touch with us. Please follow these rules when you contact us:

Instructor Office Hours

TAs will hold office hours to help with the assignments and the project, as well as with preparing for the midterm and the final exam. TA office hours will be posted in the respective sections.

COVID-19

Although the pandemic has diminished somewhat, all indications are that we are not yet out of the woods. The university no longer requires the use of masks on its premises but encourages it where it is impossible to maintain physical distancing, such as in classrooms and offices.

We strongly recommend that you continue to wear masks during lectures, tutorials, and office hours out of consideration for the health of others. We also strongly encourage you to get vaccine booster shots whenever possible. The instructors plan to wear masks when in close proximity with students, such as when answering questions after lectures or during office hours. However, we may take off our masks when lecturing if we are at a safe distance from students.

Marking Scheme

You must obtain a minimum grade of 40% on the final exam to pass this course.
| Component | % of Final Grade |
| --- | --- |
| 3 Assignments | 35% (~11.67% each) |
| Ethics | 5% |
| Project | 10% |
| Midterm | 20% |
| Final | 30% |

5% of your course grade comes from assignments associated with the ethics module. All of these assignments will be short, and we expect that most of you will receive full marks.

| Assignment | % of Final Grade | Marking |
| --- | --- | --- |
| Pre-module survey | 1% | Full credit for submitting. |
| Class participation | 0.5% | You get this 0.5% automatically. |
| Reflections on In-Class Activity | 2% | Full credit for a good-faith effort. |
| Post-module survey | 1.5% | Full credit for submitting. |
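
As a rough illustration, the marking scheme combines into a final grade as a weighted sum of the component scores. The sketch below assumes each component score is a percentage and that the three assignments are averaged into one score. The names `WEIGHTS`, `FINAL_EXAM_MINIMUM`, and `course_grade` are hypothetical and are not part of any course tooling. How a grade is recorded when the final exam minimum is not met is not specified here, so the function only reports whether the minimum was reached.

```python
# Weights from the marking scheme, as fractions of the final grade.
WEIGHTS = {
    "assignments": 0.35,  # 3 assignments, weighted equally
    "ethics": 0.05,
    "project": 0.10,
    "midterm": 0.20,
    "final": 0.30,
}
FINAL_EXAM_MINIMUM = 40.0  # minimum final exam percentage needed to pass


def course_grade(scores):
    """Return (weighted_grade, meets_final_minimum) for scores given in percent."""
    weighted = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return weighted, scores["final"] >= FINAL_EXAM_MINIMUM


# Hypothetical student; "assignments" is the average of the 3 assignment marks.
example = {
    "assignments": 80.0,
    "ethics": 100.0,
    "project": 90.0,
    "midterm": 70.0,
    "final": 60.0,
}
grade, meets_minimum = course_grade(example)
print(f"weighted grade: {grade:.1f}%, meets final exam minimum: {meets_minimum}")
```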

Course Schedule

This course consists of 12 lectures, 3 assignments, a midterm, and a final exam. See the detailed course schedule below.

Assignments

There will be 3 assignments in this course, posted below. Assignments will be due at 11:59pm on Mondays or Fridays and submitted through MarkUs.

Format: Assignments must be submitted in PDF format through MarkUs. We encourage typesetting using LaTeX, but scans of handwritten solutions are also acceptable as long as they are legible.

Late Submission Policy: Assignments will be accepted up to 3 days late, but 10% will be deducted for each day late, rounded up to the nearest day. No credit will be given for assignments submitted after 3 days.
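
The late policy can be sketched as a small calculation. This is an illustration only. It assumes the 10% deduction means 10 percentage points of the assignment total per late day, and that lateness is measured in hours after the deadline. The function name `late_adjusted_mark` is hypothetical.

```python
import math

PENALTY_PER_DAY = 10.0  # percentage points deducted per late day (assumed interpretation)
MAX_LATE_DAYS = 3       # submissions later than this receive no credit


def late_adjusted_mark(raw_mark, hours_late):
    """Apply the late policy to a raw mark given in percent."""
    if hours_late <= 0:
        return float(raw_mark)  # on time: no deduction
    days_late = math.ceil(hours_late / 24)  # partial days round up to a full day
    if days_late > MAX_LATE_DAYS:
        return 0.0  # more than 3 days late: no credit
    return max(0.0, raw_mark - PENALTY_PER_DAY * days_late)


# A raw mark of 85% submitted at various delays (in hours).
for hours in (0, 2, 25, 72, 73):
    print(hours, late_adjusted_mark(85.0, hours))
```

Under these assumptions, a submission that is two hours late is treated the same as one that is a full day late.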

Collaboration Policy: Collaboration on assignments is not allowed. Each student is responsible for their own work. Discussion of assignments should be limited to clarification of the handout itself, and should not involve any sharing of pseudocode, code, or simulation results. Violation of this policy is grounds for a semester grade of F, in accordance with university regulations.

Assignment 1
Materials: [hw1.pdf] [clean_fake.txt] [clean_real.txt] [clean_script.py]
TA Office Hours:
Thursday, September 22, 10:00AM - 12:00PM (in person in BA2270, CS Help Centre).
Friday, September 23, 8:00AM - 10:00AM (virtual).
Friday, September 23, 5:00PM - 7:00PM (in person in BA2270, CS Help Centre).
Thursday, September 29, 10:00AM - 12:00PM (in person in BA2270, CS Help Centre).
Thursday, September 29, 10:00AM - 12:00PM (virtual).
Friday, September 30, 5:00PM - 7:00PM (in person in BA2270, CS Help Centre).

Assignment 2
Materials: [hw2.pdf] [hw2_code.zip]
TA Office Hours:
Tuesday, October 4, 2022, 5:00PM - 7:00PM (in person in BA2270, CS Help Centre).
Thursday, October 6, 2022, 8:00AM - 10:00AM (virtual).
Friday, October 7, 2022, 11:00AM - 12:00PM (in person in BA2270, CS Help Centre).
Tuesday, October 11, 2022, 5:00PM - 7:00PM (in person in BA2270, CS Help Centre).
Thursday, October 13, 2022, 8:00AM - 10:00AM (virtual).
Friday, October 14, 2022, 4:00PM - 7:00PM (in person in BA2270, CS Help Centre).

Assignment 3
Materials: [hw3.pdf] [code_and_data.zip]
TA Office Hours:
Thursday, October 20, 2022, 8:00AM - 10:00AM (virtual).
Tuesday, October 25, 2022, 11:00AM - 1:00PM, PT286.
Thursday, October 27, 2022, 8:00AM - 10:00AM (virtual).
Thursday, October 27, 2022, 1:00PM - 3:00PM, PT286.
Friday, October 28, 2022, 5:00PM - 7:00PM, CS Help Centre.
Monday, October 31, 2022, 5:00PM - 7:00PM, CS Help Centre.

Final Project

For your final project, you will attempt to solve a Netflix-Competition-style matrix completion problem. The goal is to predict, in the context of a personalized education platform, whether a student will correctly answer a diagnostic question. In groups of 2-3, you will implement and evaluate several algorithms from the course, and then propose and evaluate an extension to one of these algorithms. This will hopefully be a fun exercise that gives you a feel for what you'd do on a daily basis as a data scientist or machine learning engineer.
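
To give a feel for the problem setup, here is a minimal sketch of a matrix completion baseline in plain Python. It is not the project starter code. The data is made up, and the names `responses`, `question_accuracy`, and `predict_correct` are hypothetical. The baseline ignores the student and predicts from how often each question was answered correctly.

```python
# Rows are students, columns are diagnostic questions.
# 1 = answered correctly, 0 = answered incorrectly, None = not observed.
# The data below is made up purely for illustration.
responses = [
    [1, None, 0, 1],
    [None, 1, 0, None],
    [1, 1, None, 0],
]


def question_accuracy(matrix, question):
    """Fraction of observed answers to `question` that were correct."""
    observed = [row[question] for row in matrix if row[question] is not None]
    if not observed:
        return 0.5  # no observations: fall back to an uninformative guess
    return sum(observed) / len(observed)


def predict_correct(matrix, student, question, threshold=0.5):
    """Predict whether `student` answers `question` correctly.

    Returns the observed entry if there is one; otherwise thresholds
    the per-question accuracy, ignoring the student entirely.
    """
    known = matrix[student][question]
    if known is not None:
        return bool(known)
    return question_accuracy(matrix, question) >= threshold


# Fill in the missing entry for student 0 on question 1.
print(predict_correct(responses, student=0, question=1))
```

A baseline like this is a useful point of comparison: an algorithm that also uses information about the student should be able to do at least as well.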

We will post the instructions and the starter code at a later date.

TA Office Hours: TBD

Midterm

The midterm will be held in person during tutorial time slots. See the detailed course schedule above. As an aid, you can bring one 8.5" x 11" double-sided sheet of paper to the midterm.

TA Office Hours: TBD

Final Exam

The final exam schedule will become available around the end of November 2022. Check the A&S page for the final exam schedule. Please do not make travel plans before the final exam schedule is released.

TA Office Hours: TBD

Academic Integrity

Academic integrity is essential to the pursuit of learning and scholarship in a university, and to ensuring that a degree from the University of Toronto is a strong signal of each student’s individual academic achievement. As a result, the University treats cases of cheating and plagiarism very seriously. The University of Toronto’s Code of Behaviour on Academic Matters outlines the behaviours that constitute academic dishonesty and the processes for addressing academic offences. Potential offences include, but are not limited to:

In papers and assignments: On tests and exams: In academic work:

All suspected cases of academic dishonesty will be investigated following procedures outlined in the Code of Behaviour on Academic Matters. If students have questions or concerns about what constitutes appropriate academic behaviour or appropriate research and citation methods, they are expected to seek out additional information on academic integrity from their instructors or from other institutional resources.

Remark Request Policy

If you discover a marking error on an assignment or the midterm, you can submit a remark request. We will consider remark requests up to two weeks after we release the marks for an assignment or the midterm.

Once the two-week period has passed, we will process all the requests as soon as possible.

Special Considerations Policy

If you are unable to complete an assignment on time or write a test due to extraordinary circumstances beyond your control, please apply for a Special Consideration by filling out this special considerations form and sending it to the course email with your supporting documentation. A special consideration request will not be granted automatically, particularly if it is not your first request in the course.

Legitimate reasons to apply for a special consideration request:

A heavy course load, multiple assignments or tests scheduled during the same period, and time management issues are not appropriate reasons for special consideration. Such accommodations are meant for exceptional circumstances only and not as a means to catch up on term work. If you are having difficulty with stress and time management, please contact your college registrar, who can in turn suggest wellness counselling, academic advising, and/or learning strategist services.

Our special considerations policies are as follows.

Student Support Resources

Recommended Textbooks

Suggested readings are optional; they are resources we recommend to help you understand the course material. All of the textbooks listed below are freely available online.

Bishop: Pattern Recognition and Machine Learning.
Hastie, Tibshirani, and Friedman: The Elements of Statistical Learning.
MacKay: Information Theory, Inference, and Learning Algorithms.
Barber: Bayesian Reasoning and Machine Learning.
Sutton and Barto: Reinforcement Learning: An Introduction.
Deisenroth, Faisal, and Ong: Mathematics for Machine Learning.
Shalev-Shwartz and Ben-David: Understanding Machine Learning: From Theory to Algorithms.
Murphy: Machine Learning: A Probabilistic Perspective.

Lecture and Tutorial Materials

We will post lecture and tutorial slides below as the term goes on.

| # | Dates | Topic | Materials | Suggested Readings |
| --- | --- | --- | --- | --- |
| 1 | 9/13, 9/14 | Lecture: Introduction, Nearest Neighbours; Tutorial: Probability Review | Lecture: [Slides]; Tutorial: [Slides]; Code: [Colab Notebook] | ESL: 1, 2.1-2.3, 2.5 |
| 2 | 9/20, 9/21 | Lecture: Decision Trees, Bias-Variance Decomposition; Tutorial: Linear Algebra Review | Lecture: [Slides] [Slides Annotated by Alice]; Tutorial: [Slides]; Code: [Colab Notebook] | Bishop: 3.2; ESL: 2.9, 9.2; Course notes: Notes on Generalization, Bias-Variance Decomposition |
| 3 | 9/27, 9/28 | Lecture: Linear Models I; Tutorial: Bias-Variance Decomposition | Lecture: [Slides] [Slides Annotated by Alice]; Tutorial: [Slides] | Bishop: 3.1; ESL: 3.1-3.2; Course notes: Linear Regression, Calculus |
| 4 | 10/4, 10/5 | Lecture: Linear Models II; Tutorial: Optimization | Lecture: [Slides] [Slides Annotated by Alice]; Tutorial: [Slides]; Code: [Colab Notebook] | Bishop: 4.1, 4.3; ESL: 4.1-4.2, 4.4, 11; Course notes: Linear Classifiers, Training a Classifier |
| 5 | 10/11, 10/12 | Lecture: Linear Models III, Neural Nets I; Tutorial: PyTorch | Lecture: [Slides] [Slides Annotated by Alice]; Tutorial: [Slides]; Code: [Colab Notebook] | |
| 6 | 10/18, 10/19 | Lecture: Neural Networks II; Tutorial: Midterm Review | Lecture: [Slides]; Tutorial: [Slides] | Bishop: 5.1-5.3; Course notes: Multilayer Perceptrons, Backpropagation |
| 7 | 10/24, 10/25 | Lecture: Probabilistic Models; Tutorial: midterm test | | |
| 8 | 11/1, 11/2 | Lecture: Multivariate Gaussians, GDA; Tutorial: Linear Algebra Review II: Eigenvalues, SVD | | |
| 9 | 11/15, 11/16 | Lecture: Principal Component Analysis, Matrix Completion; Tutorial: Final Project Overview | | |
| 10 | 11/22, 11/23 | Lecture: Embedded Ethics Unit on Recommender Systems; Tutorial: no tutorial this week | | |
| 11 | 11/29, 11/30 | Lecture: k-Means, EM Algorithm; Tutorial: EM Algorithm | | |
| 12 | 12/6, 12/7 | Lecture: Reinforcement Learning; Tutorial: Final Exam Review | | |

Computing Resources

For the assignments, we will use Python 3, and libraries such as NumPy, SciPy, and scikit-learn. You have two options:
  1. The easiest option is probably to install everything yourself on your own machine.

    1. If you don't already have Python 3, install it.

      We recommend some version of Anaconda (Miniconda, a lightweight conda distribution, is probably your best bet). You can also install Python directly if you know how.

    2. Optionally, create a virtual environment for this class and activate it. If you have a conda distribution, run the following commands:

          conda create --name csc311 python=3
          conda activate csc311
    3. Use pip to install the required packages:

          pip install scipy numpy autograd matplotlib jupyter scikit-learn
  2. All the required packages are already installed on the Teaching Labs machines.
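
Whichever option you choose, a short script along the following lines can confirm that the packages are importable. This is a sketch, not an official course tool. The mapping from pip package names to module names is an assumption (for example, scikit-learn is imported as `sklearn`, and `jupyter` is detected here through its `jupyter_core` dependency), and the name `missing_packages` is hypothetical.

```python
import importlib.util

# pip package name -> module name checked for importability.
# Assumption: `jupyter` is detected through its `jupyter_core` dependency.
REQUIRED = {
    "scipy": "scipy",
    "numpy": "numpy",
    "autograd": "autograd",
    "matplotlib": "matplotlib",
    "jupyter": "jupyter_core",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only looks up each module without importing it, so it runs quickly and does not fail partway through if one package is broken.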