Machine learning is a set of techniques that allow machines to learn from data and experience, rather than requiring humans to specify the desired behavior by hand. Over the past two decades, machine learning has become increasingly central both to AI as an academic field and to the technology industry. This course provides a broad introduction to some of the most commonly used ML algorithms. It also introduces key algorithmic principles that will serve as a foundation for more advanced courses, such as CSC412/2506 (Probabilistic Learning and Reasoning) and CSC421/2516 (Neural Networks and Deep Learning).
The first half of the course focuses on supervised learning. We begin with nearest neighbours, decision trees, and ensembles. Then we introduce parametric models, including linear regression, logistic and softmax regression, and neural networks. We then move on to unsupervised learning, focusing in particular on probabilistic models but also covering principal component analysis and K-means. Finally, we cover the basics of reinforcement learning.
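As a small taste of the first algorithm on the schedule, here is a minimal k-nearest-neighbours classifier. This is an illustrative sketch only; the toy dataset below is invented for this page and is not part of the course materials.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distance to each training point
    nearest = np.argsort(dists)[:k]               # indices of the k closest points
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]              # most common label among the neighbours

# Toy 2-D dataset: two well-separated clusters.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([0.5, 0.5])))  # -> 0
```

Note that kNN has no training phase at all: the "model" is the dataset itself, which is one reason the course starts here before moving to parametric models.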
For course policies, please see the syllabus [link].
Students are encouraged to sign up for Piazza [link] to join course discussions.
| | Section 1 | Section 2 |
|---|---|---|
Instructor | Mengye Ren | Matthew MacKay |
Lecture Time | Tuesday 13:00 - 15:00 | Thursday 18:00 - 20:00 |
Lecture Room | SS 2135 | SF 1105 |
Tutorial Time | Thursday 14:00 - 15:00 | Thursday 20:00 - 21:00 |
Tutorial Room | SS 2135 | SF 1105 |
Office Hour Time | Tuesday 15:00 - 16:00 | Monday 14:00 - 15:00 |
Office Hour Room | BA 2283 | BA 2283 |
Date | Event |
---|---|
2019.01.07 | Semester begins |
2019.01.17 | Wait list turned off |
2019.01.20 | Last day to add a course |
2019.02.15 | Midterm at 18:00 - 19:00 EX 100 |
2019.02.18 | Family Day - University closed |
2019.03.17 | Last day to drop a course / add CR/NCR option |
2019.04.05 | Last day of classes |
2019.04.19 | Good Friday - University closed |
2019.04.25 | Final exam 09:00 - 12:00 BN 3 |
Unless otherwise specified, homeworks are due at 23:59 on the listed due date.
| | Out | Due | Materials | TA Office Hours |
|---|---|---|---|---|
| Homework 1 | 01.18 | 01.25 | | 01.22 11:00 - 12:00 BA 2283 |
| Homework 2 | 01.25 | 02.01 | [handout] | 01.29 11:00 - 12:00 BA 2283 |
| Homework 3 | 02.01 | 02.08 | | 02.05 11:00 - 12:00 BA 2283 |
| Homework 4 | 03.01 | 03.08 | [handout] | 03.05 11:00 - 12:00 BA 2283 |
| Homework 5 | 03.10 | 03.20 | | 03.19 11:00 - 12:00 BA 2283 |
| Homework 6 | 03.19 | 03.27 | | 03.26 11:00 - 12:00 BA 2283 |
| Homework 7 | 03.26 | 04.03 | [handout] | 04.02 11:00 - 12:00 BA 2283 |
| Homework 8 | 03.26 | | | |
| | Date | Time | Location |
|---|---|---|---|
Midterm office hour | 02.13 | 18:00 - 19:00 | BA 2283 |
Midterm office hour | 02.14 | 16:00 - 17:00 | BA 3201 |
Midterm | 02.15 | 18:00 - 19:00 | EX 100 |
Final office hour | 04.23 | 11:00 - 12:00 | BA 2283 |
Final office hour | 04.24 | 18:00 - 19:00 | BA 2283 |
Final | 04.25 | 09:00 - 12:00 | BN 3 |
Note: slides are mostly reused from the Fall 2018 offering of CSC411, taught by Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla [link].
| | Date | Topic | Materials |
|---|---|---|---|
| Lecture 1 | 01.08, 01.10 | Introduction | [slides] |
| Lecture 2 | 01.08, 01.10 | kNN | [slides] |
| Lecture 3 | 01.15, 01.17 | Decision Trees | [slides] |
| Lecture 4 | 01.15, 01.17 | Ensemble I | [slides] |
| Lecture 5 | 01.22, 01.24 | Ensemble II | [slides] |
| Lecture 6 | 01.22, 01.24 | Linear Regression | [slides] |
| Lecture 7 | 01.29, 01.31 | Linear Classification I | [slides] |
| Lecture 8 | 01.29, 01.31 | Linear Classification II | [slides] |
| Lecture 9 | 02.05, 02.07 | SVM & Boosting | [slides] |
| Lecture 10 | 02.05, 02.07 | Neural Networks I | [slides] |
| Lecture 11 | 02.12, 02.14 | Neural Networks II | |
| Lecture 12 | 02.12, 02.14 | PCA | [slides] |
| Lecture 13 | 02.26, 02.28 | Probabilistic Models I | |
| Lecture 14 | 02.26, 02.28 | Probabilistic Models II | [slides] |
| Lecture 15 | 03.05, 03.07 | k-Means | [slides] |
| Lecture 16 | 03.05, 03.07 | GMM | [slides] |
| Lecture 17 | 03.12, 03.14 | EM | [slides] |
| Lecture 18 | 03.12, 03.14 | Matrix Factorization | |
| Lecture 19 | 03.19, 03.21 | Bayesian Linear Regression | [slides] |
| Lecture 20 | 03.19, 03.21 | Gaussian Processes | [slides] |
| Lecture 21 | 03.26, 03.28 | Reinforcement Learning I | [slides] |
| Lecture 22 | 03.26, 03.28 | Reinforcement Learning II | [slides] |
| Lecture 23 | 04.02, 04.04 | Algorithmic Fairness | [slides] |
| Lecture 24 | 04.02, 04.04 | Review and Outlook | [slides] |
| | Date | Topic | Materials |
|---|---|---|---|
Tutorial 1 | 01.10 | Probability review | [slides] |
Tutorial 2 | 01.17 | Linear algebra review | [slides] [demo] [exercise] |
Tutorial 3 | 01.24 | Gradient descent | [slides] [demo] [exercise] |
Tutorial 4 | 01.31 | Linear algebra review | [slides] [demo] [exercise] |
Tutorial 5 | 02.07 | Midterm review | [slides] |
Tutorial 6 | 02.28 | MCMC | [slides] |
Tutorial 7 | 03.14 | Multivariate Gaussian | [slides] |
Tutorial 8 | 03.21 | Bayesian optimization | [slides] [notebook 1] [notebook 2] |
Tutorial 9 | 03.28 | Reinforcement learning | [notebook] |
Tutorial 10 | 04.04 | Final review | [slides] |
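Tutorial 3 covers gradient descent, the workhorse optimizer behind most of the parametric models in this course. As a minimal sketch of the idea (the quadratic objective below is a toy example chosen for this page, not taken from the tutorial):

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient: w <- w - lr * grad(w)."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3); the minimum is at w = 3.
w_star = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(round(w_star, 4))  # -> 3.0
```

The learning rate matters: too small and convergence is slow, too large and the iterates can overshoot and diverge, a trade-off the tutorial explores in more depth.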
5% of your total mark is allocated to reading a set of classic machine learning papers, which we hope are both interesting and understandable given what you learn in this course. Select two papers of interest from the reading list below and hand in reading notes for each. The notes should summarize the paper's main contribution and give your view of its strengths and weaknesses. Submit the notes on MarkUs under the file name reading.pdf. A completion mark of 5% will be given.
Viola, Paul, and Michael Jones. "Rapid object detection using a boosted cascade of simple features." Computer Vision and Pattern Recognition, 2001. [pdf]
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems, 2012. [pdf]
Mnih, Andriy, and Ruslan R. Salakhutdinov. "Probabilistic matrix factorization." Advances in Neural Information Processing Systems, 2008. [pdf]
Olshausen, Bruno A., and David J. Field. "Sparse coding with an overcomplete basis set: A strategy employed by V1?" Vision Research 37.23 (1997): 3311-3325. [pdf]
Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529. [pdf]
Hardt, Moritz, Eric Price, and Nati Srebro. "Equality of opportunity in supervised learning." Advances in Neural Information Processing Systems, 2016. [pdf]
Tsochantaridis, Ioannis, et al. "Large margin methods for structured and interdependent output variables." Journal of Machine Learning Research 6 (2005): 1453-1484. [pdf]
Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." Advances in Neural Information Processing Systems, 2014. [pdf]
Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in Neural Information Processing Systems, 2015. [pdf]
Coates, Adam, and Andrew Y. Ng. "The importance of encoding versus training with sparse coding and vector quantization." Proceedings of the 28th International Conference on Machine Learning, 2011. [pdf]
Kingma, Diederik P., and Max Welling. "Auto-encoding variational Bayes." Proceedings of the 2nd International Conference on Learning Representations, 2014. [pdf]
Bottou, Léon, and Olivier Bousquet. "The tradeoffs of large scale learning." Advances in Neural Information Processing Systems, 2008. [pdf]
Neal, Radford, and Geoffrey Hinton. "A view of the EM algorithm that justifies incremental, sparse, and other variants." Learning in Graphical Models, 1999. [pdf]
Tipping, Michael, and Christopher Bishop. "Probabilistic principal component analysis." Journal of the Royal Statistical Society, 1999. [pdf]