- Please see the Quercus page for this course.
- Lectures and tutorials will be conducted online using Zoom. A more detailed announcement on this will be made shortly.
- The midterm test and the final exam will be conducted online using Zoom and Markus.
- You will need a video camera, a microphone and a speaker for the midterm and the final exam.

- Machine learning aims to build computer systems that learn from experience, instead of being directly programmed. It is an exciting interdisciplinary field, with historical roots in computer science, statistics, pattern recognition, and even neuroscience and physics. In the past ten years, many of these approaches have converged and led to rapid advances and real-world applications.
- This course is a broad introduction to machine learning. It will start with basic methods of regression and classification and problems of overfitting and the evaluation of learning algorithms, and then move on to more sophisticated methods such as neural networks. Besides reinforcing what you learn in class, the homework assignments will extend your Python skills and introduce you to the basics of scientific programming, data visualization and computational statistics, all of which are ubiquitous in machine learning. As a fringe benefit, you will also find out what all that math you learned is actually used for!

- Formally required: CSC207, (MAT223 or MAT240), MAT232 and STA256.
- Recommended: CSC338 (Numerical Methods) or a course in computational statistics.
- Informally required: a solid knowledge of calculus, linear algebra, probability, computer programming (including Python) and good geometric intuition.
- Machine learning is highly mathematical, and the ability to write and understand rigorous proofs is essential, as is the ability to use mathematics to solve real problems (as in Physics and Engineering). Consequently, mathematical maturity will be assumed.
- Prerequisites will not be waived.

- Anthony Bonner
- email: bonner [at] cs [dot] toronto [dot] edu
- Phone: 905-828-3813 (UTM), 416-978-7441 (St George)
- Office: DH 3090 (UTM), BA 5230 (St George)
- Office hours: Tues and Weds 4-5pm online using Zoom.

- Course syllabus (the same for all lecture sections)

- Classes: Weds 9-11am, Tues 5-7pm, Weds 5-7pm, online using Zoom.

- Tutorials:
- Friday at 9am, 10am, 11am, noon and 5pm, online using Zoom.

- There are nine tutorial sections.

- Teaching Assistants: TBA
- Textbook: There is no required text. We will recommend specific readings from various books and papers, mostly from The Elements of Statistical Learning (ESL), by Hastie, Tibshirani and Friedman. The book can be downloaded for free as a pdf file.

- ATTENDANCE: We expect students to attend all classes and all tutorials. This is especially important because we will cover material in class that is not included in the textbook. Also, the tutorials are not only for review and answering questions; they will also cover new material.
- In general, the lectures will outline the theory of machine learning, the tutorials will provide additional details, examples and guidance, and the assignments will help you turn the theory into practice. Doing the assignments is where you will really learn machine learning!
- The assignments will require proving theorems and writing programs. Often, your programs will implement a theorem you have proved.
- The assignments will require you to do scientific programming on large matrices and vectors. Most of you have never done this before. Scientific programming minimizes the use of loops and maximizes the opportunities for massively parallel programming using GPUs. As you will discover, without this, machine learning is impossibly slow, since programs can take years instead of minutes to finish executing.
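
To make the point about loops concrete, here is a minimal sketch (not taken from the course materials) that computes the same inner product twice: once with an explicit Python loop and once with a single NumPy call. The function names `dot_loop` and `dot_vectorized` and the array size are illustrative assumptions, and the exact timings will depend on your machine.

```python
import time
import numpy as np

def dot_loop(a, b):
    """Inner product computed element by element in a Python loop."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total

def dot_vectorized(a, b):
    """Same inner product as a single NumPy call, with no Python-level loop."""
    return float(np.dot(a, b))

# Illustrative data: two random vectors (the size is an arbitrary choice).
rng = np.random.default_rng(0)
a = rng.standard_normal(200_000)
b = rng.standard_normal(200_000)

start = time.perf_counter()
slow = dot_loop(a, b)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
fast = dot_vectorized(a, b)
vectorized_seconds = time.perf_counter() - start

print(f"loop: {loop_seconds:.4f} s, vectorized: {vectorized_seconds:.6f} s")
print("results agree:", np.isclose(slow, fast))
```

Both versions return the same value up to floating-point rounding; the vectorized one hands the whole computation to NumPy's compiled routines instead of running it one element at a time in the interpreter.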

- Lecture slides
- Tutorial Slides

- The Python 3.8 programming language. Be sure to install a 64-bit version. (A 32-bit version is not accurate enough for serious numerical computing and can result in wrong answers, which may cost you marks.)
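
If you are unsure which build you have installed, the following sketch reports it using only the standard library. The helper names `pointer_bits` and `is_64_bit` are illustrative and not part of the course materials.

```python
import platform
import struct
import sys

def pointer_bits():
    """Return the pointer width of the running interpreter, in bits."""
    return struct.calcsize("P") * 8

def is_64_bit():
    """True when the running interpreter is a 64-bit build."""
    return sys.maxsize > 2**32

print("Python version:", platform.python_version())
print("Pointer width:", pointer_bits(), "bits")
print("64-bit build:", is_64_bit())
```

Run it with the same interpreter you plan to use for the assignments (for example, from inside your Conda environment), since different installations on one computer can differ.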

- The NumPy libraries (Numerical Python)
- The SciPy libraries (Scientific Python)
- The scikit-learn libraries (machine learning in Python)
- The Spyder IDE (Scientific Python Development Environment) (optional).
- Recommendations:

- Use Conda, not Pip, to install all Python-related software.
- Create a Conda virtual environment to install all software in, because some software (notably 64-bit Python) may conflict with other software already installed on your computer.
- Use Anaconda, a Python-based data-science platform that includes many popular data-science packages, including NumPy, SciPy, scikit-learn and Spyder. This is by far the easiest way to go.
- Last year, there were many reported problems with running Python on Windows. Here is some advice for dealing with them. (The main piece of advice is to use Anaconda.) Googling "problems with Python on Windows" seems to indicate problems when using Windows 10. If you are using Windows, please see if the following installation sequence works.

- Recommended installation sequence:
- Open a terminal window (Mac) or a command line window (Windows).

- Install Conda (or update to the latest version by typing *conda update conda*) before doing anything else.
- Create a Conda virtual environment (call it *csc311*)

- Activate the virtual environment

- Use Conda to install Python 3.8 (64-bit version).

- Use Conda to install Anaconda
- Within your virtual environment, run anaconda-navigator, from which you can launch Spyder.
- In Linux and Mac (and maybe on Windows), the following sequence of commands will accomplish this (after conda has been installed or updated):

- conda create --name csc311

- conda activate csc311

- conda install python=3.8
- conda install anaconda
- anaconda-navigator

- To leave the virtual environment, type *conda deactivate*
- To re-enter the virtual environment, type *conda activate csc311*

- You must enter the virtual environment, run anaconda-navigator and then launch Spyder each time you want to write and run programs for this course.
- We recommend the above sequence for installing the software for this course. You may install the software in any way you see fit, but if you do not use the recommended installation sequence, we may not be able to help you with software problems that you encounter.
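
However you install the software, you can check that the libraries listed above are visible to your interpreter. The sketch below is one possible check, not an official course script; `REQUIRED` and `check_packages` are hypothetical names, and it relies only on the standard library (`importlib.metadata` is available from Python 3.8).

```python
import importlib.util
from importlib import metadata

# Import name -> distribution name (the two differ for scikit-learn).
REQUIRED = {"numpy": "numpy", "scipy": "scipy", "sklearn": "scikit-learn"}

def check_packages(required=REQUIRED):
    """Map each import name to its installed version, or None if it is missing."""
    report = {}
    for module_name, dist_name in required.items():
        if importlib.util.find_spec(module_name) is None:
            report[module_name] = None
            continue
        try:
            report[module_name] = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            # Importable, but no distribution metadata was found.
            report[module_name] = "unknown"
    return report

for name, version in check_packages().items():
    print(f"{name}: {version if version is not None else 'NOT INSTALLED'}")
```

Run it inside the activated *csc311* environment; a package reported as NOT INSTALLED suggests the environment was not active when the software was installed.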

- Documentation:

- NumPy quick-start tutorial (highly recommended)
- SciPy lecture notes by Valentin Haenel, Emmanuelle Gouillart, and Gael Varoquaux (eds)
- Tutorial on machine learning in scikit-learn

- All assignments are to be done individually, without collaboration with others.
- Assignments should be submitted electronically at Markus.
- Submission instructions

- Assignment 1 (Due Oct 8)

- Data files for Assignment 1:

- Solutions for Assignment 1.

- Assignment 2 (Due Nov 8)
- Data files for Assignment 2:

- Solutions for Assignment 2.

- Assignment 3 (Due Dec 7)

- Solutions for Assignment 3.

- Online using Zoom and Markus.

- You will need a video camera, a microphone and a speaker.
- The test is open book.

- A summary of the procedures for taking the midterm test online.

- Detailed instructions for taking the test online. You should read these carefully well before the test date.

- Friday Oct 22(?), 8-9pm, plus 15 minutes to upload your answers.

- You are also responsible for all material covered before the midterm in tutorials, assignments and assignment solutions.
- There will be some Python programming questions on the midterm.
- The midterm test will follow the "I don't know" policy: if you do not know the answer to a question, and you write "I don't know", you will receive 20% of the marks of that question. If you just leave a question blank with no such statement, you will get 0 marks for that question.
- Here is an old midterm test, and here are the solutions.
- Midterm solutions

- Online using Zoom and Markus, like the midterm.
- You will need a video camera, a microphone and a speaker.

- The exam is open book.
- A summary of the procedures for taking the exam online.
- Detailed instructions for taking the exam online. You should read these carefully well before the exam date.
- You must receive at least 30% on the final exam to pass the course.
- The exam will cover the entire course, but will emphasize material not on the midterm.
- The most difficult questions (and the most marks) will be on material related to the assignments, since this is what you know best.

- You are responsible for all lectures, tutorials, assignments and assignment solutions.
- There will be some Python programming questions on the exam.

- The exam will follow the "I don't know" policy: if you do not know the answer to a question, and you write "I don't know", you will receive 20% of the marks of that question. If you just leave a question blank with no such statement, you will get 0 marks for that question.
- Here is an old exam and the solutions. (Note: this exam does not cover exactly the same material that we covered this year, but there is a lot of overlap.)
- More details will be published shortly.

- The deferred exam will also be online and will follow the rules and policies given above for the final exam.

- Here is a summary of the procedures for taking the deferred exam.
- Here are detailed instructions for taking the deferred exam online.

- You should read the summary and detailed instructions carefully before the exam date. They are similar to those given above for the final exam.

- Trevor Hastie, Robert Tibshirani, and Jerome Friedman, *The Elements of Statistical Learning*, Second Edition. (ESL)

- Christopher Bishop, Pattern Recognition and Machine Learning, 2006. Free downloadable pdf. (PRML)

- Richard S. Sutton and Andrew G. Barto, *Reinforcement Learning: An Introduction*, Second Edition, 2018. (RL)

- Ian Goodfellow, Yoshua Bengio and Aaron Courville, *Deep Learning*, 2016.

- David MacKay, Information Theory, Inference, and Learning Algorithms.

- Ethem Alpaydin, Introduction to Machine Learning, 2nd Edition, 2010. (Good for undergrads)

- Kevin Murphy, *Machine Learning: a Probabilistic Perspective*. (advanced)
- Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani, *An Introduction to Statistical Learning*, 2017.
- Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, 2014.

- Petersen and Pedersen, *The Matrix Cookbook*. Free Download
- F. R. Kschischang, *Probability Refresher*. Free Download
- Lipschutz and Lipson, *Schaum's Outline of Linear Algebra*. (very handy, very cheap)
- Wrede and Spiegel, *Schaum's Outline of Advanced Calculus*. (very handy, very cheap)

- Linear algebra
- Calculus
- Neural networks
- Maximum likelihood estimation
- Towards Data Science
- Metacademy

- 3Blue1Brown

- Students should become familiar with and are expected to adhere to the Code of Behaviour on Academic Matters, which can be found in the UTM Calendar. The following web sites may also be helpful: