- Complete grades for assignments 1 and 2 are now available on
Quercus.

- The final exam will cover Lectures 1-5, 7-10 and 12-13, but not Lecture 17 (ensemble methods).
- The final exam will cover everything in the course (except
ensemble methods), but will focus on material covered in the
assignments (since you know it best) and will not focus on
material covered in the midterm.

- Last year's exam and solutions are now available, below.

- Solutions to Assignment 3 are now available, below.

- As announced on Quercus, the accuracies and numbers of
iterations have been updated in Questions 3(a), (b) and (c) of
Assignment 3.

- You will have some time during the last class (Dec 2 and 4) to perform online course evaluations, so please bring a laptop, tablet or smart phone to class.
- In Question 3 of Assignment 3, cross entropy refers to the
cross entropy computed wrt the data that the neural net was
trained on.
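
For reference, here is a minimal sketch of how cross entropy on the training data might be computed. The function, variable names and data are illustrative only, not taken from the assignment, and this shows the binary case (the multi-class version is analogous).

```python
import numpy as np

def cross_entropy(y, p):
    """Average cross entropy between 0/1 targets y and predicted
    probabilities p (clipped to avoid taking log(0))."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

# Evaluate on the same data the model was trained on:
y_train = np.array([0, 1, 1, 0])
p_train = np.array([0.1, 0.8, 0.9, 0.2])
print(cross_entropy(y_train, p_train))
```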

- I have clarified some items regarding accuracy in Question 3
of Assignment 3.

- I have clarified the use of loops in Question 3 of Assignment
3.

- Assignment 3 is now complete. No more questions will be
added.

- Question 3 of Assignment 3 is now partially available.
More details will be added shortly.

- A typo in question 2(a) of Assignment 3 has been fixed.

- This Weds, Nov 27, my office hours will be half an hour
earlier, i.e., from 5:30-6:30pm.

- Questions 1(d) and (e) in Assignment 3 have been clarified.

- Question 2 of Assignment 3 is now available.

- Solutions for Assignment 2 are now available, below.

- Please read the new material on page 2 of Assignment 3, right
after the cover page.

- The first question of Assignment 3 is now available. More questions will be added shortly.
- In Questions 5(a) and 5(g) of Assignment 2, you may use one
loop each to create an array of subplots.
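
As an illustration of the one permitted loop, a sketch like the following creates a grid of subplots. The data, grid shape and styling here are stand-ins, not the assignment's.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data: 16 random 28x28 "images" (not the assignment's data).
images = np.random.rand(16, 28, 28)

fig = plt.figure(figsize=(8, 8))
for i in range(16):  # the single permitted loop
    ax = fig.add_subplot(4, 4, i + 1)
    ax.imshow(images[i], cmap="gray")
    ax.axis("off")
fig.savefig("subplots.png")
```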

- In all assignments, you should avoid the use of higher-order
functions, such as Python's map function and any NumPy
function listed under "functional programming", such as
apply_along_axis. These are just loops in disguise.
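
To illustrate (with a made-up example, not one from the assignments): each line below computes row sums, but the first two hide a loop inside a higher-order function, while the last two use an explicit loop or a fully vectorized call.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)

# Disallowed: higher-order functions that are loops in disguise.
rows_map = list(map(np.sum, A))                 # Python's map
rows_apply = np.apply_along_axis(np.sum, 1, A)  # NumPy "functional programming"

# Allowed: an explicit loop, or (better) a vectorized call.
rows_loop = np.array([A[i].sum() for i in range(A.shape[0])])
rows_vec = A.sum(axis=1)  # fully vectorized -- no loop at all
```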

- Question 2(e) on Assignment 2 has been clarified.

- Some typos in Question 5(b) and 5(h) on Assignment 2 have been
corrected.

- Question 3(b) on Assignment 2 has been clarified.

- Marked midterms can be picked up in my office during office
hours.

- In Question 2(d) of Assignment 2, do not use any built-in
functions to draw the decision boundary. (Details are now
on the assignment.)

- In Question 3(a) of Assignment 2, the problem is to show that
the given formula is the maximum-likelihood solution (see slide
17).

- Do not use any higher-order Python functions (such as map) in
Assignment 2.

- Assignment 2 is now complete. No more questions will be
added.

- A sample plot of decision boundaries has been added to
Question 4(a) of Assignment 2.

- The first seven parts of Question 5 of Assignment 2 are now
available. Please reread all parts, as some requirements
have changed slightly.

- The requirement for an explanation has been removed from
Question 5(c) of Assignment 2.

- The first four parts of Question 5 of Assignment 2 are now
available.

- Question 4 of Assignment 2 is now complete.

- Midterm solutions are available below.
- Solutions to Assignment 1 are available below.

- Questions 4 and 5 of Assignment 2 are now partially available.

- Marked midterms will be returned in tutorial on Nov 7.

- The first three questions of Assignment 2 are now available.

- Question 2 of Assignment 2 is now complete.

- The first question of Assignment 2 is now available, below.

- The midterm test will cover lectures 1, 2, 3, 4 and 5 and the
first half of lecture 7 (up to and including slide 13, where
cross-entropy is defined).

- The midterm test will not cover Decision Trees (lecture 6).

- A link to an old midterm is now available, below.

- The latest version of Assignment 1 has a marks breakdown.

- There is a typo in Question 5(b) of Assignment 1. You
should print out the validation error, not the test error.

- The MIDTERM TEST will be in the first class after reading week
(i.e., Oct 21 and 23). See details below.

- Assignment 1 is now complete. No more questions will be added.
A marking scheme will be added shortly.

- In Question 4(a), you should use at most one loop.

- In Question 4(c) of assignment 1, do not use any outputs of the function lstsq other than the weight vector, w.
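
For reference, lstsq returns a 4-tuple (solution, residuals, rank, singular values), and only its first element should be used. The data below is made up for illustration; the rcond=None argument applies to newer NumPy versions.

```python
import numpy as np

# Illustrative data (not the assignment's): 20 points, 3 features.
rng = np.random.RandomState(0)
X = rng.rand(20, 3)
w_true = np.array([1.0, -2.0, 0.5])
t = np.dot(X, w_true) + 0.01 * rng.randn(20)

# Take only the first element of the returned tuple: the weight vector w.
w = np.linalg.lstsq(X, t, rcond=None)[0]
```
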
- The first five questions of assignment 1 are now
available. One more question will be added shortly.

- After a slight change, the course information sheet has been
approved by the powers that be. It is available on the
course web site. (The only change is to the policy on missed
term work.)

- The first question of assignment 1 is now available, below.
More questions will be added shortly.

- The first tutorials will be on Thursday Sept 12.

- Machine learning aims to build computer systems that learn from experience, instead of being directly programmed. It is an exciting interdisciplinary field, with historical roots in computer science, statistics, pattern recognition, and even neuroscience and physics. In the past ten years, many of these approaches have converged and led to rapid advances and real-world applications.
- This course is a broad introduction to machine learning. It will start with basic methods of regression and classification and problems of overfitting and the evaluation of learning algorithms, and then move on to more sophisticated methods such as neural networks. As part of the course, you will expand your Python skills to include numerical and scientific programming. As a fringe benefit, you will also find out what all that math you learned is actually used for!

- Formal: CSC207H5, 290H5,
(MAT134H5/136H5/134Y5/135Y5/137Y5/157Y5/233H5)/(MAT133Y5, 233H),
MAT223H5/240H5, STA256H5

- Informal: a basic knowledge of calculus, linear algebra, probability, geometry and computer programming, including Python.
- Recommended: CSC338 (Numerical Methods).

- Mathematical maturity will be assumed.
- Prerequisites will not be waived.

- Anthony Bonner
- email: bonner [at] cs [dot] toronto [dot] edu
- Phone: 905-828-3813 (UTM), 416-978-7441 (St George)
- Office: DH 3090 (UTM), BA 5230 (St George)
- Office hours: Mon,Weds 6-7pm

- Course information sheet:
- for lecture section 101 (Mon)
- for lecture section 102 (Wed)
- These two sheets are identical except for the dates of the midterm, which will be held in class.
- You must attend the midterm for the lecture section you are
enrolled in.

- Classes:

- Mon 3-5pm in CC 2150

- Weds 3-5pm in MN 2110
- Tutorials:
- Thursday 10-11am in IB 210

- Thursday 11am-noon in MN 2100

- Thursday noon-1pm in IB 385
- Thursday 1-2pm in IB 210

- Teaching Assistants:
- Mohan Zhang

- Mustafa Ammous

- Hamed Heydari (marking only), h [dot] heydari [at] mail [dot] utoronto [dot] ca
- Textbook: There is no required text, but we will recommend specific readings from *Pattern Recognition and Machine Learning* by Christopher Bishop.
- ATTENDANCE: We expect students to attend all classes and all tutorials. This is especially important because we will cover material in class that is not included in the textbook. Also, the tutorials will not just be for review and answering questions; they will also cover new material.
- Lecture slides
- Tutorial Slides

- The Python 2.7
programming language. (Python 3.6 works most of the time, but
will occasionally produce frustrating errors or program
behaviour that you can't figure out, so please don't use
it.) Choose a 64-bit version.

- The NumPy libraries (Numerical Python)
- The SciPy libraries (Scientific Python)
- The scikit-learn libraries (machine learning in Python)
- The Spyder IDE (Scientific Python Development Environment) (optional)
- The easiest way to install this software on your own computer is to download and install Anaconda, a Python-based data-science platform that includes many popular data-science packages, including NumPy, SciPy, scikit-learn and Spyder.
- Documentation:

- Numpy quick-start tutorial (highly recommended)
- SciPy lecture notes by Valentin Haenel, Emmanuelle Gouillart, and Gael Varoquaux (eds)
- Tutorial on machine learning in scikit-learn

- All assignments are to be done individually, without collaboration with others.
- Assignments should be submitted electronically at the UTORSubmit website.
- Submission instructions

- Assignment 1 (Due
Friday Oct 11) No more questions will be added.

- Data files for Assignment 1: data1.pickle.zip and data2.pickle.zip
- Solutions
for Assignment 1.
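
For reference, a pickle file such as data1.pickle can be loaded as shown below. What the assignment files actually contain is specified in the assignment itself; the file name and dictionary used here are only stand-ins.

```python
import pickle

# Write a small stand-in pickle file, then read it back.
with open("example.pickle", "wb") as f:
    pickle.dump({"Xtrain": [[1, 2], [3, 4]], "Ttrain": [0, 1]}, f)

with open("example.pickle", "rb") as f:
    data = pickle.load(f)
print(data["Ttrain"])
```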

- Assignment 2 (Due Friday Nov 15) No more questions will be added.
- Python package for Assignment 2: bonnerlib2.py.zip
- MNIST data file for Assignment 2: mnist.pickle.zip (compressed), mnist.pickle (uncompressed)
- Solutions for
Assignment 2.

- Assignment 3 (Due Weds
Dec 4) No more questions will be added.

- Solutions
for Assignment 3.

- In class: Mon Oct 21 and Weds Oct 23. You must attend
the midterm for the lecture section you are enrolled in.
(The two midterms will be different.)

- Starts at 3:15pm sharp!
- 50 minutes, ends at 4:05pm.
- There will be a short break and a lecture after the test.
- CLOSED BOOK.
- CHEAT SHEET: You will be allowed a 1-page "cheat sheet" (8.5x11in, single-sided). It should contain no more than 6,000 characters. If typed, it should be in 12pt font or larger.
- No other aids are allowed.
- You are responsible for all material covered before the midterm, including lectures, tutorials, assignments and assignment solutions.
- There *will* be some Python programming questions on the midterm.
- The midterm test will follow the "I don't know" policy: if you do not know the answer to a question, and you write "I don't know", you will receive 20% of the marks of that question. If you just leave a question blank with no such statement, you will get 0 marks for that question.
- Here is an old midterm test, and here are the solutions.
- Midterm solutions for lecture sections LEC0101 and LEC0102

- CLOSED BOOK.
- You must receive at least 40% on the final exam to pass the course.
- CHEAT SHEET: You will be allowed a 1-page "cheat sheet" (8.5x11in, two-sided). It should contain no more than 12,000 characters. If typed, it should be in 12pt font or larger.
- No other aids are allowed.
- The exam will cover the entire course, but will emphasize material not on the midterm.
- You are responsible for all lectures, tutorials, assignments and assignment solutions.
- There *will* be some Python programming questions on the exam.

- The exam will follow the "I don't know" policy: if you do not know the answer to a question, and you write "I don't know", you will receive 20% of the marks of that question. If you just leave a question blank with no such statement, you will get 0 marks for that question.
- Here is an old exam
and the solutions.

- David MacKay, *Information Theory, Inference, and Learning Algorithms*. Free download.

- Ethem Alpaydin, *Introduction to Machine Learning*, 2nd Edition, 2010. (Good for undergrads.)

- Kevin Murphy, *Machine Learning: A Probabilistic Perspective*. (Advanced.)

- Petersen and Pedersen, *The Matrix Cookbook*. Free download.

- F. R. Kschischang, *Probability Refresher*. Free download.

- Lipschutz and Lipson, *Schaum's Outline of Linear Algebra*. (Very handy, very cheap.)

- Wrede and Spiegel, *Schaum's Outline of Advanced Calculus*. (Very handy, very cheap.)

- Useful videos:

- Students should become familiar with and are expected to adhere to the Code of Behaviour on Academic Matters, which can be found in the UTM Calendar. The following web sites may also be helpful: