Homepage for CSC311, Fall 2020
Introduction to Machine Learning
Department of Mathematical and Computational Sciences
University of Toronto Mississauga
ANNOUNCEMENTS:
- Instructions for taking the deferred exam online are given
below. Please read them carefully before the exam date.
- Instructions for taking the midterm test online are now
available below. Please read them carefully before the
test date.
- We will be using the Python 3.8 programming language.
- Please see the Quercus page for this course.
- Lectures and tutorials will be conducted online, either
through Zoom or Quercus. A more detailed announcement on
this will be made shortly.
COURSE DESCRIPTION:
- Machine learning aims to build computer systems that learn
from experience, instead of being directly programmed. It is an
exciting interdisciplinary field, with historical roots in
computer science, statistics, pattern recognition, and even
neuroscience and physics. In the past ten years, many of these
approaches have converged and led to rapid advances and
real-world applications.
- This course is a broad introduction to machine learning.
It will start with basic methods of regression and
classification, the problem of overfitting and the evaluation
of learning algorithms, and then move on to more sophisticated
methods such as neural networks. Besides reinforcing what
you learn in class, the homework assignments will extend your
Python skills and introduce you to the basics of scientific
programming, data visualization and computational statistics,
all of which are ubiquitous in machine learning. As a
fringe benefit, you will also find out what all that math you
learned is actually used for!
PREREQUISITES:
- Formal: CSC207H5, MAT223H5, MAT232H5.
- Informal: a solid knowledge of calculus, linear algebra,
probability, geometry and computer programming, including
Python.
- Recommended: CSC338 (Numerical Methods) and STA256
(Statistics).
- Mathematical maturity will be assumed.
- Prerequisites will not be waived.
INSTRUCTOR:
- Anthony Bonner
- Email: bonner [at] cs [dot] toronto [dot] edu
- Phone: 905-828-3813 (UTM), 416-978-7441 (St George)
- Office: DH 3090 (UTM), BA 5230 (St George)
- Office hours: Tues and Weds 4-5pm online.
GENERAL INFORMATION:
- Course syllabus (the same for all three lecture sections)
- Classes:
  - LEC9101: Tues 5-7pm online
  - LEC9102: Weds 9-11am online
  - LEC9103: Weds 5-7pm online
- Tutorials:
  - Friday at 9am, 10am and 11am online.
  - There are six tutorial sections, two in each time slot.
- Teaching Assistants:
  - Mustafa Ammous (9, 10 and 11am)
  - Mohammadreza Moravej (10 and 11am)
  - Mohammad Alomrani (9am)
  - Haotian Yang (marking only)
  - Fengjia Zhang (marking only)
- Textbook: There is no required text. We will recommend
specific readings from various books and papers, mostly from
The Elements of Statistical Learning (ESL), by Hastie,
Tibshirani and Friedman. The book can be downloaded for
free as a pdf file.
- ATTENDANCE: We expect students to attend all classes and all
tutorials. This is especially important because we will cover
material in class that is not included in the textbook. Also,
the tutorials will not only review material and answer
questions, but will also cover new material.
- Lecture slides
- Tutorial slides
SOFTWARE:
- The Python 3.8 programming language. Be sure to install a
64-bit version. (A 32-bit version is not accurate enough for
serious numerical computing and can result in wrong answers,
which may cost you marks.)
- The NumPy libraries (Numerical Python)
- The SciPy libraries (Scientific Python)
- The scikit-learn libraries (machine learning in Python)
- The Spyder IDE (Scientific Python Development Environment)
(optional).
- Recommendations:
  - Use Conda, not Pip, to install all Python-related software.
  - Create a Conda virtual environment to install all software
    in, because some software (notably 64-bit Python) may
    conflict with other software already installed on your
    computer.
  - Use Anaconda, a Python-based data-science platform that
    includes many popular data-science packages, including
    NumPy, SciPy, scikit-learn and Spyder. This is by far the
    easiest way to go.
  - There are many reported problems with running Python on
    Windows. Here is some advice for dealing with them.
- Recommended installation sequence:
  - Open a terminal window (Mac) or a command line window
    (Windows).
  - Install Conda (or update to the latest version) before
    doing anything else.
  - Create a Conda virtual environment.
  - Activate the virtual environment.
  - Use Conda to install Python 3.8 (64-bit version).
  - Use Conda to install Anaconda.
  - Within your virtual environment, run anaconda-navigator,
    from which you can launch Spyder.
- Documentation:
  - Tutorial on machine learning in scikit-learn
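Once the installation sequence above is complete, it may help to confirm that the active environment matches what the course expects. The following is only a minimal sketch, not an official course script: the function name describe_environment and the constant COURSE_PACKAGES are hypothetical names chosen for illustration. It uses only the standard library to report the Python version, whether the interpreter is 64-bit, and whether NumPy, SciPy and scikit-learn can be found.

```python
import importlib.util
import platform
import struct
import sys

# Import names of the course libraries (scikit-learn is imported as "sklearn").
COURSE_PACKAGES = ("numpy", "scipy", "sklearn")


def describe_environment(packages=COURSE_PACKAGES):
    """Summarize the running interpreter and which packages can be found."""
    return {
        "python_version": platform.python_version(),
        "is_python_3_8": sys.version_info[:2] == (3, 8),
        # A pointer occupies 8 bytes (64 bits) in a 64-bit interpreter.
        "pointer_bits": struct.calcsize("P") * 8,
        # find_spec returns None when a top-level package is not installed.
        "packages": {
            name: importlib.util.find_spec(name) is not None
            for name in packages
        },
    }


if __name__ == "__main__":
    info = describe_environment()
    print("Python version:", info["python_version"])
    print("Python 3.8:", info["is_python_3_8"])
    print("Interpreter bits:", info["pointer_bits"])
    for name, found in info["packages"].items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Save this under any file name and run it with python from inside the activated virtual environment. If a package is reported as MISSING, or the interpreter reports 32 bits, the environment probably needs to be recreated following the sequence above.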
ASSIGNMENTS:
MIDTERM TEST:
- Online and open book.
- A summary of the procedures for taking the midterm test
online.
- Detailed instructions for taking the test online. You should
read these carefully well before the test date.
- Thursday Oct 22, 8-9pm, plus 15 minutes to upload your
answers.
- You are responsible for all material up to and including
Lecture 4 (Neural Nets). You are also responsible for all
material covered before the midterm in tutorials, assignments
and assignment solutions.
- There will be some Python programming questions on the
midterm.
- The midterm test will follow the "I don't know" policy: if
you do not know the answer to a question, and you write "I don't
know", you will receive 20% of the marks of that question. If
you just leave a question blank with no such statement, you will
get 0 marks for that question.
- Here is an old midterm test, and
here are the solutions.
- Midterm solutions
FINAL EXAM:
- Online using Zoom and MarkUs, like the midterm.
- You will need a video camera, a microphone and a speaker.
- The exam is open book.
- A summary of the procedures for taking the exam online.
- Detailed instructions for taking the exam online. You should
read these carefully well before the exam date.
- You must receive at least 30% on the final exam to pass the
course.
- The exam will cover the entire course, but will emphasize
material not on the midterm.
- The most difficult questions (and the most marks) will be on
material related to the assignments, since this is what you know
best.
- You are responsible for all lectures, tutorials, assignments
and assignment solutions.
- There will be some Python programming questions on the exam.
- The exam will follow the "I don't know" policy: if you do not
know the answer to a question, and you write "I don't know", you
will receive 20% of the marks of that question. If you just
leave a question blank with no such statement, you will get 0
marks for that question.
- Here is an old exam and the solutions. (Note: this exam does
not cover exactly the same material that we covered this year,
but there is a lot of overlap.)
- More details will be published shortly.
DEFERRED EXAM:
- The deferred exam (in February) will also be online and will
follow the rules and policies given above for the final exam.
- Here is a summary of the procedures for taking the deferred
exam.
- Here are detailed instructions for taking the deferred exam
online.
- You should read the summary and detailed instructions
carefully before the exam date. They are similar to those
given above for the final exam.
ADDITIONAL RESOURCES:
Machine Learning Books: Most of the following books are
either readable online as a web page or downloadable as a free
pdf.
- Trevor Hastie, Robert Tibshirani, and Jerome Friedman, The
Elements of Statistical Learning, Second Edition. (ESL)
- Christopher Bishop, Pattern Recognition and Machine
Learning, 2006. Free downloadable pdf. (PRML)
- Richard S. Sutton and Andrew G. Barto, Reinforcement
Learning: An Introduction, Second Edition, 2018. (RL)
- Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep
Learning, 2016.
- David MacKay, Information Theory, Inference, and Learning
Algorithms.
- Ethem Alpaydin, Introduction to Machine Learning, 2nd
Edition, 2010. (Good for undergrads)
- Kevin Murphy, Machine Learning: A Probabilistic Perspective.
(advanced)
- Gareth James, Daniela Witten, Trevor Hastie, and Robert
Tibshirani, An Introduction to Statistical Learning, 2017.
- Shai Shalev-Shwartz and Shai Ben-David, Understanding
Machine Learning: From Theory to Algorithms, 2014.
Mathematical Background:
- Petersen and Pedersen, The Matrix Cookbook. Free Download
- F. R. Kschischang, Probability Refresher. Free Download
- Lipschutz and Lipson, Schaum's Outline of Linear Algebra.
(very handy, very cheap)
- Wrede and Spiegel, Schaum's Outline of Advanced Calculus.
(very handy, very cheap)
Useful videos and web-pages:
PLAGIARISM AND CHEATING:
- Students should become familiar with and are expected to
adhere to the Code of Behaviour on Academic Matters, which can
be found in the UTM Calendar. The following web sites may also
be helpful: