Homepage for CSC 2545, Fall 2014

Kernel Methods and Support Vector Machines

University of Toronto

ANNOUNCEMENTS:
When Assignment 3 asks to print the decision boundary and margins, it is sufficient to simply print the trained SVM object.
If you are using Matlab on the CDF machines for Assignment 3, you need to type matlab-R2014a at the Linux prompt to get the version of Matlab that supports Spider.
Assignment 3 is now complete.
Assignment 3 is due on Thursday December 18 in my office (BA5230).
The first 2 questions of Assignment 3 are now available.
You can look up your CDF account name here. Your student number is your initial password.
The first question of Assignment 3 is now available. More questions will be added shortly.
I have updated the Spider software. Please download and install the new version. (Nov 25)
Assignment 3 will involve Matlab programming using an SVM and machine-learning environment called the Spider. See the instructions below.
If you do not have other access to Matlab, it is available on the CDF machines. All registered students in this course will soon have accounts on CDF. You can log in to CDF remotely using ssh to cdf.toronto.edu. See www.cdf.toronto.edu for more details.
Assignment 2 is now complete.
The first five questions of Assignment 2 are now available. More questions will be added shortly.
Assignment 1 is now complete.
Assignment 1 is now available, below. More questions will be added shortly.
The course room has been changed to WB 342 (Wallberg Building).
The first class will be on Thursday Sept 11.
Course Description:

The introduction of Support Vector Machines (SVMs) in the 1990s led to an explosion of applications and a deepening of theoretical analysis that have established SVMs as one of the standard tools for machine learning and data mining. They now deliver state-of-the-art performance in real-world applications such as text categorization, handwritten character recognition, image classification and bioinformatics.
This course provides a comprehensive introduction to SVMs and other kernel methods, including theory, algorithms and applications. Topics covered will be selected from the following: support vector classification and regression; novelty detection and feature extraction; non-linear dimensionality reduction; reproducing kernel maps; regularization; statistical learning theory and robust estimation; convex optimization and implementation; kernel design and applications.

Basic information:
Research Area 12 (Machine Learning), Methodology 2 (Continuous Models).
Lectures: Thursday 2-4pm in WB 342. The first class is on Sept 11.
Expected work: Three or four assignments.
Prerequisites: Linear algebra, calculus (including partial derivatives), basic probability, and a willingness to program in Matlab. Mathematical maturity will be assumed.
Instructor:
Anthony Bonner
email: my last name [at] cs [dot] toronto [dot] edu
Phone: 416-978-7441
Office: BA 5230
Office hours: by appointment
Handouts:
Course outline
Lecture slides
Assignments:
Assignment 1 No more questions will be added.
Assignment 2 No more questions will be added.
Assignment 3 No more questions will be added.
Text:
Bernhard Scholkopf and Alex Smola, Learning with Kernels, MIT Press, 2002.
About a third of the book is freely available on the book's web page (click on "Contents"), as are numerous lecture slides.
Additional references:
A quick review of real symmetric matrices.
Cristianini and Shawe-Taylor, An Introduction to Support Vector Machines, Cambridge University Press, 2000.
Shawe-Taylor and Cristianini, Kernel Methods for Pattern Analysis, Cambridge University Press, 2004.
Steinwart and Christmann, Support Vector Machines, Springer, 2008.
Boyd and Vandenberghe, Convex Optimization, Cambridge University Press, 2004. Freely available on the web.
Hastie, Tibshirani and Friedman, The Elements of Statistical Learning, Springer, 2009.
Background material:
Lipschutz and Lipson, Schaum's Outline of Linear Algebra. (very handy, very cheap)
Wrede and Spiegel, Schaum's Outline of Advanced Calculus. (very handy, very cheap)
SVM and machine-learning software:
Links to SVM software and other SVM resources can be found here.
In this course, we will be using the Spider, a complete object-orientated environment for machine learning in Matlab. It can be run on Windows, Linux and Mac OS X (and presumably on any system that supports Matlab). You must have Matlab version 13 or greater installed in order to run the Spider. We will be using the Spider core, not the extras. The Spider website has demos, tutorials, documentation and installation instructions.
Please download an improved version of the Spider software here.
Two of the most popular SVM implementations are SVMlight and LIBSVM. You can use these from within Spider if you like, but they must be downloaded and installed separately and then linked to Matlab.
Spider has its own built-in SVM implementation based on Matlab's quadratic programming facilities. On large data sets, it is not nearly as fast as SVMlight or LIBSVM, but it is much easier for Matlab users to get started with, and it is perfectly adequate for the small data sets used in the assignments in this course.
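To give a rough idea of what "an SVM based on quadratic programming" means, here is a minimal sketch of the soft-margin SVM dual solved as a box-constrained quadratic program. It is not Spider's implementation and does not use Matlab's solver: it is written in plain Python, it omits the bias term (so the separating surface passes through the origin in feature space), and it uses simple coordinate ascent in place of a general QP routine. The function names (linear_kernel, train_dual_svm, decision_value) and the sweeps parameter are hypothetical and chosen only for illustration.

```python
# Sketch only: the SVM dual without a bias term,
#   maximize   sum_i a_i - 1/2 * sum_ij a_i a_j y_i y_j K(x_i, x_j)
#   subject to 0 <= a_i <= C,
# solved by cyclic coordinate ascent (an assumption made for simplicity,
# not the method used by Spider or by Matlab's QP facilities).

def linear_kernel(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def train_dual_svm(X, y, kernel=linear_kernel, C=1.0, sweeps=200):
    """Return the dual variables a_i for labels y_i in {-1, +1}."""
    n = len(X)
    # Precompute the kernel (Gram) matrix.
    K = [[kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            if K[i][i] <= 0.0:
                continue  # degenerate point; leave its multiplier unchanged
            # y_i * f(x_i), where f is the current decision function.
            margin = y[i] * sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            # Exact maximizer along coordinate i, then clip to the box [0, C].
            step = (1.0 - margin) / K[i][i]
            alpha[i] = min(C, max(0.0, alpha[i] + step))
    return alpha


def decision_value(x, X, y, alpha, kernel=linear_kernel):
    """Evaluate f(x) = sum_i a_i y_i K(x_i, x); the predicted label is its sign."""
    return sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, y, X))


if __name__ == "__main__":
    # Tiny toy data set, separable by a line through the origin.
    X = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -1.0)]
    y = [1, 1, -1, -1]
    alpha = train_dual_svm(X, y)
    for xi, yi in zip(X, y):
        print(xi, yi, decision_value(xi, X, y, alpha))
```

Points whose multiplier a_i ends up nonzero are the support vectors. A general-purpose QP solver would also handle the equality constraint that the bias term introduces, which this sketch deliberately leaves out.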
Matlab:
Matlab Primer.
Matlab Intro.
Prof. Christara's A Brief Introduction to MatLab.
Cleve Moler's Introduction to MATLAB chapter from his new textbook.
Here is a good site for Matlab information and tutorials.
Another good site for Matlab information, tutorials and software.
Octave:
You may use Octave instead of Matlab for homework assignments. However, I cannot guarantee that I will be able to help you if you have problems. Octave is very similar to Matlab and is freely available on the web, but the user interface is not as convenient.
Instructions for installing and running Octave in Windows.
More details on installing Octave in Windows.
Octave manual
GNU Octave Repository
Octave Wiki
Plagiarism and Cheating:
The academic regulations of the University are outlined in the Code of Behaviour on Academic Matters.
Advice on academic offences