About CSC321
Required math background

Deep Dream 
Announcements
All announcements will be posted on Piazza.
Teaching team
Instructor: Michael Guerzhoy. Office: BA5244, Email: guerzhoy at cs.toronto.edu (please include CSC321 in the subject, and please ask questions on Piazza if they are relevant to everyone.)
Study Guide
The CSC321 study guide (continuously updated)
Getting help
Michael's office hours: Wednesday 2:30-3:30, Thursday 6-7, Friday 2-3. Or email for an appointment. Or drop by to see if I'm in. Feel free to chat with me after lecture.
Piazza is a third-party discussion forum with many features that are designed specifically for use with university courses. We encourage you to post questions (and answers!) on Piazza, and to read the questions your classmates have posted. However, since Piazza is run by a company separate from the university, we also encourage you to read the privacy policy carefully and only sign up if you are comfortable with it. If you are not comfortable with signing up for Piazza, please contact me by email to discuss alternative arrangements.
Lectures and Tutorials
L0101/L2001: TR12 in BA1200 (sometimes also R2 in BA1200). Tutorials R2 SS1073 with Alvin/TBA (odd-numbered student#) and SS1084 (even-numbered student#) with Aditya/TBA
L5101/L2501: T6-8 in BA1220 (sometimes also T8 in BA1220). Tutorials T8, BA2159 with Matt/Aditya/TBA (odd-numbered student#) and BA2185 with Sara/TBA (even-numbered student#)
Software
We will be using the Python 2 NumPy/SciPy stack in this course. It is installed on CDF.
For the first two projects, the most convenient Python distribution to use is Anaconda. If you are using an IDE and download Anaconda, be sure to have your IDE use the Anaconda Python.
I recommend the IEP IDE, available as part of Pyzo. To run IEP on CDF, simply type iep in the command line. You can download my modification of IEP, which includes a parentheses matcher, here.
We will be using Google's TensorFlow in the second half of the course. Note that TensorFlow is difficult to set up on Windows (but fairly straightforward to install on Linux or OS X). Instructions for installing TensorFlow and/or running it on CDF are here.
Resources
Geoffrey Hinton's Coursera course contains great explanations for the intuition behind neural networks.
Deep Learning by Yoshua Bengio, Ian Goodfellow, and Aaron Courville is an advanced textbook with good coverage of deep learning and a brief introduction to machine learning.
Learning Deep Architectures for AI by Yoshua Bengio contains an in-depth tutorial on learning RBMs.
Pattern Recognition and Machine Learning by Christopher M. Bishop is a very detailed and thorough book on the foundations of machine learning. A good textbook to buy to have as a reference for this and future machine learning courses, though it's not required.
The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome Friedman is also an excellent reference book, available on the web for free at the link.
The CS229 Lecture Notes by Andrew Ng are a concise introduction to machine learning.
Andrew Ng's Coursera course contains excellent explanations.
Pedro Domingos's Coursera course is a more advanced course.
CS231n: Convolutional Neural Networks for Visual Recognition at Stanford (archived 2015 version) is an amazing advanced course, taught by Fei-Fei Li and Andrej Karpathy (a UofT alum). The course website contains a wealth of materials.
CS224d: Deep Learning for Natural Language Processing at Stanford, taught by Richard Socher. CS231, but for NLP rather than vision. More details on RNNs are given here.
Python Scientific Lecture Notes by Valentin Haenel, Emmanuelle Gouillart, and Gaël Varoquaux (eds) contains material on NumPy and working with image data in SciPy. (Free on the web.)
Evaluation
The better of:
35%: Projects
30%: Midterm
35%: Final Exam
Or:
35%: Projects
10%: Midterm
55%: Final Exam
You must receive at least 40% on the final exam to pass the course.
Projects
A sample report/LaTeX template containing advice on how to write project reports for AI courses is here (see the pdf and tex files). (The example is based on Programming Computer Vision, pp. 27-30.) Key points: your code should generate all the figures used in the report; describe and analyze the inputs and the outputs; add your interpretation where feasible.
Project 1 (7%): Face Recognition and Gender Classification using kNN (due: Feb. 3 at 10PM)
Project 2 (10%): Handwritten Digit Recognition with Neural Networks (due: Feb. 28 at 10PM, extended from Feb. 22 and Feb. 26; no late submissions after Feb. 29 at 10PM)
Project 3 (10%): Convolutional Neural Networks and Transfer Learning in TensorFlow (due: Mar. 21 at 10PM, Mar. 24 at 10PM for the bonus)
Project 4 (8%): Fun with RNNs (due: Apr. 4 at 10PM, Apr. 7 at 10PM for the bonus)
Lateness penalty: 10% of the possible marks per day, rounded up (4% for Projects 3 and 4). Projects are only accepted up to 72 hours (3 days) after the deadline.
Exam
2016 exam paper
Midterm
March 4, 4pm-6pm, BA1180 (even student numbers) and BA1190 (odd student numbers). (Makeup midterm for those who have a documented conflict with the main timeslot (a screenshot and/or explanatory email is sufficient): 6pm-8pm on the same day, location TBA.)
Coverage: the lectures and the projects, focusing on the lectures.
Lecture notes
Coming up: introduction to NumPy/SciPy, k-Nearest Neighbours, linear regression and gradient descent. See Andrew Ng's Coursera course Weeks 1 and 2, Notes, part 1 from CS229, and Friedman et al. 2.1-2.3. Gradients: Understanding the Gradient, Understanding Pythagorean Distance and the Gradient
Week 1: Welcome to CSC321, numpy_demo.py (bieber.jpg), K Nearest Neighbours, Linear Regression.
Why is it that you can only barely see the maple leaf on the red channel of the Bieber photo? Because the red channel is close to 1 for the red flag, so it shows as white when viewing just the red channel of the photo.
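The effect is easy to reproduce in NumPy. The sketch below uses a tiny synthetic image rather than the actual photo (the pixel values are made up for illustration): a pure-red pixel and a white pixel have the same value in the red channel, so they look identical when that channel is displayed as a grayscale image.

```python
import numpy as np

# A 2x2 RGB "image" with values in [0, 1]: a pure-red pixel (like the
# flag background), a white pixel, a reddish pixel, and a gray pixel.
img = np.array([[[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]],
                [[1.0, 0.1, 0.1], [0.5, 0.5, 0.5]]])

# Viewing just the R channel as a grayscale image:
red_channel = img[:, :, 0]

# The pure-red pixel and the white pixel are both at the maximum value,
# so red regions are indistinguishable from white in the red channel.
print(red_channel)
```

(With Matplotlib, `imshow(red_channel, cmap="gray")` would show both top pixels as white.)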
Q in lecture: are there a lot of examples of functions with deep and narrow local minima that are easy to miss? A: Yes, all over the place. For example, we can convert any NP-hard problem into such a function (otherwise it wouldn't be NP-hard!). See here for an example of a construction of such a function. In some scenarios, neural networks can be another example. (Although much of the time, they are not.) More on this later.
Week 2: More numpy: Intro to vectorization. Visualizing functions of two variables: surface plots, contour plots, and heatmaps. Understanding gradient descent: on the board. Implementing gradient descent: in one variable, in multiple variables.
Thinko in lecture: see here for the real story of how to find a vector that points in the direction of steepest ascent in 3D.
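For reference, gradient descent in one variable (as implemented on the board and in the linked code) can be sketched in a few lines; this is an illustrative sketch, not the course's handout code, and the function being minimized is an arbitrary example:

```python
def grad_descent_1d(df, x0, alpha=0.1, n_iter=100):
    """Minimize a 1-D function given its derivative df, starting at x0."""
    x = x0
    for _ in range(n_iter):
        x = x - alpha * df(x)   # step in the direction opposite the gradient
    return x

# Minimize f(x) = (x - 3)**2, whose derivative is 2*(x - 3); minimum at x = 3.
x_min = grad_descent_1d(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```

With a learning rate that is too large, the iterates can overshoot and diverge; too small, and convergence is slow. That trade-off is the point of the learning-rate discussion later in the course.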
Multiple linear regression, linear classifiers
Maximum likelihood (on the board).
The Home Depot Challenge on Kaggle.
Coming up: Logistic Regression, intro to Bayesian inference, multilayer Neural Networks. See lectures VI and VIIIX from Andrew Ng's course and the Neural Networks lecture from Pedro Domingos's course.
Recommended lectures from Prof. Hinton's Coursera course: Lectures 1-3.
Week 3: Learning Linear Regression and Logistic Regression models using Maximum Likelihood.
Intro to Bayesian Inference. bayes.py.
Classification Using Multilayer Neural Networks.
Tutorial: Learning linear regression models with Gradient Descent; vectorizing code. Tutorial plan, code, galaxy data (info)
Week 4: Vectorizing neural networks; Backpropagation; One-Hot Encoding; activation functions, intro to optimizing neural networks. (UPDATED Feb. 4)
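One-hot encoding is a one-liner with NumPy's integer-array indexing; a minimal sketch (the variable names are illustrative, not from the course handouts):

```python
import numpy as np

labels = np.array([2, 0, 1, 2])   # class labels for 4 training cases
n_classes = 3

# One row per training case; a 1 in the column of the correct class.
one_hot = np.zeros((labels.size, n_classes))
one_hot[np.arange(labels.size), labels] = 1
print(one_hot)
```

Each row sums to 1, which is what makes the encoding directly comparable to the softmax outputs of a network.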
Coming up: learning features with multilayer neural networks (reading/viewing: Lecture 5 from Hinton's course); generalization and overfitting (reading/viewing: Lecture 9 from Hinton's course; Ch. 7 of Deep Learning); momentum and learning rates (Lecture 6 from Hinton's course; Ch. 8 of Deep Learning); Convolutional Neural Networks (Lecture 5 from Hinton's course; Ch. 9 of Deep Learning); Recurrent Neural Networks (Lecture 7 from Hinton's course; Ch. 10 of Deep Learning.)
Week 5: How Neural Networks See. Overfitting; preventing overfitting.
Some more weight matrix images (obtained by training on 64x64 images).
Just for fun: The Dead Salmon Study; steep learning curves on Yann LeCun's website.
Tutorial: Backpropagation and gradient flow. Slides, code. Please do not print out the slides before the tutorial: the idea is for you to work most of those things out in tutorial, and printing out the slides defeats the point.
Vectorization extra review: Recap of gradients for linear regression, computing the gradients using loops, Tutorial 1 recap: first try at vectorization, a nicer vectorization scheme that doesn't use sum(). Slides, code.
Week 6: wrap-up of the regularization lecture; a bit about neuroscience; introduction to convolutional neural networks; ConvNet architectures.
Just for fun: Hubel and Wiesel's research, Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition
Week 7: Understanding How ConvNets See, Deep Dream, Neural Style
Just for fun: Deep Dream Grocery Trip, Neural Style demo.
Just for fun: The Matthew Effect
Week 8: midterm take-up; intro to the handout TensorFlow code; Recurrent Neural Networks
Just for fun: Donald Trump RNN; The Unreasonable Effectiveness of RNN
Week 9: RNNs and Vanishing Gradients: Part 2 (one_layer.py); Solving the Vanishing Gradients Problem with GRU/LSTM; Machine Translation with LSTM
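The vanishing-gradient phenomenon is easy to demonstrate numerically. The toy sketch below (not the linked one_layer.py; the weight value is made up for illustration) backpropagates through many steps of a single-unit RNN: each step multiplies the gradient by w * tanh'(pre-activation), and with that factor below 1 in magnitude, the gradient shrinks exponentially with the number of timesteps.

```python
import numpy as np

w = 0.9       # recurrent weight (illustrative value with |w| < 1)
h = 1.0       # initial hidden state
grad = 1.0    # gradient of the loss w.r.t. the final hidden state

for t in range(50):
    h_pre = w * h                            # pre-activation at this step
    h = np.tanh(h_pre)                       # forward pass
    grad *= w * (1 - np.tanh(h_pre) ** 2)    # chain rule through one step

print(grad)   # tiny after 50 steps
```

Gating architectures like GRUs and LSTMs address exactly this: they create paths along which the gradient is not repeatedly squashed.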
Tutorial (Week 9/10 for the afternoon section, Week 10 for the evening section): the gradient in min-char-rnn.py, line by line. Solutions (SPOILERS: don't look before attempting at least the first few lines) here. Handout here.
Coming up: More on Bayesian learning (see Lectures 9 and 10 in Hinton's Coursera course and Radford Neal's tutorial); Markov Chain Monte Carlo (see the beginning of this review); Restricted Boltzmann Machines (see Section 5 of Learning Deep Architectures for AI). For RBMs, see also Lectures 11-14 from Hinton's Coursera course for a somewhat different approach than what we're doing. Note that all those readings are more complete, but also more advanced, than what we do in class. You are, of course, only responsible for what's done in class.
Week 10: Improving Learning in Neural Networks; Intro to Bayesian Inference and Markov Chain Monte Carlo (MCMC).
Week 11: Markov Chain Monte Carlo (MCMC)
Coming up: Wrapping up RBMs; a little bit of autoencoders (see the beginning of Lecture 15 in the Coursera course); Dropout
Week 12: Training RBMs
In-lecture mini-tutorial: Metropolis Algorithm (code, handout, visualization)
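The Metropolis algorithm from the mini-tutorial fits in a few lines. The sketch below is illustrative (it is not the linked handout code): with a symmetric proposal, each move is accepted with probability min(1, p̃(x')/p̃(x)), where p̃ is the unnormalized target density, so the normalizing constant is never needed.

```python
import numpy as np

np.random.seed(0)

def p_tilde(x):
    # Unnormalized standard normal density, p(x) proportional to exp(-x^2/2).
    return np.exp(-x ** 2 / 2)

x = 0.0
samples = []
for _ in range(20000):
    x_prop = x + np.random.normal(scale=1.0)          # symmetric proposal
    if np.random.uniform() < p_tilde(x_prop) / p_tilde(x):
        x = x_prop                                    # accept the move
    samples.append(x)                                 # keep x on reject, too

samples = np.array(samples[2000:])                    # discard burn-in
print(samples.mean(), samples.std())                  # approx. 0 and 1
```

Note that the current state is recorded even when a proposal is rejected; forgetting that is a classic bug that biases the sampler.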
MarkUs
All project submissions will be done electronically, using the MarkUs system. You can log in to MarkUs using your CDF login and password.
To submit as a group, one of you needs to "invite" the other to be partners, and then the other student needs to accept the invitation. To invite a partner, navigate to the appropriate Assignment page, find "Group Information", and click on "Invite". You will be prompted for the other student's CDF user name; enter it. To accept an invitation, find "Group Information" on the Assignment page, find the invitation listed there, and click on "Join". Only one of you should send an invitation: if both students send one, neither of you will be able to accept the other's. So make sure to agree beforehand on who will send the invitation! Also, remember that, when working in a group, only one of you should submit the solutions.
To submit your work, again navigate to the appropriate Exercise or Assignment page, then click on the "Submissions" tab near the top. Click "Add a New File" and either type a file name or use the "Browse" button to choose one. Then click "Submit". You can submit a new version of any file at any time (though the lateness penalty applies if you submit after the deadline) — look in the "Replace" column. For the purposes of determining the lateness penalty, the submission time is considered to be the time of your latest submission.
Once you have submitted, click on the file's name to check that you have submitted the correct version.
LaTeX
Web-based LaTeX interfaces: WriteLaTeX, ShareLaTeX
TeXworks, a cross-platform LaTeX front-end. To use it, install MiKTeX on Windows or MacTeX on Mac.
Detexify², a LaTeX symbol classifier.
The LaTeX Wikibook.
Additional LaTeX Documentation, from the home page of the LaTeX Project.
Policy on special consideration
Special consideration will be given in cases of documented and serious medical or personal circumstances.