CSC 2541 Fall 2016:

Differentiable Inference and Generative Models

[Figure: training a GAN, and images synthesized from a GAN]

Overview

In the last few years, new inference methods have enabled major advances in probabilistic generative models. These models let us generate novel images and text, find meaningful latent representations of data, take advantage of large unlabeled datasets, and even perform analogical reasoning automatically. This course will tour recent innovations in inference methods such as recognition networks, black-box stochastic variational inference, and adversarial autoencoders. It will also cover recent advances in generative model design, such as deconvolutional image models, thought vectors, and recurrent variational autoencoders. The class will have a major project component.

Prerequisites

This course is designed to bring students to the current frontier of knowledge on these methods, so that ideally, their course projects can make a novel contribution. A previous background in machine learning such as CSC411 or ECE521 is strongly recommended. Linear algebra, basic multivariate calculus, basics of working with probability, and programming skills are required.

Where and When

What are generative models?

Generative modeling loosely refers to building a model of data, for instance p(image), that we can sample from. This is in contrast to discriminative modeling, such as regression or classification, which tries to estimate conditional distributions such as p(class | image).
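To make the distinction concrete, here is a minimal sketch in Python. Everything in it is illustrative and not part of the course materials: the "images" are made-up one-dimensional scalars, and the generative model is just a Gaussian per class. Because the generative model defines p(x), we can draw brand-new samples from it; Bayes' rule then also yields p(class | x), whereas a purely discriminative model would estimate that conditional directly and could never generate data.

    # Minimal sketch of the generative/discriminative distinction.
    # All data and model choices below are toy assumptions, not course code.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy 1-D "images": class 0 centered at -1, class 1 centered at +1.
    x0 = rng.normal(-1.0, 0.5, size=500)
    x1 = rng.normal(+1.0, 0.5, size=500)

    # Generative model: fit p(x | class) as a Gaussian per class, plus p(class).
    mu = [x0.mean(), x1.mean()]
    sigma = [x0.std(), x1.std()]
    prior = [0.5, 0.5]

    # Because we modeled p(x), we can sample brand-new data:
    c = rng.integers(0, 2)
    new_x = rng.normal(mu[c], sigma[c])

    def log_gauss(x, m, s):
        # Log density of N(m, s^2) at x.
        return -0.5 * ((x - m) / s) ** 2 - np.log(s * np.sqrt(2 * np.pi))

    def p_class1_given_x(x):
        # Bayes' rule recovers the conditional p(class | x) from the
        # generative fit; a discriminative model estimates this directly
        # without ever being able to generate samples.
        joint = [np.exp(log_gauss(x, mu[k], sigma[k])) * prior[k] for k in (0, 1)]
        return joint[1] / (joint[0] + joint[1])

    print(new_x, p_class1_given_x(0.3))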

Why generative models?

Even when we're only interested in making predictions, there are practical reasons to build generative models:

Differentiable inference

We already know how to specify some expressive and flexible generative models, including entire languages of models that can express arbitrarily complicated structure. However, until recently such models were hard to apply to real datasets, because inference methods (such as Markov chain Monte Carlo methods) were not usually fast or scalable enough to run on large models or even medium-sized datasets.
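For contrast, here is a minimal Metropolis-Hastings sketch. It is my own toy example, not course code, and the standard-normal target density is made up. Note that each step ends in a hard accept/reject branch, so the sampler is not a differentiable function of the model parameters, which is part of why such inference was hard to tune and scale.

    # Toy Metropolis-Hastings sampler for an unnormalized target density.
    import numpy as np

    rng = np.random.default_rng(0)

    def log_target(z):
        # Made-up example target: an unnormalized standard normal.
        return -0.5 * z ** 2

    z = 0.0
    samples = []
    for _ in range(5000):
        proposal = z + rng.normal(0.0, 0.5)        # random-walk proposal
        log_accept = log_target(proposal) - log_target(z)
        if np.log(rng.uniform()) < log_accept:     # non-differentiable branch
            z = proposal
        samples.append(z)

    print(np.mean(samples), np.std(samples))       # roughly 0 and 1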

The past few years have seen major progress in methods to train and do inference in generative models, loosely following four strands:

The common thread among these approaches that lets them scale to high-dimensional models is that their loss functions are end-to-end differentiable. This is in contrast to previous inference strategies such as MCMC or early variational inference strategies, which required alternating inference and optimization steps and didn't allow gradient-based tuning of the inference procedure.
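As a small illustration of what "end-to-end differentiable" buys us, here is a sketch of the reparameterization trick, one of the tools behind these methods. The objective f and all constants below are toy assumptions of mine: writing z = mu + sigma * eps with eps ~ N(0, 1) turns a Monte Carlo estimate of E[f(z)] into a deterministic, differentiable function of the parameters, so plain stochastic gradient descent can tune the inference procedure directly.

    # Minimal sketch of the reparameterization trick (illustrative only).
    import numpy as np

    rng = np.random.default_rng(0)

    def f(z):
        return (z - 3.0) ** 2       # toy "loss" inside the expectation

    def grad_f(z):
        return 2.0 * (z - 3.0)

    def reparam_grad(mu, log_sigma, n=32):
        # z = mu + sigma * eps with eps ~ N(0, 1), so dz/dmu = 1 and
        # dz/dlog_sigma = sigma * eps; the chain rule flows through f.
        eps = rng.normal(size=n)
        sigma = np.exp(log_sigma)
        z = mu + sigma * eps
        g = grad_f(z)
        return g.mean(), (g * sigma * eps).mean()

    # Plain stochastic gradient descent on E_{z ~ N(mu, sigma^2)}[f(z)]:
    mu, log_sigma = 0.0, 0.0
    for _ in range(2000):
        g_mu, g_ls = reparam_grad(mu, log_sigma)
        mu -= 0.05 * g_mu
        log_sigma -= 0.05 * g_ls

    print(mu, np.exp(log_sigma))    # mu approaches 3, sigma shrinks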

These new inference schemes are allowing great progress in generative models of images and text.

Course Structure

After the first two lectures, each week a different student, or pair of students, will present on an aspect of these methods, using a couple of papers as reference. I'll provide guidance about the content of these presentations.

In-class discussion will center around:

The hope is that these discussions will lead to actual research papers, or resources that will help others understand these approaches.

Grades will be based on:

Project

Students can work on projects individually, in pairs, or even in triplets. The grade will depend on the ideas, how well you present them in the report, how clearly you position your work relative to existing literature, how illuminating your experiments are, and how well-supported your conclusions are.

Each group of students will write a short (around 2 pages) research project proposal, which ideally will be structured similarly to a standard paper. It should include a description of a minimum viable project, some nice-to-haves if time allows, and a short review of related work. You don't have to do what your project proposal says - the point of the proposal is mainly to have a plan and to make it easy for me to give you feedback.

Towards the end of the course, everyone will present their project in a short (roughly 5-minute) presentation.

At the end of the class you'll hand in a project report (around 4 to 8 pages), ideally in the format of a machine learning conference paper such as NIPS.

Project report grading rubric

Schedule