Picture of me

Geoffrey Roeder

I'm a graduate student in the Machine Learning group at the University of Toronto. I work with Prof. David Duvenaud on fundamental machine learning problems at the intersection of deep learning and Bayesian statistics. Last term, I was part of the teaching staff for CSC411: Introduction to Machine Learning. This term, I'm part of the teaching staff for CSC412: Probabilistic Learning and Reasoning.

I completed my BSc (2016) at the University of British Columbia in Computer Science and Statistics. I spent last summer in Prof. Mark Schmidt's Machine Learning Lab, where I worked mostly on unsupervised learning algorithms for a Matlab machine learning toolbox. The summer before that, I was a software engineering intern at Arista Networks, where I implemented link-layer network protocols for high-performance routers.

I also graduated with a BA (2009) and MA (2011) from the University of British Columbia prior to starting my training as a research scientist. My MA thesis proposed a linguistic framework for analyzing the UN Intergovernmental Panel on Climate Change's transformation of statistical climatology into standardized words and phrases for policy-oriented audiences. This project sparked a strong interest in statistical inference and computer science that led me directly to machine learning.

Curriculum Vitae

Email: roeder@cs.toronto.edu


Research interests

This section to be expanded soon!

I'm interested in unsupervised long- and short-term prediction problems using Bayesian inference and deep learning. These are some of the most ambitious and difficult problems in statistical learning, and also some of the most important for applied machine learning. Recent advances like the structured variational autoencoder have shown how rich Bayesian probabilistic reasoning can be alloyed with deep learning to produce data-efficient and computationally tractable learning.

Recent projects

Surface plot depicting problem

Sticking the Landing: A Simple Reduced-Variance Gradient for ADVI

We propose a simple variant of the standard reparameterized gradient estimator for the evidence lower bound that has lower variance under certain circumstances. Specifically, we decompose the total derivative with respect to the variational parameters into two parts: a path derivative term and a score function term. Removing the score function term produces an unbiased gradient estimator whose variance approaches zero as the approximate posterior approaches the exact posterior. We propose that the removed term has arbitrarily high variance when the variational posterior has a complex form, as with adaptive posteriors such as those given by normalizing flows or stochastic Hamiltonian inference.
A short version of the paper was accepted at the NIPS 2016 workshop Advances in Approximate Bayesian Inference.
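
As a rough illustration of the idea, here is a minimal Python sketch on a one-dimensional toy problem. It is not code from the paper: the Gaussian setup, the function name `gradient_samples`, and the parameters `m`, `s`, `mu`, and `sigma` are all assumptions made for this example. The exact posterior is taken to be N(m, s^2), the approximate posterior is N(mu, sigma^2), and the gradients with respect to mu are written out by hand rather than computed by automatic differentiation.

```python
import random
import statistics


def gradient_samples(mu, sigma, m=0.0, s=1.0, n=10_000, seed=0):
    """Draw single-sample ELBO gradient estimates with respect to mu.

    Assumed toy setup (hypothetical, for illustration only):
    exact posterior p(z|x) = N(m, s^2), approximate posterior q(z) = N(mu, sigma^2).
    Returns two lists: the standard total-derivative estimates and the
    path-derivative-only estimates, computed from the same noise draws.
    """
    rng = random.Random(seed)
    total, path = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        z = mu + sigma * eps               # reparameterized sample, dz/dmu = 1
        dlogp_dz = -(z - m) / s**2         # d/dz log p(z|x)
        dlogq_dz = -eps / sigma            # d/dz log q(z) = -(z - mu) / sigma^2
        score_mu = eps / sigma             # d/dmu log q(z) with z held fixed
        path_grad = dlogp_dz - dlogq_dz    # path derivative term only
        path.append(path_grad)
        total.append(path_grad - score_mu) # path term minus score function term
    return total, path


if __name__ == "__main__":
    for mu, sigma in [(0.5, 1.5), (0.0, 1.0)]:
        total, path = gradient_samples(mu, sigma)
        print(f"mu={mu}, sigma={sigma}")
        print("  total: mean", statistics.mean(total), "std", statistics.pstdev(total))
        print("  path:  mean", statistics.mean(path), "std", statistics.pstdev(path))
```

In this toy setting both estimators have the same expectation, because the score function term has mean zero under q. When mu = m and sigma = s, the two terms inside the path derivative cancel for every sample, so the path-only estimate is identically zero while the total-derivative estimate still fluctuates.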

Paper link

Poster link

Andrew Miller wrote a great blog post exploring the key ideas of the paper.

Manifold to learn with t-SNE

MatLearn: Machine Learning Algorithm Implementations in Matlab

Link to website

I merged multiple code bases from many graduate student contributors into a finished software package, and added a variety of new learning algorithms, including unsupervised methods (sparse autoencoders, Hidden Markov Models, Linear-Gaussian State Space Models, and t-Distributed Stochastic Neighbour Embedding) as well as Convolutional Neural Networks for image classification.

Download package


Data Visualization Dashboard

Link to website

Using D3.js and open-source JS implementations of t-SNE, Principal Component Analysis and Multidimensional Scaling, I built a prototype of an online data visualization dashboard for high-dimensional problems. My goal is to explore the value of high-dimensional data visualization in the early stages of the machine learning data analysis pipeline, namely feature extraction and model selection.

I propose that iterating through different lower-dimensional representations of a dataset can help a machine learning practitioner identify the right family of models and feature transformations to explore, potentially speeding up the heuristic exploration of model space.
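
To make "a lower-dimensional representation" concrete, here is a minimal Python sketch of one of the techniques mentioned above, Principal Component Analysis, projecting data down to two dimensions. It is not the dashboard's code (the dashboard uses JavaScript libraries): the function names `top_components` and `project` are hypothetical, and power iteration with deflation is just one simple way to find the leading directions, chosen here to keep the example dependency-free.

```python
import math
import random


def top_components(rows, k=2, iters=200):
    """Return the column means and the top-k principal directions of `rows`.

    `rows` is a list of equal-length lists of floats. Directions are found
    by power iteration on the sample covariance matrix, with deflation
    after each direction is extracted.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(0)
    components = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient: variance captured along the unit vector v.
        eigval = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
        components.append(v)
        # Deflate: remove the variance already explained by v.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return mean, components


def project(rows, mean, components):
    """Map each row to its coordinates along the given directions."""
    d = len(mean)
    return [[sum((r[j] - mean[j]) * c[j] for j in range(d)) for c in components]
            for r in rows]
```

Calling `mean, comps = top_components(rows)` followed by `project(rows, mean, comps)` gives one 2-D point per row, ready for a scatter plot. Swapping in a different projection, such as t-SNE or Multidimensional Scaling, gives another view of the same data, which is the kind of iteration described above.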

Try it out! Drag one of the following toy datasets (csv) onto the website: Swiss Roll manifold, Fisher's Iris dataset.

Github link

Paper link (unpublished paper, submitted for a CS seminar on Information Visualization in Fall 2016)