CSC 2541, Fall 2024:

Generative AI for Images

Department of Computer Science

University of Toronto



Announcements


Overview

Generative AI has recently achieved revolutionary performance and burst into public view through such systems as ChatGPT and DALL-E. This course examines the techniques that have made this possible, with an emphasis on machine vision and image synthesis. Topics will be selected from diffusion models, score matching, normalizing flows, neural differential equations, variational autoencoders, transformers, and large language models. Many of these techniques are mathematically sophisticated.

This is primarily a seminar course in which students read and present papers from the literature, though there may be some supplementary lectures on advanced material. There will also be a major course project. The goal is to bring students to the state of the art in this exciting field.

Prerequisites:  

An advanced course in machine learning (such as csc413 or csc2516), especially one covering neural nets; a solid knowledge of linear algebra; the basics of multivariate calculus and probability; and programming skills, especially programming with vectors and matrices (e.g., NumPy). Some knowledge of differential equations would be an asset. Mathematical maturity will be assumed.

Classes:

Instructor:

Teaching Assistants:

Textbook:

There is no required textbook for this course. However, the following two books contain essential material at the graduate level. Both are available as free PDF downloads for U of T students and faculty:

Course Structure

The course is structured along the same lines as csc2547, which I taught in spring 2022, though the topics and papers covered this year are quite different:

Paper presentations (tentative):

Projects:

Marking Scheme:


Tentative Schedule:  (details to be added)



Student Presentations

  1. U-Net: Convolutional Networks for Biomedical Image Segmentation. The original paper on U-Nets. (Naomi Kothiyal)
  2. All are Worth Words: A ViT Backbone for Diffusion Models (Yanting Chen)
  3. Scalable Diffusion Models with Transformers (Tom Blanchard)
  4. One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale (Younwoo Choi)
  5. Simple diffusion: End-to-end diffusion for high resolution images (Daihao Wu)


Project Presentations