Lunjun Zhang

I am a PhD student in the Machine Learning Group at the University of Toronto and a Student Researcher at Google DeepMind, working on Large Language Models.

Previously, I spent two and a half years (from 2021 to 2024) working on autonomous driving at Waabi.

Before that, I completed my undergraduate degree in Engineering Science at the University of Toronto (2017-2021), during which I interned at the Vector Institute, Mila, and Uber Advanced Technologies Group.

Contact: Email / Google Scholar / GitHub / Twitter


I work on unsupervised learning and reinforcement learning.

I am fascinated by the following questions:


Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion

Lunjun Zhang, Yuwen Xiong, Ze Yang, Sergio Casas, Rui Hu, Raquel Urtasun

International Conference on Learning Representations (ICLR), 2024

[Paper] [Poster] [Proceedings]

Discrete diffusion on tokenized experience can lead to a GPT-like learning paradigm for robotics.

Towards Unsupervised Object Detection from LiDAR Point Clouds

Lunjun Zhang, Anqi Joyce Yang, Yuwen Xiong, Sergio Casas, Bin Yang, Mengye Ren, Raquel Urtasun

Conference on Computer Vision and Pattern Recognition (CVPR), 2023

[Paper] [Proceedings] [Poster] [Website]

Self-supervision combined with object priors can enable scalable object discovery in the wild.

Understanding Hindsight Goal Relabeling from a Divergence Minimization Perspective

Lunjun Zhang, Bradly Stadie

Foundation Models for Decision Making (FMDM) workshop, NeurIPS 2022

Deep Reinforcement Learning (Deep RL) workshop, NeurIPS 2022


Recasting goal-conditioned RL into the imitation learning framework.

World Model as a Graph: Learning Latent Landmarks for Planning

Lunjun Zhang, Ge Yang, Bradly Stadie

International Conference on Machine Learning (ICML), 2021 (Long Talk)

[Paper] [Poster] [Code] [Website]

Learning world models that endow agents with the ability to do temporally extended reasoning.

Learning Intrinsic Rewards as a Bi-level Optimization Problem

Lunjun Zhang, Bradly Stadie, Jimmy Ba

Conference on Uncertainty in Artificial Intelligence (UAI), 2020


Recasting the problem of finding intrinsic rewards as hyper-parameter optimization.