Chuning Li

I am a PhD student in Computer Science at the University of Toronto and the Vector Institute, advised by Prof. Chris J. Maddison.

I am interested in understanding the training dynamics of neural networks, especially those of large language models (LLMs). My goal is to develop a scientific model of LLM training that is both faithful and simple, so that it is broadly accessible while still practically useful.

As a first step, my advisor Chris and I developed a linear-regression-like surrogate model that quantitatively captures LLM training behaviour. The model reliably predicts how LLM performance responds to changes in training configurations. A fully implemented version of this model is available in the GitHub repository Predicting Large Model Test Losses with a Noisy Quadratic System.
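As an illustration only (not the paper's implementation), the flavour of a noisy quadratic surrogate can be sketched in a few lines: SGD on a quadratic loss whose gradients are corrupted by noise, where the loss decays toward a noise floor that depends on the learning rate. All constants below (curvature, noise scale, learning rate) are made-up values for demonstration.

```python
import numpy as np

# Hypothetical sketch of a noisy quadratic model (NQM): loss
# L(theta) = 0.5 * theta^T H theta, with SGD seeing gradients
# corrupted by Gaussian noise. Not the repository's actual code.
rng = np.random.default_rng(0)
H = np.diag([1.0, 0.01])          # curvature: one sharp, one flat direction
theta = np.array([1.0, 1.0])      # initial parameters
lr, steps = 0.5, 200              # assumed learning rate and step budget
sigma = 0.05                      # assumed gradient-noise scale

losses = []
for _ in range(steps):
    noise = sigma * rng.standard_normal(2) * np.sqrt(np.diag(H))
    grad = H @ theta + noise      # noisy gradient of the quadratic loss
    theta = theta - lr * grad     # plain SGD step
    losses.append(0.5 * theta @ H @ theta)
```

In such surrogates, the expected loss is analytically tractable, which is what makes it possible to predict how performance responds to changes in the training configuration (e.g. learning rate or batch size) without rerunning full-scale training.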


Predicting Large Model Test Losses with a Noisy Quadratic System

Chuning Li and Chris J. Maddison

International Conference on Machine Learning (ICML), 2026.


The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit

Lorenzo Noci*, Chuning Li*, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris J. Maddison, and Daniel M. Roy

Advances in Neural Information Processing Systems (NeurIPS), 2023. *Equal contribution.

The Umbrella Maker, acrylic painting
The Musician, acrylic painting