Assistant Professor, University of Toronto
CIFAR AI Chair, Vector Institute
Email, Google Scholar, Bluesky, GitHub
I develop machine learning algorithms with the long-term goal of advancing applications in the natural sciences. I am typically focused on improving our algorithmic tools, but occasionally I enjoy collaborating on large-scale applied projects. I publish mostly at machine learning conferences (NeurIPS, ICML, ICLR). My publications can be found on my Google Scholar profile, and here is a brief biography.
The success of large language models is driven by the abundance and natural structure of data. What does this tell us about our universe and ourselves? How can we use these insights to advance applications in other domains? I am interested in understanding how the statistical structure of real-world data influences the emergence of capabilities in AI systems as they train on vast, heterogeneous datasets.