I am a Ph.D. student in Computer Science at the University of Toronto, where I am fortunate to be advised by Chris Maddison and Jimmy Ba. Currently, I am also a visiting scholar at Stanford University, hosted by Tatsunori Hashimoto.
Previously, I was a student researcher at Google Research and a research intern at Microsoft Research. In summer 2019, I was a visiting student at UCLA, where I worked with Cho-Jui Hsieh. I obtained my Bachelor's degree in Information Engineering from Zhejiang University.
Research
My research focuses on the scaling, evaluation, and alignment of language models and agents, especially as they approach or exceed superhuman performance levels.
Collaboration opportunities:
I am always open to discussing research ideas and collaborations. If you are a student at UofT interested in language models, agents, AI safety, or other related topics, please do not hesitate to reach out to me!
(* denotes equal contribution)
Observational Scaling Laws and the Predictability of Language Model Performance
Yangjun Ruan,
Chris J Maddison,
and Tatsunori Hashimoto
In Advances in Neural Information Processing Systems
(NeurIPS),
2024
[Spotlight]
TL;DR: We introduce observational scaling laws that unify a large set of public LMs in a shared capability space, enabling a low-cost, high-resolution, and broad-coverage scaling analysis for complex LM capabilities.
Graph-based Uncertainty Metrics for Long-form Language Model Outputs
Mingjian Jiang,
Yangjun Ruan,
Prasanna Sattigeri,
Salim Roukos,
and Tatsunori Hashimoto
In Advances in Neural Information Processing Systems
(NeurIPS),
2024
[Spotlight]
TL;DR: We introduce a family of graph-based uncertainty metrics for long-form LLM generations, and demonstrate consistent gains over existing methods.
Weighted Ensemble Self-Supervised Learning
Yangjun Ruan,
Saurabh Singh,
Warren Morningstar,
Alexander A. Alemi,
Sergey Ioffe,
Ian Fischer,
and Joshua V. Dillon
In International Conference on Learning Representations
(ICLR),
2023
TL;DR: An efficient training-time ensemble method for improving self-supervised representation learning, achieving SOTA results on ImageNet SSL & few-shot benchmarks.
Optimal Representations for Covariate Shift
Yangjun Ruan*,
Yann Dubois*,
and Chris J Maddison
In International Conference on Learning Representations
(ICLR),
2022
TL;DR: We derive a self-supervised objective for learning optimally robust representations under covariate shift, offering insights into CLIP’s robustness and further enhancing its distributional robustness.
Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding
Yangjun Ruan*,
Karen Ullrich*,
Daniel Severo*,
James Townsend,
Ashish Khisti,
Arnaud Doucet,
Alireza Makhzani,
and Chris J Maddison
In International Conference on Machine Learning
(ICML),
2021
[Oral]
TL;DR: We introduce the Monte Carlo bits-back coding framework for deriving asymptotically optimal compression algorithms from tighter variational bounds.
Services
- Conference reviewer: NeurIPS (2020-), ICLR (2021-), ICML (2021-)
- Workshop reviewer: NeurIPS Workshop on DGMs Applications (2021), ICML Workshop on Pretraining (2022)
Selected Awards & Honors
- Ontario Graduate Scholarship, 2023
- DiDi Graduate Student Award, 2021
- Chu Kochen Scholarship (highest honor at Zhejiang University), 2019
- Cross-disciplinary Scholars in Science and Technology (CSST), UCLA, 2019
- National Scholarship (top 1.5%), 2017, 2018, 2019
- Meritorious Winner, Interdisciplinary Contest in Modeling (ICM), 2018