* denotes equal contribution
2024
Observational Scaling Laws and the Predictability of Language Model Performance
Yangjun Ruan,
Chris J Maddison,
and Tatsunori Hashimoto
In Advances in Neural Information Processing Systems
(NeurIPS),
2024
[Spotlight]
TL;DR: We introduce observational scaling laws that unify a large set of public LMs in a shared capability space, enabling a low-cost, high-resolution, and broad-coverage scaling analysis for complex LM capabilities.
Graph-based Uncertainty Metrics for Long-form Language Model Outputs
Mingjian Jiang,
Yangjun Ruan,
Prasanna Sattigeri,
Salim Roukos,
and Tatsunori Hashimoto
In Advances in Neural Information Processing Systems
(NeurIPS),
2024
[Spotlight]
TL;DR: We introduce a family of graph-based uncertainty metrics for long-form LLM generations, and demonstrate consistent gains over existing methods.
2023
Calibrating Language Models via Augmented Prompt Ensembles
Mingjian Jiang*,
Yangjun Ruan*,
Sicong Huang,
Saifei Liao,
Silviu Pitis,
Roger Baker Grosse,
and Jimmy Ba
ICML Workshop on Deployment Challenges for Generative AI,
2023
TL;DR: A prompt-augmented ensemble method for calibrating LLMs that generalizes to open-ended generation.
Weighted Ensemble Self-Supervised Learning
Yangjun Ruan,
Saurabh Singh,
Warren Morningstar,
Alexander A. Alemi,
Sergey Ioffe,
Ian Fischer,
and Joshua V. Dillon
In International Conference on Learning Representations
(ICLR),
2023
TL;DR: An efficient training-time ensemble method for improving self-supervised representation learning, achieving SOTA results on ImageNet SSL & few-shot benchmarks.
2022
Augment with Care: Contrastive Learning for the Boolean Satisfiability Problem
Haonan Duan*,
Pashootan Vaezipoor*,
Max B Paulus,
Yangjun Ruan,
and Chris J Maddison
In International Conference on Machine Learning
(ICML),
2022
TL;DR: A label-efficient contrastive pre-training method for combinatorial optimization.
Optimal Representations for Covariate Shift
Yangjun Ruan*,
Yann Dubois*,
and Chris J Maddison
In International Conference on Learning Representations
(ICLR),
2022
TL;DR: We derive a self-supervised objective for learning optimally robust representations under covariate shift, offering insights into CLIP’s robustness and further enhancing its distributional robustness.
2021
Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding
Yangjun Ruan*,
Karen Ullrich*,
Daniel Severo*,
James Townsend,
Ashish Khisti,
Arnaud Doucet,
Alireza Makhzani,
and Chris J Maddison
In International Conference on Machine Learning
(ICML),
2021
[Oral]
TL;DR: We introduce the Monte Carlo bits-back coding framework for deriving asymptotically optimal compression algorithms from tighter variational bounds.
2020
Learning to Learn by Zeroth-Order Oracle
Yangjun Ruan,
Yuanhao Xiong,
Sashank Reddi,
Sanjiv Kumar,
and Cho-Jui Hsieh
In International Conference on Learning Representations
(ICLR),
2020
TL;DR: A meta-learned zeroth-order optimizer that outperforms hand-designed algorithms.
2019
FastSpeech: Fast, Robust and Controllable Text to Speech
Yi Ren*,
Yangjun Ruan*,
Xu Tan,
Tao Qin,
Sheng Zhao,
Zhou Zhao,
and Tie-Yan Liu
In Advances in Neural Information Processing Systems
(NeurIPS),
2019
TL;DR: A non-autoregressive Transformer-based text-to-speech model that improves inference speed by 270x and enables controllable speech synthesis.
Data Transmission in Mobile Edge Networks: Whether and Where to Compress?
Jinke Ren*,
Yangjun Ruan*,
and Guanding Yu
IEEE Communications Letters,
2019
TL;DR: An analysis of the optimal compression ratio for minimizing end-to-end latency in mobile edge networks.