Roger Grosse
Publications
Stephen Zhao, Rob Brekelmans, Alireza Makhzani, and Roger Grosse.
Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo
. ICML 2024.
Nathan Ng, Roger Grosse, and Marzyeh Ghassemi.
Measuring Stochastic Data Complexity with Boltzmann Influence Functions
. ICML 2024.
Jin Peng Zhou, Yuhuai Wu, Qiyang Li, and Roger Grosse.
REFACTOR: Learning to Extract Theorems from Proofs
. ICLR 2024.
Caspar Oesterheld, Johannes Treutlein, Roger Grosse, Vincent Conitzer, and Jakob Foerster.
Similarity-Based Cooperative Equilibrium
. NeurIPS 2023.
Nikita Dhawan, Sicong Huang, Juhan Bae, and Roger Grosse.
Efficient Parametric Approximations of Neural Network Function Space Distance
. ICML 2023.
Juhan Bae, Michael R. Zhang, Michael Ruan, Eric Wang, So Hasegawa, Jimmy Ba, and Roger Grosse.
Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve
. ICLR 2023.
Stephen Zhao, Chris Lu, Roger Grosse, and Jakob Foerster.
Proximal learning with opponent learning awareness
. NeurIPS 2022.
Juhan Bae, Paul Vicol, Jeff Z. HaoChen, and Roger Grosse.
Amortized proximal optimization
. NeurIPS 2022.
Juhan Bae, Nathan Ng, Alston Lo, Marzyeh Ghassemi, and Roger Grosse.
If influence functions are the answer, then what is the question?
NeurIPS 2022.
Cem Anil, Ashwini Pokle, Kaiqu Liang, Johannes Treutlein, Yuhuai Wu, Shaojie Bai, J. Zico Kolter, and Roger Grosse.
Path independent equilibrium networks can better exploit test-time computation
. NeurIPS 2022.
Paul Vicol, Jonathan Lorraine, Fabian Pedregosa, David Duvenaud, and Roger Grosse.
On implicit bias in overparameterized bilevel optimization
. ICML 2022.
Rob Brekelmans, Sicong Huang, Marzyeh Ghassemi, Greg ver Steeg, Roger Grosse, and Alireza Makhzani.
Improving Mutual Information Estimation with Annealed and Energy-Based Bounds
. ICLR 2022.
Guodong Zhang, Kyle Hsu, Jianing Li, Chelsea Finn, and Roger Grosse.
Differentiable Annealed Importance Sampling and the Perils of Gradient Noise
. NeurIPS 2021.
Shengyang Sun, Jiaxin Shi, Andrew Gordon Wilson, and Roger Grosse.
Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition
. ICML 2021.
[Code]
James Lucas, Juhan Bae, Michael Zhang, Stanislav Fort, Richard Zemel, and Roger Grosse.
Analyzing Monotonic Linear Interpolation in Neural Network Loss Landscapes
. ICML 2021.
Yuhuai Wu, Markus Rabe, Wenda Li, Jimmy Ba, Roger Grosse, and Christian Szegedy.
LIME: Learning inductive bias for primitives of mathematical reasoning
. ICML 2021.
Guodong Zhang, Xuchan Bao, Laurent Lessard, and Roger Grosse.
A unified analysis of first-order methods for smooth games via integral quadratic constraints
. JMLR 2021.
[Code]
Chaoqi Wang, Shengyang Sun, and Roger Grosse.
Beyond marginal uncertainty: How accurately can Bayesian regression models estimate posterior predictive correlations?
AISTATS 2021.
[Code]
Jens Behrmann, Paul Vicol, Kuan-Chieh Wang, Roger Grosse, and Jorn-Henrik Jacobsen.
Understanding and mitigating exploding inverses in invertible neural networks
. AISTATS 2021.
[Code]
Yuhuai Wu, Albert Jiang, Jimmy Ba, and Roger Grosse.
INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving
. ICLR 2021.
[Code]
Shun-ichi Amari, Jimmy Ba, Roger Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, and Ji Xu.
When does preconditioning help or hurt generalization?
ICLR 2021.
Pashootan Vaezipoor, Gil Lederman, Yuhuai Wu, Chris J. Maddison, Roger Grosse, Edward Lee, Sanjit A. Seshia, and Fahiem Bacchus.
Learning Branching Heuristics for Propositional Model Counting
. AAAI 2021.
Juhan Bae and Roger Grosse.
Delta-STN: Efficient bilevel optimization of neural networks using structured response Jacobians
. NeurIPS 2020.
[Code]
Xuchan Bao, James Lucas, Sushant Sachdeva, and Roger Grosse.
Regularized linear autoencoders recover the principal components, eventually
. NeurIPS 2020.
[Code]
Sheldon Huang, Alireza Makhzani, Yanshuai Cao, and Roger Grosse.
Evaluating lossy compression rates of deep generative models
. ICML 2020.
[Code]
Chaoqi Wang, Guodong Zhang, and Roger Grosse.
Picking winning tickets before training by preserving gradient flow
. ICLR 2020.
[Code]
Guodong Zhang, Lala Li, Zachary Nado, James Martens, Sushant Sachdeva, George E. Dahl, Christopher J. Shallue, and Roger Grosse.
Which algorithmic choices matter at which batch sizes? Insights from a noisy quadratic model
. NeurIPS 2019.
[Code]
Guodong Zhang, James Martens, and Roger Grosse.
Fast convergence of natural gradient descent for overparameterized neural networks
. NeurIPS 2019.
James Lucas, George Tucker, Roger Grosse, and Mohammad Norouzi.
Don’t blame the ELBO! A linear VAE perspective on posterior collapse
. NeurIPS 2019.
[Code]
Qiyang Li, Saminul Haque, Cem Anil, James Lucas, Roger Grosse, and Jorn-Henrik Jacobsen.
Preventing gradient attenuation in Lipschitz-constrained convolutional networks
. NeurIPS 2019.
[Code]
Cem Anil, James Lucas, and Roger Grosse.
Sorting out Lipschitz function approximation
. ICML 2019.
[Code]
Chaoqi Wang, Roger Grosse, Sanja Fidler, and Guodong Zhang.
EigenDamage: structured pruning in the Kronecker-factored eigenbasis
. ICML 2019.
[Code]
Matthew MacKay, Paul Vicol, Jonathan Lorraine, David Duvenaud, and Roger Grosse.
Self-tuning networks: bilevel optimization of hyperparameters using structured best-response functions
. ICLR 2019.
[Code]
Shengyang Sun, Guodong Zhang, Jiaxin Shi, and Roger Grosse.
Functional variational Bayesian neural networks
. ICLR 2019.
[Code]
Sheldon Huang, Qiyang Li, Cem Anil, Xuchan Bao, Sageev Oore, and Roger Grosse.
TimbreTron: A WaveNet(CycleGAN(CQT(audio))) pipeline for musical timbre transfer
. ICLR 2019.
Guodong Zhang, Chaoqi Wang, Bowen Xu, and Roger Grosse.
Three mechanisms of weight decay regularization
. ICLR 2019.
[Code]
James Lucas, Shengyang Sun, Richard Zemel, and Roger Grosse.
Aggregated momentum: stability through passive damping
. ICLR 2019.
[Code]
Matthew MacKay, Paul Vicol, Jimmy Ba, and Roger Grosse.
Reversible recurrent neural networks
. NIPS 2018.
[Code]
Tian Qi Chen, Xuechen Li, Roger Grosse, and David Duvenaud.
Isolating sources of disentanglement in variational autoencoders
. NIPS 2018.
[Code]
Shengyang Sun, Guodong Zhang, Chaoqi Wang, Wenyuan Zeng, Jiaman Li, and Roger Grosse.
Differentiable compositional kernel learning for Gaussian processes
. ICML 2018.
[Code]
Guodong Zhang, Shengyang Sun, David Duvenaud, and Roger Grosse.
Noisy natural gradient as variational inference
. ICML 2018. [Code: 1, 2]
Kuan-Chieh Wang, Paul Vicol, James Lucas, Li Gu, Roger Grosse, and Richard Zemel.
Adversarial distillation of Bayesian neural network posteriors
. ICML 2018.
[Code]
Yeming Wen, Paul Vicol, Jimmy Ba, Dustin Tran, and Roger Grosse.
Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches
. ICLR 2018.
Yuhuai Wu, Mengye Ren, Renjie Liao, and Roger Grosse.
Understanding short-horizon bias in stochastic meta-optimization
. ICLR 2018.
Yuhuai Wu, Elman Mansimov, Shun Liao, Roger Grosse, and Jimmy Ba.
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
. NIPS 2017.
code
Aidan Gomez, Mengye Ren, Raquel Urtasun, and Roger Grosse.
The Reversible Residual Network: Backpropagation Without Storing Activations
. NIPS 2017.
code
Jacob Gardner, Chuan Guo, Kilian Weinberger, Roman Garnett, and Roger Grosse.
Discovering and exploiting additive structure for Bayesian optimization
. AISTATS 2017.
Jimmy Ba, Roger Grosse, and James Martens.
Distributed second-order optimization using Kronecker-factored approximations
. ICLR 2017.
Yuhuai Wu, Yuri Burda, Ruslan Salakhutdinov, and Roger Grosse.
On the quantitative analysis of decoder-based generative models
. ICLR 2017.
code
Roger Grosse, Siddharth Ancha, and Daniel Roy.
Measuring the reliability of MCMC inference with bidirectional Monte Carlo
. NIPS 2016.
arXiv
Roger Grosse and James Martens.
A Kronecker-factored approximate Fisher matrix for convolution layers
. ICML 2016.
ICML version
Yuri Burda, Roger Grosse, and Ruslan Salakhutdinov.
Importance weighted autoencoders
. ICLR 2016.
code
Jimmy Ba, Roger Grosse, Ruslan Salakhutdinov, and Brendan Frey.
Learning wake-sleep recurrent attention models
. NIPS 2015.
NIPS version
Roger Grosse and Ruslan Salakhutdinov.
Scaling up natural gradient by sparsely factorizing the inverse Fisher matrix
. ICML 2015.
code
James Martens and Roger Grosse.
Optimizing Neural Networks with Kronecker-factored Approximate Curvature
. ICML 2015.
ICML version and appendix (terser and less readable than the arXiv version)
Yuri Burda, Roger B. Grosse, and Ruslan Salakhutdinov.
Accurate and conservative estimates of MRF log-likelihood using reverse annealing
. AISTATS 2015.
James R. Lloyd, David Duvenaud, Roger B. Grosse, Joshua B. Tenenbaum, and Zoubin Ghahramani.
Automatic construction and natural-language description of nonparametric regression models
. AAAI 2014.
code
examples
Roger B. Grosse, Chris J. Maddison, and Ruslan Salakhutdinov.
Annealing between distributions by averaging moments
. NIPS 2013.
supplemental material
preprint (from the ICML 2013 workshop Challenges in Representation Learning)
background
David Duvenaud, James R. Lloyd, Roger B. Grosse, Joshua B. Tenenbaum, and Zoubin Ghahramani.
Structure discovery in nonparametric regression through compositional kernel search
. ICML 2013.
code
background
Roger B. Grosse, Ruslan Salakhutdinov, William T. Freeman, and Joshua B. Tenenbaum.
Exploiting compositionality to explore a large space of model structures
. UAI 2012.
Best Student Paper.
code
background
Honglak Lee, Roger Grosse, Rajesh Ranganath, and Andrew Y. Ng.
Unsupervised learning of hierarchical representations with convolutional deep belief networks
. Communications of the ACM, vol. 54, no. 10, pp. 95-103, 2011.
Roger Grosse, Micah K. Johnson, Edward Adelson, and William T. Freeman.
A ground-truth dataset and baseline evaluations for intrinsic image algorithms
. ICCV 2009.
project page
Honglak Lee, Roger Grosse, Rajesh Ranganath, and Andrew Y. Ng.
Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations
. ICML 2009.
Best Application Paper.
Roger Grosse, Rajat Raina, Helen Kwong, and Andrew Y. Ng.
Shift-invariant sparse coding for audio classification
. UAI 2007.
code
Preprints
Roger Grosse, Juhan Bae, Cem Anil, et al., 2023.
Studying Large Language Model Generalization with Influence Functions
.
Cem Anil, Guodong Zhang, Yuhuai Wu, and Roger Grosse, 2021.
Learning to Give Checkable Answers with Prover-Verifier Games
.
Yuhuai Wu, Honghua Dong, Roger Grosse, and Jimmy Ba, 2020.
The Scattering Compositional Learner: Discovering Objects, Attributes, Relationships in Analogical Reasoning
.
Kevin Luk and Roger Grosse, 2018.
A coordinate-free construction of scalable natural gradient
.
Roger Grosse, Zoubin Ghahramani, and Ryan Adams, 2015.
Sandwiching the marginal likelihood using bidirectional Monte Carlo
.
Thesis
Model selection in compositional spaces
. Ph.D. thesis, 2014.