Yuhuai(Tony) Wu

I am a Senior Research Scientist at Google on the N2Formal team led by Christian Szegedy.
I am also a Postdoctoral Scholar at Stanford, mentored by Percy Liang and Jay McClelland.
During my PhD at U of Toronto, I was advised by Roger Grosse and Jimmy Ba.
You can find my CV here (Last updated July 4, 2022).


Research Interests

My primary research interest is building machines that can reason.

I have chosen mathematics as a starting point to study reasoning, with the aim of creating an automated mathematician.

I am interested in improving neural architectures for reasoning, as well as building human-like reasoning mechanisms into the model.

Current Students / Interns

Albert Jiang (PhD student at Cambridge)
Cem Anil (PhD student at UofT)
Eric Zelikman (PhD student at Stanford)
Felix Li (Undergraduate student at UC Berkeley)
Jin Zhou (PhD student at Cornell)
Qian Huang (PhD student at Stanford)
Szymon Tworkowski (Master's student at Univ. of Warsaw)
Maciej Mikuła (Master's student at Univ. of Warsaw)

Past Students

Ethan Chi (Master's student at Stanford)
Honghua Dong (PhD student at UofT)
Imanol Schlag (PhD student at IDSIA)
Qiyang (Colin) Li (PhD student at UC Berkeley)


News

Nov. 2022

Releasing Draft, Sketch, and Prove: autoformalizing entire natural-language proofs [arXiv]!

I gave a talk on autoformalization at the FLAIM conference.

I gave a guest lecture on autoformalization in the UIUC proof automation class.

Sept. 2022

8 papers accepted to NeurIPS 2022.

Our length generalization paper was accepted as an Oral Presentation at NeurIPS 2022.

We are organizing the second MATHAI workshop at NeurIPS 2022.

I gave a talk at AITP 2022.

June 2022

Releasing Minerva: a language model that achieves 51% accuracy on the MATH benchmark, a milestone that had been forecast for 2025! See [arXiv][Google AI Blog][Sample Explorer].

Sharing a systematic study on synthetic pre-training [arXiv]. Understanding pre-training via synthetic tasks!

I gave a talk at the University of Cambridge [Link].

I gave a talk at the UC Berkeley Center for Human-Compatible AI (CHAI).

I gave a talk at Covariant.ai.

May 2022

We used LLMs to turn natural language mathematics into formal specifications [arXiv], achieving SOTA on miniF2F. See the media coverage in NewScientist!

We released Thor [arXiv], integrating symbolic tools into neural theorem provers for premise selection!

We released a stronger version of subgoal search: Adaptive Subgoal Search (AdaSubS) [arXiv] improves transformer-based search via variable planning horizons.

March 2022

We released STaR [arXiv]. Bootstrapping Reasoning with Reasoning!

We released Block-Recurrent Transformer [arXiv]. Recurrence is coming back!

Gave a talk at the University of Oxford.

Gave a talk at Harvard University.

Jan 2022

Memorizing Transformers was accepted as a spotlight presentation at ICLR 2022.

Three papers accepted to ICLR 2022.

Dec 2021

Our subgoal search algorithm was accepted to NeurIPS 2021.

Co-organized the MATHAI4ED workshop at NeurIPS 2021: Math AI for education: Bridging the gap between research and smart education.

Aug 2021

Led the Reasoning section in the Foundation Model white paper.

Jul 2021

Two posters at ICML 2021.

Apr 2021

Two posters at ICLR 2021.

    Selected Publications [Full List]

  1. NeurIPS'22

    Minerva: Solving Quantitative Reasoning Problems with Language Models

    Aitor Lewkowycz, Anders Andreassen, David Dohan, Ethan Dyer, Henryk Michalewski,
    Vinay Ramasesh, Ambrose Slone, Cem Anil, Imanol Schlag, Theo Gutman-Solo,
    Yuhuai Wu, Behnam Neyshabur, Guy Gur-Ari, Vedant Misra

    NeurIPS, 2022.

    PDF Google AI Blog
  2. NeurIPS'22

    Exploring Length Generalization in Large Language Models

    Cem Anil, Yuhuai Wu, Anders Andreassen, Aitor Lewkowycz, Vedant Misra,
    Vinay Ramasesh, Ambrose Slone, Guy Gur-Ari, Ethan Dyer, Behnam Neyshabur

    NeurIPS, 2022.

  3. NeurIPS'22

    Insights into Pre-training via Simpler Synthetic Tasks

    Yuhuai Wu*, Felix Li*, Percy Liang

    NeurIPS, 2022.

  4. NeurIPS'22

    Autoformalization with Large Language Models

    Yuhuai Wu*, Albert Q. Jiang, Wenda Li, Markus Rabe, Charles Staats, Mateja Jamnik, Christian Szegedy

    NeurIPS, 2022.

    PDF Interview with NewScientist
  5. NeurIPS'22

    STaR: Bootstrapping Reasoning With Reasoning

    Eric Zelikman*, Yuhuai Wu*, Noah D. Goodman

    NeurIPS, 2022.

  6. NeurIPS'22

    Block-Recurrent Transformers.

    DeLesley Hutchins*, Imanol Schlag*, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur

    NeurIPS, 2022.

  7. ICLR'22

    Memorizing Transformers.

    Yuhuai Wu, Markus Rabe, DeLesley Hutchins, Christian Szegedy

    The 10th International Conference on Learning Representations, 2022.

    PDF #4 on HackerNews
  8. ICLR'22

    Proof Artifact Co-training for Theorem Proving with Language Models.

    Jesse Michael Han, Jason Rute, Yuhuai Wu, Edward W. Ayers, Stanislas Polu

    The 10th International Conference on Learning Representations, 2022.

  9. NeurIPS'21

    Subgoal Search For Complex Reasoning Tasks.

    Konrad Czechowski, Tomasz Odrzygozdz, Marek Zbysinski, Michal Zawalski,
    Krzysztof Olejnik, Yuhuai Wu, Lukasz Kucinski, Piotr Milos

    The 35th Conference on Neural Information Processing Systems, 2021.

  10. ICML'21

    LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning.

    Yuhuai Wu, Markus Rabe, Wenda Li, Jimmy Ba, Roger Grosse, Christian Szegedy

    The 38th International Conference on Machine Learning, 2021.

  11. ICLR'21

    INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving.

    Yuhuai Wu*, Albert Q. Jiang*, Jimmy Ba, Roger Grosse

    The 9th International Conference on Learning Representations, 2021.

  12. ICLR'21

    Modelling High-Level Mathematical Reasoning in Mechanised Declarative Proofs.

    Wenda Li, Lei Yu, Yuhuai Wu, Lawrence C. Paulson

    The 9th International Conference on Learning Representations, 2021.

  13. arXiv'21

    Learning to Give Checkable Answers with Prover-Verifier Games.

    Cem Anil, Guodong Zhang, Yuhuai Wu, Roger Grosse

  14. arXiv'21

    On the Opportunities and Risks of Foundation Models.

    Rishi Bommasani, Drew A. Hudson, Percy Liang, et al.

  15. ICML'20

    Options as Responses: Grounding Behavioural Hierarchies in Multi-agent Reinforcement Learning.

    Yuhuai Wu*, Alexander Sasha Vezhnevets*, Maria Eckstein, Remi Leblond, Joel Z. Leibo.

    The 37th International Conference on Machine Learning, 2020.

  16. Nature'19

    Grandmaster Level in StarCraft II using Multi-agent Reinforcement Learning.

    Vinyals, O., Babuschkin, I., Czarnecki, W.M. et al.

    Nature, 2019.

  17. ICLR'18

    Understanding Short-Horizon Bias in Stochastic Meta-Optimization.

    Yuhuai Wu*, Mengye Ren*, Renjie Liao, Roger Grosse

    The 6th International Conference on Learning Representations, 2018.

  18. NeurIPS'17

    Scalable Trust-Region Method for Deep Reinforcement Learning using Kronecker-Factored Approximation.

    Yuhuai Wu*, Elman Mansimov*, Shun Liao, Roger Grosse, Jimmy Ba

    The 31st Annual Conference on Neural Information Processing Systems, 2017.

  19. ICLR'17

    On the Quantitative Analysis of Decoder-Based Generative Models

    Yuhuai Wu, Yuri Burda, Ruslan Salakhutdinov, Roger Grosse

    The 5th International Conference on Learning Representations, 2017.