I am a final-year Ph.D. student in the Machine Learning Group at the University of Toronto.

I'm advised by Roger Grosse and Jimmy Ba.

Email: ywu[at]cs[dot]toronto[dot]edu

Office: 265D, D.L. Pratt Building.

Research interests: machine reasoning, theorem proving, modularity, language, neuro-symbolic integration.

I would like to make machines reason.

- I'm co-organizing the first MATH-AI workshop: On The Role of Mathematical Reasoning in General Artificial Intelligence.
- Releasing **LIME**: designing inductive biases in the form of datasets for mathematical reasoning! [arxiv]
- **INT** accepted to ICLR 2021: an inequality theorem proving dataset for evaluating generalization! [openreview][arxiv]
- **IsarStep** accepted to ICLR 2021: the largest theorem proving dataset for human-oriented theorem proving. [openreview][arxiv]
- **Neuro#** accepted to AAAI 2021: a neural network #SAT solver that generalizes to much larger problems, achieving improvements over SOTA by orders of magnitude. [arxiv]
- Releasing **SCL**: a neural architecture that discovers compositional structures in analogical reasoning, generalizing to novel analogies. [arxiv]
- One poster at ICML 2020.
- I'm organizing a seminar on Machine Reasoning, covering reasoning in theorem proving, natural language understanding, and program synthesis: [paper list]. The seminar meets every Wednesday, 3-4pm EST, at the Vector Institute. You are welcome to attend if you're around Toronto.
- Releasing **OPRE**: a hierarchical agent that generalizes to novel opponent strategies! [arxiv]
- I recently finished an internship at DeepMind (June 2018 to April 2019), working on hierarchical reinforcement learning and StarCraft 2.
- One poster at NeurIPS 2018.
- Two posters at ICLR 2018.
- Two posters and two workshop papers at NIPS 2017.
- Releasing **ACKTR**: a far more sample-efficient reinforcement learning algorithm than TRPO and A2C! [paper][code]. This is also covered by the [OpenAI blog]!
- I'm very honoured to receive the Google PhD fellowship in machine learning!
- Our ICLR submission, On the Quantitative Analysis of Decoder-Based Generative Models [arxiv], was accepted as a poster presentation. We can now quantitatively measure the performance of GANs!
- One journal paper accepted to appear in Neural Computation!
- Three (co-)first-authored papers accepted to appear at NIPS 2016!

Proof Artifact Co-training for Theorem Proving with Language Models. Jesse Michael Han, Jason Rute, __Yuhuai Wu__, Edward W. Ayers, Stanislas Polu. 2021. [arxiv].

Nonlinear Invariant Risk Minimization: A Causal Approach. Chaochao Lu, __Yuhuai Wu__, Jose Miguel Hernandez-Lobato, Bernhard Scholkopf. 2021. [arxiv].

The Scattering Compositional Learner: Discovering Objects, Attributes, Relationships in Analogical Reasoning. __Yuhuai Wu*__, Honghua Dong*, Roger Grosse, Jimmy Ba. 2020. [arxiv].

Grandmaster level in StarCraft II using multi-agent reinforcement learning. Vinyals, O., Babuschkin, I., Czarnecki, W.M. et al. Nature. 2019. [journal]

STDP based approximation of back-propagation in an energy based model. Yoshua Bengio, Thomas Mesnard, Asja Fischer, Saizheng Zhang, and __Yuhuai Wu__. Neural Computation. 2017. [journal][arxiv]

Discrete Equidecomposability and Ehrhart Theory of Polygons. Paxton Turner, __Yuhuai Wu__. Discrete & Computational Geometry. 2020. [journal][arxiv]

LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning. __Yuhuai Wu__, Markus Rabe, Wenda Li, Jimmy Ba, Roger Grosse, Christian Szegedy. ICML 2021. [arxiv].

Efficient Statistical Tests: A Neural Tangent Kernel Approach. Sheng Jia, Ehsan Nezhadarya, __Yuhuai Wu__, Jimmy Ba. ICML 2021.

INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving. __Yuhuai Wu*__, Albert Q. Jiang*, Jimmy Ba, Roger Grosse. ICLR 2021. [arxiv].

Modelling High-Level Mathematical Reasoning in Mechanised Declarative Proofs. Wenda Li, Lei Yu, __Yuhuai Wu__, Lawrence C. Paulson. ICLR 2021. [arxiv].

Learning Branching Heuristics for Propositional Model Counting. Pashootan Vaezipoor*, Gil Lederman*, __Yuhuai Wu__, Chris J. Maddison, Roger Grosse, Edward Lee, Sanjit A. Seshia, Fahiem Bacchus. AAAI 2021. [arxiv].

Options as responses: Grounding behavioural hierarchies in multi-agent RL. __Yuhuai Wu*__, Alexander Sasha Vezhnevets*, Maria Eckstein, Remi Leblond, Joel Z. Leibo. ICML 2020. [paper][arxiv].

Neural Theorem Proving on Inequality Problems. __Yuhuai Wu*__, Albert Q. Jiang*, Roger Grosse, Jimmy Ba. AITP 2020. [paper].

Learning Clause Deletion Heuristics with Reinforcement Learning. Pashootan Vaezipoor, Gil Lederman, __Yuhuai Wu__, Roger Grosse, Fahiem Bacchus. AITP 2020. [paper].

The Importance of Sampling in Meta-Reinforcement Learning. Bradly Stadie, Ge Yang, Rein Houthooft, Xi Chen, Yan Duan, __Yuhuai Wu__, Pieter Abbeel, Ilya Sutskever. NeurIPS 2018. [paper].

Understanding Short-Horizon Bias in Stochastic Meta-Optimization. __Yuhuai Wu*__, Mengye Ren*, Renjie Liao, Roger B. Grosse. ICLR 2018. [paper][arxiv]

Backpropagation through the Void: Optimizing control variates for black-box gradient estimation. Will Grathwohl, Dami Choi, __Yuhuai Wu__, Geoff Roeder, David Duvenaud. ICLR 2018. [arxiv].

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. __Yuhuai Wu*__, Elman Mansimov*, Shun Liao, Roger Grosse, Jimmy Ba. NIPS, 2017. **Spotlight**. [arxiv][code][OpenAI blog].

Sticking the Landing: An Asymptotically Zero-Variance Gradient Estimator for Variational Inference. Geoffrey Roeder, __Yuhuai Wu__, David Duvenaud. NIPS, 2017. [arxiv]

On the Quantitative Analysis of Decoder-Based Generative Models. __Yuhuai Wu__, Yuri Burda, Ruslan Salakhutdinov and Roger Grosse. ICLR, 2017. [arxiv][code]

On Multiplicative Integration with Recurrent Neural Networks. __Yuhuai Wu*__, Saizheng Zhang*, Ying Zhang, Yoshua Bengio, Ruslan Salakhutdinov. NIPS, 2016. [arxiv]

Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations. Behnam Neyshabur*, __Yuhuai Wu*__, Ruslan Salakhutdinov, Nathan Srebro. NIPS, 2016. [arxiv]

Architectural Complexity Measures of Recurrent Neural Networks. Saizheng Zhang*, __Yuhuai Wu*__, Tong Che, Zhouhan Lin, Roland Memisevic, Ruslan Salakhutdinov, Yoshua Bengio. NIPS, 2016. [arxiv]

Understanding Short-Horizon Bias in Stochastic Meta-Optimization. __Yuhuai Wu*__, Mengye Ren*, Renjie Liao, Roger B. Grosse. NIPS 2017 workshop in meta-learning. [workshop]

Backpropagation through the Void: Optimizing control variates for black-box gradient estimation. Will Grathwohl, Dami Choi, __Yuhuai Wu__, Geoff Roeder, David Duvenaud. NIPS 2017 Deep RL symposium. **Oral**. [workshop]

On the Quantitative Analysis of Decoder-Based Generative Models. __Yuhuai Wu__, Yuri Burda, Ruslan Salakhutdinov and Roger Grosse. NIPS 2016 workshop in adversarial training. **Oral**. [workshop][code]

Sticking the Landing: A Simple Reduced-Variance Gradient for ADVI. Geoffrey Roeder, __Yuhuai Wu__, David Duvenaud. NIPS 2016 workshop in Advances in Approximate Bayesian Inference. [workshop]

ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning. __Yuhuai Wu*__, Harris Chan*, Jamie Kiros, Sanja Fidler, Jimmy Ba. 2019. [arxiv].

Concurrent Meta Reinforcement Learning. Emilio Parisotto, Soham Ghosh, Sai Bhargav Yalamanchi, Varsha Chinnaobireddy, __Yuhuai Wu__, Ruslan Salakhutdinov. 2019. [arxiv].

An Empirical Analysis of Proximal Policy Optimization with Kronecker-factored Natural Gradients. Jiaming Song, __Yuhuai Wu__. 2018. [arxiv].

Discrete Equidecomposability and Ehrhart Theory of Polygons. Paxton Turner, __Yuhuai Wu__. Discrete & Computational Geometry. 2020. [journal][arxiv]

Conditions for Discrete Equidecomposability of Polygons. Paxton Turner, __Yuhuai Wu__. [arxiv]

* Equal contribution.

I am/was a reviewer for

NIPS 2016, ICML 2017, NIPS 2017, AAAI 2018, ICLR 2018, NIPS 2018, ICML 2019, NIPS 2019, ICLR 2020, NIPS 2020.

I am/was a TA for

CSC 2541: Scalable and Flexible Models of Uncertainty (2017 fall)

CSC 321: Introduction to Neural Networks (2017 spring)

ECE 521: Inference Algorithms and Machine Learning (2017 spring)

CSC 2541: Differentiable Inference and Generative Models (2016 fall)

CSC 236: Introduction to the Theory of Computation (2016 summer)

CSC 148: Introduction to Computer Science (2016 spring)

CSC 165: Mathematical Expression and Reasoning for Computer Science (2015 fall)

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. Vector Institute Endless Summer School. 2017/11.

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. Microsoft Research Redmond. 2017/09.

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. Apple. 2017/09.

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. Google Brain. 2017/09.

On the Quantitative Analysis of Decoder-Based Generative Models. NIPS workshop in Adversarial training. 2016/12.

On the Quantitative Analysis of Decoder-Based Generative Models. OpenAI. 2016/11.

Architectural Complexity Measures & Multiplicative Integration of RNNs. U of Toronto. 2016/10.

Intro to Differential Geometry. U of Toronto. 2016/07.

Architectural Complexity Measures of Recurrent Neural Networks. Toyota Technological Institute at Chicago. 2016/04.