Michael Zhang
I’m Michael Zhang, a final-year PhD student at the University of Toronto and Vector Institute, where I am fortunate to be supervised by Jimmy Ba and supported by the NSERC CGS Fellowship and the Schwartz Reisman Graduate Fellowship.
I am interested in building safe, general-purpose machine learning systems. I did my undergrad and Master's at the University of California, Berkeley and have previously done internships at Tesla, Google Research, and LinkedIn.
You can contact me at [first name] @cs.toronto.edu.
CV / Google Scholar / Twitter / Substack
News
- Dec 2023: We released a new preprint on using language models for hyperparameter tuning; we'll present it at the NeurIPS FMDM workshop. arXiv link.
- Dec 2023: I gave an interview about AI safety and related research directions at Toronto, which was featured in a U of T news article. Link
- July 2023: Presenting our work on a language model + prompting approach for answering course questions at ITiCSE in Finland.
- May 2023: I'm grateful to receive the Schwartz Reisman Institute graduate fellowship!
- Apr 2023: We released a new preprint on a prompt boosting algorithm with large language models: arXiv.
- Feb 2023: Multi-Rate VAE received an oral presentation (top-5% of accepted papers) at ICLR.
- Jan 2023: I am an instructor for CSC311: Intro to Machine Learning this semester!
|
Research
Some topics I am currently interested in:
- How can we develop current and future AI models that are more likely to be socially beneficial?
- Improving our understanding of deep neural network optimization.
- Can we make hyperparameter tuning easier (e.g., more efficient and more automated)?
- AI safety and interdisciplinary thinking about technology (e.g., through SRI).
Please feel free to reach out if you have an idea you'd like to discuss. A full list of publications is on my Google Scholar.
|
Large Language Models for Hyperparameter Optimization
Michael R. Zhang, Nishkrit Desai, Juhan Bae, Jonathan Lorraine, Jimmy Ba
NeurIPS Foundation Models for Decision Making Workshop
Paper /
Code
We develop a methodology where LLMs suggest hyperparameters and show it can match or outperform traditional HPO methods like Bayesian optimization across different models on standard benchmarks.
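As a rough illustration of this kind of loop (the prompt wording and the helper names `query_llm` and `train_and_evaluate` are hypothetical, not the paper's exact protocol): show the LLM the trial history, parse its suggested configuration, evaluate it, and repeat.

```python
import json

def llm_hpo(query_llm, train_and_evaluate, budget=10):
    history = []  # (config, validation score) pairs seen so far
    for _ in range(budget):
        prompt = (
            "You are tuning learning_rate and weight_decay.\n"
            "Previous trials (config -> validation accuracy):\n"
            + "\n".join(f"{json.dumps(c)} -> {s:.4f}" for c, s in history)
            + "\nReply with the next config as a JSON object only."
        )
        config = json.loads(query_llm(prompt))   # parse the LLM's suggestion
        score = train_and_evaluate(config)       # run the actual training job
        history.append((config, score))
    return max(history, key=lambda cs: cs[1])    # best config found
```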
|
Unlearnable Algorithms for In-context Learning
Andrei Muresanu, Anvith Thudi, Michael R. Zhang, Nicolas Papernot
Preprint
Arxiv
We analyze unlearning in the in-context learning setting and propose an example-selection algorithm that is both efficient and amenable to unlearning.
|
Decomposed Prompting to Answer Questions on a Course Discussion Board
Brandon Jaipersaud, Paul Zhang, Jimmy Ba, Andrew Petersen, Lisa Zhang, Michael R. Zhang
AIED 2023
Paper
We propose and evaluate a question-answering system that uses decomposed prompting to classify and answer student questions on a course discussion board.
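A minimal sketch of a decomposed pipeline of this kind, with an illustrative category set and a hypothetical `query_llm` helper rather than the prompts used in the paper:

```python
def answer_post(query_llm, post, course_context):
    # Stage 1: classify the post into a coarse category.
    category = query_llm(
        "Classify this course discussion post as one of "
        "[conceptual, logistics, homework, other]:\n" + post
    ).strip().lower()
    if category == "other":
        return None  # leave it for a human TA
    # Stage 2: answer with a category-specific prompt plus course context.
    return query_llm(
        f"Course material:\n{course_context}\n\n"
        f"Answer this {category} question from a student:\n{post}"
    )
```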
|
Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve
Juhan Bae, Michael R. Zhang, Michael Ruan, Eric Wang, So Hasegawa, Jimmy Ba, Roger Grosse
ICLR 2023 (top-5% of accepted papers)
Paper
We propose the Multi-Rate VAE (MR-VAE), a hypernetwork that learns multiple VAEs with different rates in a single training run.
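A deliberately simplified caricature of training across many rates in one run (MR-VAE itself uses a hypernetwork parameterization; `elbo_terms` and `optimizer_step` are hypothetical callables):

```python
import math, random

def rate_conditioned_step(elbo_terms, optimizer_step):
    # Sample the KL weight log-uniformly each step, so one training run
    # covers the whole rate-distortion trade-off.
    beta = math.exp(random.uniform(math.log(1e-2), math.log(1e2)))
    recon, kl = elbo_terms(beta)       # beta-conditioned reconstruction/KL
    optimizer_step(recon + beta * kl)  # usual beta-VAE objective
    return beta
```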
|
Autoregressive Models for Offline Policy Evaluation and Optimization
Michael R. Zhang, Tom Le Paine, Ofir Nachum, Cosmin Paduraru, George Tucker, Ziyu Wang, Mohammad Norouzi
ICLR 2021
Paper /
Video
Autoregressive models learn better dynamics models than standard feedforward models in the fixed-dataset (offline) setting.
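A small sketch of autoregressive next-state sampling, assuming a hypothetical list of per-dimension models, each with a `sample` method:

```python
def sample_next_state(dim_models, state, action):
    next_state = []
    # Predict the next state one coordinate at a time, conditioning each
    # coordinate on the current state, the action, and the coordinates
    # already sampled.
    for model in dim_models:
        next_state.append(model.sample(state, action, next_state))
    return next_state
```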
|
Analyzing Monotonic Linear Interpolation in Neural Network Loss Landscapes
James Lucas, Juhan Bae, Michael R. Zhang, Stanislav Fort, Richard Zemel, Roger Grosse
ICML 2021
Paper
We investigate and analyze why, for many neural networks, the loss decreases monotonically along the straight line in weight space from initialization to the final solution.
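A small utility for checking the phenomenon, with `loss_fn` standing in for a model's training loss evaluated at a flattened weight vector:

```python
import numpy as np

def loss_along_line(loss_fn, theta_init, theta_final, num_points=50):
    # Evaluate the loss at evenly spaced points on the segment
    # (1 - a) * theta_init + a * theta_final; the monotonic linear
    # interpolation property holds when this curve is non-increasing in a.
    alphas = np.linspace(0.0, 1.0, num_points)
    return [loss_fn((1.0 - a) * theta_init + a * theta_final) for a in alphas]
```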
|
Objective Social Choice: Using Auxiliary Information to Improve Voting Outcomes
Silviu Pitis, Michael R. Zhang
AAMAS 2020
Paper /
Code /
Video
A framework and aggregation rules for combining the preferences of multiple agents with noisy views of some ground truth.
|
Lookahead Optimizer: k steps forward, 1 step back
Michael R. Zhang, James Lucas, Geoffrey Hinton, Jimmy Ba
NeurIPS 2019
Paper /
Code /
Video
An optimization algorithm that speeds up training by using a search direction generated by multiple steps of an inner optimizer.
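A minimal sketch of the Lookahead update, assuming `params` is a list of arrays and `inner_step` applies one inner-optimizer update in place (not tied to any particular framework):

```python
import copy

def lookahead(params, inner_step, k=5, alpha=0.5, outer_steps=100):
    # Slow weights (phi) start as a copy of the fast weights (theta).
    slow = copy.deepcopy(params)
    fast = params
    for _ in range(outer_steps):
        # k steps forward: update the fast weights with the inner optimizer.
        for _ in range(k):
            inner_step(fast)
        # 1 step back: move the slow weights toward the fast weights,
        # then restart the fast weights from the new slow weights.
        for i in range(len(slow)):
            slow[i] = slow[i] + alpha * (fast[i] - slow[i])
            fast[i] = copy.deepcopy(slow[i])
    return slow
```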
|
Reverse Curriculum Generation for Reinforcement Learning
Carlos Florensa, David Held, Markus Wulfmeier, Michael Zhang, Pieter Abbeel
CoRL 2017
Webpage /
Paper /
Blog post
An approach for tackling sparse-reward tasks by generating a curriculum of start states of intermediate difficulty.
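A rough sketch of the start-state generation idea, with hypothetical `random_step` and `success_rate` helpers standing in for environment rollouts and policy evaluation:

```python
import random

def nearby_starts(goal_state, random_step, success_rate,
                  num_candidates=100, max_walk=10, r_min=0.1, r_max=0.9):
    starts = []
    for _ in range(num_candidates):
        # Take a short random walk outward from the goal ...
        state = goal_state
        for _ in range(random.randint(1, max_walk)):
            state = random_step(state)
        # ... and keep states the current policy solves at an
        # intermediate rate: neither trivial nor hopeless.
        if r_min <= success_rate(state) <= r_max:
            starts.append(state)
    return starts
```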
|
Probabilistically Safe Policy Transfer
David Held, Zoe McCarthy, Michael Zhang, Fred Shentu, Pieter Abbeel
ICRA 2017
Paper /
Video
A framework for safely transferring policies learned in simulation to the real world.
|
Misc.
I try to adhere to the principle "journey before destination" (Brandon Sanderson).
Some activities I enjoy: basketball, running, exploring new places, reading, improv.
2020 Book Recs
Model Comparison Dashboard