I am a third-year PhD student supervised jointly by Rich Zemel and Roger Grosse.
I am interested in a wide range of questions in deep learning. How do we train neural networks that generalize well? How should we optimize neural networks? How can we impose functional constraints on neural networks? Can we use uncertainty effectively in deep neural networks?
- Lookahead Optimizer: k steps forward, 1 step back
: Lookahead iteratively updates two sets of weights. Intuitively, the algorithm chooses a search direction by looking ahead at the sequence of "fast weights" generated by another optimizer, then uses linear interpolation to update the "slow weights" (a minimal sketch of the update follows this list).
- Sorting out Lipschitz function approximation
: Common activation functions are insufficient for universal approximation in norm-constrained (1-Lipschitz) network architectures. Using a gradient-norm-preserving activation, GroupSort, we prove universal approximation in this setting and achieve provable adversarial robustness with a hinge loss (see the second sketch below).
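As a rough illustration of the Lookahead update rule, here is a minimal PyTorch sketch. Plain SGD stands in for the inner optimizer, and the function name, hyperparameter defaults, and training-loop details are illustrative rather than the paper's reference implementation:

```python
import torch

def lookahead_train(model, loss_fn, data_iter, k=5, alpha=0.5, lr=0.1, outer_steps=100):
    # Slow weights start as a copy of the model's (fast) weights.
    slow_weights = [p.detach().clone() for p in model.parameters()]
    inner_opt = torch.optim.SGD(model.parameters(), lr=lr)  # stand-in inner optimizer

    for _ in range(outer_steps):
        # Look ahead: k inner-optimizer steps generate the fast-weight sequence.
        for _ in range(k):
            x, y = next(data_iter)
            inner_opt.zero_grad()
            loss_fn(model(x), y).backward()
            inner_opt.step()

        # Interpolate: phi <- phi + alpha * (theta - phi), then reset the
        # fast weights to the new slow weights before the next look-ahead.
        with torch.no_grad():
            for p, slow in zip(model.parameters(), slow_weights):
                slow.add_(alpha * (p.detach() - slow))
                p.copy_(slow)
```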
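And a minimal PyTorch sketch of the GroupSort activation (the function name and default group size are my own choices; with groups of two this is the MaxMin variant discussed in the paper):

```python
import torch

def group_sort(x, group_size=2):
    # Split the last dimension into groups and sort within each group.
    n = x.shape[-1]
    assert n % group_size == 0, "feature dimension must be divisible by group_size"
    grouped = x.reshape(*x.shape[:-1], n // group_size, group_size)
    sorted_groups, _ = torch.sort(grouped, dim=-1)
    # Sorting only permutes activations, so the Jacobian is a permutation
    # matrix and gradient norms are preserved.
    return sorted_groups.reshape(x.shape)
```

For example, `group_sort(torch.randn(8, 16))` sorts each adjacent pair of the 16 features, acting as a drop-in replacement for an elementwise activation.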
In the Fall of 2017, I taught CSC411/2515 - Introduction to Machine Learning.