I am a second-year PhD student supervised jointly by Rich Zemel and Roger Grosse.
I am interested in a wide range of questions applicable to deep learning. How do we train neural networks which generalize well? How should we be optimizing neural networks? How does optimization affect generalization? How can we impose functional constraints on neural networks? Can we utilize uncertainty effectively in deep neural networks?
Sorting out Lipschitz function approximation: Common activation functions are insufficient for norm-constrained (1-Lipschitz) network architectures. By using a gradient norm preserving activation, GroupSort, we prove universal approximation in this setting and achieve provable adversarial robustness with hinge loss.
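As a rough illustration of the idea, a GroupSort-style activation can be sketched in a few lines of pure Python. The function name `group_sort`, the `group_size` parameter, and the example values are assumptions made for illustration and are not taken from the paper's code. The sketch only shows that sorting within groups permutes the pre-activations, which is why the operation leaves the vector's norm unchanged.

```python
import math


def group_sort(x, group_size=2):
    """Sort the entries of x in ascending order within consecutive groups.

    Hypothetical helper for illustration: x is a flat list of
    pre-activations whose length is a multiple of group_size.
    """
    if group_size < 1 or len(x) % group_size != 0:
        raise ValueError("len(x) must be a positive multiple of group_size")
    out = []
    for start in range(0, len(x), group_size):
        # Each group is only reordered, so no value is changed or dropped.
        out.extend(sorted(x[start:start + group_size]))
    return out


def l2_norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(t * t for t in v))


pre_activation = [3.0, -1.0, 0.5, 2.0]
activated = group_sort(pre_activation, group_size=2)

print(activated)
print(math.isclose(l2_norm(pre_activation), l2_norm(activated)))
```

Because the output is a permutation of the input, the norm check in the last line holds for any input, in contrast to an activation such as ReLU, which zeroes negative entries and can shrink the norm.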
In the Fall of 2017 I taught CSC411/2515 - Introduction to Machine Learning.