Nikita Dhawan

I am a PhD student at the University of Toronto and Vector Institute, advised by Chris Maddison and Roger Grosse. Previously, I completed a Bachelor's degree in Computer Science and Applied Math at UC Berkeley. I enjoy building machine learning pipelines and assessing their reliability. I am particularly interested in their potential to improve healthcare, in both efficacy and efficiency.

Selected Publications

End-To-End Causal Effect Estimation from Unstructured Natural Language Data
NeurIPS, 2024

We introduce NATURAL, a novel family of causal effect estimators built with LLMs that operate over datasets of unstructured text. Our estimators use LLM conditional distributions (over variables of interest, given the text data) to assist in the computation of classical estimators of causal effect. NATURAL estimators demonstrate remarkable performance, yielding causal effect estimates that fall within 3 percentage points of their ground truth counterparts, including on real-world Phase 3/4 clinical trials. Our results suggest that unstructured text data is a rich source of causal effect information, and NATURAL is a first step towards an automated pipeline to tap this resource.
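To illustrate the idea, here is a minimal sketch of how an LLM's conditional distributions over treatment and outcome, given free text, could feed a classical causal estimator. The `llm_prob` callable is a hypothetical stand-in for an LLM query, and inverse-propensity weighting is just one classical estimator the NATURAL family could plug into; this is not the paper's exact pipeline.

```python
# Sketch: LLM conditional probabilities + a classical IPW estimator.
from typing import Callable, List
import numpy as np

def natural_ipw_estimate(
    texts: List[str],
    llm_prob: Callable[[str, str], float],  # hypothetical: P(variable=1 | text)
) -> float:
    """Estimate an average treatment effect from unstructured text records."""
    effects = []
    for text in texts:
        p_t = llm_prob(text, "treatment")   # P(T=1 | text) from the LLM
        p_y = llm_prob(text, "outcome")     # P(Y=1 | text) from the LLM
        t = float(p_t > 0.5)                # hard treatment assignment, for illustration
        # Inverse-propensity weighting, using the LLM's propensity estimate.
        weight = t / max(p_t, 1e-3) - (1 - t) / max(1 - p_t, 1e-3)
        effects.append(weight * p_y)
    return float(np.mean(effects))
```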

Leveraging Function Space Aggregation for Federated Learning at Scale
TMLR, 2024

Many federated learning algorithms, including the canonical Federated Averaging (FedAvg), take a direct (possibly weighted) average of the client parameter updates, motivated by results in distributed optimization. In this work, we adopt a function space perspective and propose a new algorithm, FedFish, that aggregates local approximations to the functions learned by clients, using an estimate based on their Fisher information. Our evaluation across several settings in image and language benchmarks shows that FedFish outperforms FedAvg as local training epochs increase. Further, FedFish results in global networks that are more amenable to efficient personalization via local fine-tuning on the same or shifted data distributions.
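Below is a minimal sketch of Fisher-weighted aggregation in the spirit of FedFish: rather than averaging client parameters directly as in FedAvg, each client's parameters are weighted per-coordinate by a diagonal Fisher information estimate. The diagonal-Fisher choice and variable names are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: aggregate theta_g = sum_i F_i * theta_i / sum_i F_i, per parameter.
import numpy as np

def fisher_weighted_aggregate(client_params, client_fishers, eps=1e-8):
    """Combine client parameter vectors using diagonal Fisher weights."""
    num = np.zeros_like(client_params[0])
    den = np.zeros_like(client_params[0])
    for theta, fisher in zip(client_params, client_fishers):
        num += fisher * theta
        den += fisher
    return num / (den + eps)

# Usage: two clients with flattened parameters and diagonal Fisher estimates.
params = [np.array([1.0, 2.0]), np.array([3.0, 0.0])]
fishers = [np.array([0.9, 0.1]), np.array([0.1, 0.9])]
print(fisher_weighted_aggregate(params, fishers))
```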

Efficient Parametric Approximations of Neural Network Function Space Distance
ICML, 2023

We consider the specific case of approximating the function space distance (FSD) over a training set, i.e., the average distance between the outputs of two ReLU neural networks, by approximating the architecture as a linear network with stochastic gating. Despite requiring only one parameter per unit of the network, our parametric approximation is competitive with state-of-the-art nonparametric approximations that have larger memory requirements, when applied to continual learning and influence function estimation.
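For context, the snippet below only computes the exact quantity being approximated: the average output discrepancy between two networks over a batch of training inputs, assuming a squared-error output metric. The paper's contribution is a cheap parametric surrogate for this quantity (one gating parameter per unit), which is not shown here.

```python
# Sketch: the exact function space distance that the parametric surrogate targets.
import torch
import torch.nn as nn

def exact_fsd(net_a: nn.Module, net_b: nn.Module, inputs: torch.Tensor) -> torch.Tensor:
    """Average squared distance between the two networks' outputs on `inputs`."""
    with torch.no_grad():
        return ((net_a(inputs) - net_b(inputs)) ** 2).sum(dim=1).mean()

# Usage with two small ReLU MLPs on random data (illustrative only).
net_a = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
net_b = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
print(exact_fsd(net_a, net_b, torch.randn(32, 4)))
```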

On the Difficulty of Defending Self-Supervised Learning against Model Extraction
ICML, 2022

Recently, ML-as-a-Service providers have begun offering trained self-supervised models over inference APIs, which transform user inputs into useful representations for a fee. However, the high cost of training these models and their exposure over APIs both make black-box extraction a realistic security threat. We explore model stealing by constructing several novel attacks and evaluating existing classes of defenses.
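As a rough illustration of the threat model, here is a minimal sketch of black-box extraction against a self-supervised encoder exposed over an API: query the victim with unlabeled inputs, collect its representations, and train a local copy to match them. The `victim_api` callable and the MSE objective are illustrative assumptions; the paper studies several attack variants and defenses.

```python
# Sketch: train a stolen encoder to mimic a victim encoder's representations.
import torch
import torch.nn as nn

def extract_encoder(victim_api, stolen: nn.Module, queries: torch.Tensor, steps: int = 100):
    """Fit `stolen` to reproduce the victim's representations on `queries`."""
    opt = torch.optim.Adam(stolen.parameters(), lr=1e-3)
    with torch.no_grad():
        targets = victim_api(queries)   # representations returned by the API
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(stolen(queries), targets)
        loss.backward()
        opt.step()
    return stolen
```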

Education and Experience

PhD Student

September 2021 -- Present

University of Toronto, Vector Institute

Machine Learning group; advised by Chris Maddison and Roger Grosse.

Research Scientist Intern

December 2023 -- May 2024

Meta AI

Worked on causal effect estimation using natural language and LLMs; hosted by Karen Ullrich.

Student Researcher

April 2023 -- December 2023

Google Research

Worked on Federated Learning; hosted by Nicole Mitchell and Karolina Dziugaite.

Undergraduate Student

August 2017 -- December 2020

UC Berkeley

Bachelor of Arts in Computer Science and Applied Math.

Highest Honors in Applied Math and High Distinction in General Scholarship.

Undergraduate Researcher

May 2019 -- December 2020

UC Berkeley

Berkeley Artificial Intelligence Research group; supervised by Sergey Levine.

Teaching

Course Instructor

CSC 413: Neural Networks and Deep Learning, University of Toronto

Teaching Assistant

CSC 311: Introduction to Machine Learning, University of Toronto

Teaching Assistant

EECS 126: Probability and Random Processes, UC Berkeley

Reader

EECS 229A: Information Theory and Coding, UC Berkeley