Nikita Dhawan


I am a PhD student at the University of Toronto and the Vector Institute, supervised by Professors Chris Maddison and Roger Grosse. I completed my Bachelor's in Computer Science and Applied Mathematics at UC Berkeley, where I enjoyed working with Professor Sergey Levine.


Email  /  LinkedIn  /  Google Scholar  /  Twitter

Research

I am interested in developing algorithms and pipelines for reliable and trustworthy machine learning, with a particular focus on healthcare applications.

End-To-End Causal Effect Estimation from Unstructured Natural Language Data

Nikita Dhawan, Leonardo Cotta, Karen Ullrich, Rahul G. Krishnan, Chris J. Maddison
website / arXiv / code

We introduce NATURAL, a novel family of causal effect estimators built with LLMs that operate over datasets of unstructured text. Our estimators use LLM conditional distributions (over variables of interest, given the text data) to assist in the computation of classical estimators of causal effect. NATURAL estimators demonstrate remarkable performance, yielding causal effect estimates that fall within 3 percentage points of their ground truth counterparts, including on real-world Phase 3/4 clinical trials. Our results suggest that unstructured text data is a rich source of causal effect information, and NATURAL is a first step towards an automated pipeline to tap this resource.
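
As a rough illustration of the recipe (a minimal sketch in my own notation, not the paper's code): an LLM reads each free-text report and supplies the treatment, outcome, and propensity values that a classical estimator, here inverse propensity weighting, would normally get from a curated table. The LLM extraction step is mocked below with synthetic arrays.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000

    # Pretend these came from LLM conditional distributions over each
    # free-text report (the mocked extraction step).
    propensity = rng.uniform(0.2, 0.8, size=n)   # P(T=1 | text)
    treatment = rng.binomial(1, propensity)      # extracted treatment label
    outcome = rng.normal(2.0 * treatment, 1.0)   # extracted outcome; true ATE = 2

    # Classical inverse-propensity-weighted estimate of the average
    # treatment effect, assembled entirely from text-derived quantities.
    ate_hat = np.mean(treatment * outcome / propensity) - np.mean(
        (1 - treatment) * outcome / (1 - propensity)
    )
    print(f"IPW ATE estimate: {ate_hat:.2f}")    # close to the true value of 2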

Leveraging Function Space Aggregation for Federated Learning at Scale

Nikita Dhawan, Nicole Mitchell, Zachary Charles, Zachary Garrett, Gintare Karolina Dziugaite
TMLR, 2024
arXiv

Many federated learning algorithms, including the canonical Federated Averaging (FedAvg), take a direct (possibly weighted) average of the client parameter updates, motivated by results in distributed optimization. In this work, we adopt a function space perspective and propose a new algorithm, FedFish, that aggregates local approximations to the functions learned by clients, using an estimate based on their Fisher information. Our evaluation across several settings in image and language benchmarks shows that FedFish outperforms FedAvg as local training epochs increase. Further, FedFish results in global networks that are more amenable to efficient personalization via local fine-tuning on the same or shifted data distributions.
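
A minimal sketch of the aggregation idea (my simplification using a diagonal Fisher, not the authors' implementation): rather than FedAvg's plain mean, each parameter coordinate is averaged with weights given by the clients' Fisher estimates, so a client that is more certain about a coordinate pulls the global model harder toward its local value.

    import numpy as np

    def fedavg(thetas):
        # FedAvg baseline: plain mean of client parameter vectors.
        return np.mean(thetas, axis=0)

    def fisher_weighted_aggregate(thetas, fishers, eps=1e-8):
        # thetas, fishers: lists of per-client parameter / diagonal-Fisher
        # vectors of matching shape.
        thetas, fishers = np.stack(thetas), np.stack(fishers)
        return (fishers * thetas).sum(axis=0) / (fishers.sum(axis=0) + eps)

    # Toy example: the clients disagree on the second coordinate, but
    # client 0 has much higher Fisher information there, so its value
    # dominates the aggregate.
    thetas = [np.array([1.0, 4.0]), np.array([1.0, 0.0])]
    fishers = [np.array([1.0, 9.0]), np.array([1.0, 1.0])]
    print(fedavg(thetas))                              # [1.0, 2.0]
    print(fisher_weighted_aggregate(thetas, fishers))  # [1.0, 3.6]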

Efficient Parametric Approximations of Neural Network Function Space Distance

Nikita Dhawan, Sicong (Sheldon) Huang, Juhan Bae, Roger Grosse
ICML, 2023
arXiv / code

We consider a specific case of approximating the function space distance (FSD) over the training set, i.e., the average distance between the outputs of two ReLU neural networks, by treating the architecture as a linear network with stochastic gating. Despite requiring only one parameter per unit of the network, our parametric approximation is competitive with state-of-the-art nonparametric approximations that have larger memory requirements, when applied to continual learning and influence function estimation.
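
For concreteness, here is the exact quantity being approximated, computed naively (a sketch of the target, not the paper's parametric estimator, which avoids storing and re-running both networks):

    import torch
    import torch.nn as nn

    def exact_fsd(f1: nn.Module, f2: nn.Module, loader) -> float:
        # Average squared distance between two networks' outputs over a
        # dataset: the function space distance being approximated.
        total, count = 0.0, 0
        with torch.no_grad():
            for x, _ in loader:
                total += ((f1(x) - f2(x)) ** 2).sum().item()
                count += x.shape[0]
        return total / count

    # Toy usage with two small ReLU networks on random data.
    net1 = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
    net2 = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
    data = torch.utils.data.TensorDataset(torch.randn(256, 10), torch.zeros(256))
    loader = torch.utils.data.DataLoader(data, batch_size=64)
    print(exact_fsd(net1, net2, loader))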

Dataset Inference for Self-Supervised Models

Adam Dziedzic, Haonan Duan, Muhammad Ahmad Kaleem, Nikita Dhawan, Jonas Guan, Yannis Cattan, Franziska Boenisch, Nicolas Papernot
NeurIPS, 2022
arXiv

We introduce a new dataset inference defense for self-supervised models, built on the intuition that if an encoder is stolen from a victim, the log-likelihood of its output representations is higher on the victim's training data than on test data, whereas no such gap appears for an independently trained encoder. Our extensive empirical results in the vision domain demonstrate that dataset inference is a promising direction for defending self-supervised models against model stealing.
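
A minimal sketch of that intuition (a simplified density test, not the paper's full statistical procedure): fit a density model to a suspect encoder's representations of the victim's training data, then flag the encoder if its average log-likelihood is noticeably higher on training data than on held-out data.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def loglik_gap(encode, train_x, test_x):
        # encode: maps an array of inputs to representations.
        train_r, test_r = encode(train_x), encode(test_x)
        density = GaussianMixture(n_components=8, random_state=0).fit(train_r)
        # Mean log-likelihood gap; stolen encoders should show a larger gap.
        return density.score(train_r) - density.score(test_r)

    # Toy usage with a stand-in encoder on random data.
    rng = np.random.default_rng(0)
    encode = lambda x: np.tanh(x @ rng.normal(size=(16, 8)))
    train_x, test_x = rng.normal(size=(500, 16)), rng.normal(size=(500, 16))
    print(loglik_gap(encode, train_x, test_x))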

On the Difficulty of Defending Self-Supervised Learning against Model Extraction

Adam Dziedzic, Nikita Dhawan, Muhammad Ahmad Kaleem, Jonas Guan, Nicolas Papernot
ICML, 2022
arXiv

Recently, ML-as-a-Service providers have begun offering trained self-supervised models over inference APIs, which transform user inputs into useful representations for a fee. However, the high cost of training these models and their exposure over APIs both make black-box extraction a realistic security threat. We explore model stealing by constructing several novel attacks and evaluating existing classes of defenses.
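
A minimal sketch of the simplest such attack (illustrative architecture and names, not the paper's exact setup): query the victim's API with attacker-chosen inputs and train a local encoder to match the returned representations.

    import torch
    import torch.nn as nn

    # Stand-ins: `victim` plays the role of the encoder behind the API;
    # `stolen` is the attacker's local copy being trained.
    victim = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
    stolen = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
    opt = torch.optim.Adam(stolen.parameters(), lr=1e-3)

    for step in range(200):
        queries = torch.randn(128, 32)        # attacker-chosen inputs
        with torch.no_grad():
            targets = victim(queries)         # representations from the API
        loss = nn.functional.mse_loss(stolen(queries), targets)
        opt.zero_grad()
        loss.backward()
        opt.step()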

ARM: A Meta-Learning Approach for Tackling Group Shift

Marvin Zhang*, Henrik Marklund*, Nikita Dhawan*, Abhishek Gupta, Sergey Levine, Chelsea Finn
NeurIPS, 2021
website / arXiv

In real-world applications, machine learning systems are routinely tested under distribution shift. In this work, we consider the setting where the training data are structured into groups and test-time shifts correspond to changes in the group distribution. We propose to use ideas from meta-learning to learn models that are adaptable, and introduce the framework of adaptive risk minimization (ARM), a formalization of this setting.
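
A minimal sketch patterned on the context-based ARM variant (simplified; details differ from the paper): at both training and test time, the model receives a batch from a single group, summarizes the unlabeled batch into a context vector, and conditions its predictions on that context, so the training objective directly rewards successful adaptation.

    import torch
    import torch.nn as nn

    class ARMModel(nn.Module):
        def __init__(self, d_in=8, d_ctx=4, n_classes=2):
            super().__init__()
            self.context_net = nn.Linear(d_in, d_ctx)
            self.classifier = nn.Linear(d_in + d_ctx, n_classes)

        def forward(self, x_group):
            # Unlabeled batch statistics act as the adaptation signal.
            ctx = self.context_net(x_group).mean(dim=0, keepdim=True)
            ctx = ctx.expand(x_group.shape[0], -1)
            return self.classifier(torch.cat([x_group, ctx], dim=1))

    # Meta-training: sample a group, draw a batch from it, and minimize
    # the usual loss on the context-adapted predictions.
    model = ARMModel()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for step in range(100):
        x = torch.randn(32, 8) + torch.randint(0, 2, (1,)).float()  # toy group shift
        y = torch.randint(0, 2, (32,))
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()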

AVID: Learning Multi-Stage Tasks via Pixel-Level Translation of Human Videos

Laura Smith, Nikita Dhawan, Marvin Zhang, Pieter Abbeel, Sergey Levine
RSS, 2020
website / arXiv / blog

Humans can learn from watching others, imagining how they would perform the task themselves, and then practicing on their own. Can robots do the same? We adopt a similar strategy of imagination and practice in this project to solve complex, long-horizon tasks, like operating a coffee machine or getting objects from within a closed drawer.

Experience
Research Scientist Intern
Meta, December 2023 -- May 2024
Worked on causal effect estimation using natural language and LLMs; hosted by Karen Ullrich.

Student Researcher
Google, April 2023 -- December 2023
Worked on Federated Learning; hosted by Nicole Mitchell and Karolina Dziugaite.
Teaching
Teaching Assistant, CSC 311: Introduction to Machine Learning
Fall 2021 (University of Toronto)

Teaching Assistant, EECS 126: Probability and Random Processes
Fall 2020, Spring 2020 (UC Berkeley)

Reader, EECS 229A: Information Theory and Coding
Fall 2020 (UC Berkeley)
