Research
I am interested in developing algorithms and pipelines for reliable and trustworthy machine learning, with a particular focus on healthcare applications.
|
End-To-End Causal Effect Estimation from Unstructured Natural Language Data
Nikita Dhawan,
Leonardo Cotta,
Karen Ullrich,
Rahul G. Krishnan,
Chris J. Maddison
website /
arXiv /
code
We introduce NATURAL, a novel family of causal effect estimators built with LLMs that operate over datasets of unstructured text.
Our estimators use LLM conditional distributions (over variables of interest, given the text data) to assist in the computation of classical estimators of causal effect.
NATURAL estimators demonstrate remarkable performance, yielding causal effect estimates that fall within 3 percentage points of their ground truth counterparts, including on real-world Phase 3/4 clinical trials.
Our results suggest that unstructured text data is a rich source of causal effect information, and NATURAL is a first step towards an automated pipeline to tap this resource.
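As a rough illustration of how LLM outputs can feed a classical estimator, the sketch below plugs hypothetical LLM-derived propensity scores and extracted labels into an inverse-propensity-weighted (IPW) estimate of the average treatment effect; it is not the paper's full pipeline, and random numbers stand in for real LLM outputs.
```python
import numpy as np

def ipw_ate(propensity, treated, outcome):
    """Hajek-style IPW estimate of the average treatment effect.

    propensity: P(T=1 | text) per report, e.g. read off an LLM's conditional distribution
    treated, outcome: binary labels extracted from the same reports
    """
    p = np.clip(propensity, 0.05, 0.95)            # clip to stabilize the weights
    w1, w0 = treated / p, (1 - treated) / (1 - p)  # inverse-propensity weights
    return (w1 @ outcome) / w1.sum() - (w0 @ outcome) / w0.sum()

# Toy usage: random placeholders for LLM-derived quantities (true effect is ~0.2).
rng = np.random.default_rng(0)
p = rng.uniform(0.2, 0.8, 1000)
t = rng.binomial(1, p)
y = rng.binomial(1, 0.3 + 0.2 * t)
print(round(ipw_ate(p, t, y), 3))
```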
|
Leveraging Function Space Aggregation for Federated Learning at Scale
Nikita Dhawan,
Nicole Mitchell,
Zachary Charles,
Zachary Garrett,
Gintare Karolina Dziugaite
TMLR, 2024
arXiv
Many federated learning algorithms, including the canonical Federated Averaging (FedAvg), take a direct (possibly weighted) average of the client parameter updates, motivated by results in distributed optimization.
In this work, we adopt a function space perspective and propose a new algorithm, FedFish, that aggregates local approximations to the functions learned by clients, using an estimate based on their Fisher information.
Our evaluation across several settings in image and language benchmarks shows that FedFish outperforms FedAvg as local training epochs increase.
Further, FedFish results in global networks that are more amenable to efficient personalization via local fine-tuning on the same or shifted data distributions.
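The exact FedFish update follows the paper; the snippet below only sketches the underlying intuition with a diagonal Fisher approximation, weighting each client's parameters by how strongly its local function depends on them (all quantities are illustrative).
```python
import numpy as np

def fisher_weighted_average(client_params, client_fishers, eps=1e-8):
    """Aggregate client parameter vectors, weighting each coordinate by the
    client's (diagonal) Fisher information estimate for that coordinate."""
    params = np.stack(client_params)     # (num_clients, num_params)
    fishers = np.stack(client_fishers)   # same shape, nonnegative
    return (fishers * params).sum(axis=0) / (fishers.sum(axis=0) + eps)

# Toy usage: each client dominates the coordinate it is most certain about.
theta_a, fisher_a = np.array([1.0, 0.0]), np.array([10.0, 0.1])
theta_b, fisher_b = np.array([0.0, 1.0]), np.array([0.1, 10.0])
print(fisher_weighted_average([theta_a, theta_b], [fisher_a, fisher_b]))
```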
|
Efficient Parametric Approximations of Neural Network Function Space Distance
Nikita Dhawan,
Sicong (Sheldon) Huang,
Juhan Bae,
Roger Grosse
ICML, 2023
arXiv /
code
We consider a specific case of approximating the function space distance (FSD) over the training set, i.e., the average distance between the outputs of two ReLU neural networks, by modeling the architecture as a linear network with stochastic gating. Despite requiring only one parameter per unit of the network, our parametric approximation is competitive with state-of-the-art nonparametric approximations that have larger memory requirements, when applied to continual learning and influence function estimation.
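The snippet below gives a rough, hypothetical illustration of the idea for a single hidden layer: each ReLU is replaced by a Bernoulli gate whose probability (one scalar per unit) summarizes how often the unit is active, and the FSD between two sets of weights is estimated by Monte Carlo over gates and stored input statistics; the paper's actual construction and estimator are more careful than this.
```python
import numpy as np

rng = np.random.default_rng(0)

def approx_fsd(W1_a, W2_a, W1_b, W2_b, gate_probs, x_mean, x_cov, n_samples=2000):
    """Monte Carlo estimate of E_x ||f_a(x) - f_b(x)||^2 under a linearized
    one-hidden-layer model with shared Bernoulli gates."""
    xs = rng.multivariate_normal(x_mean, x_cov, size=n_samples)          # stored input statistics
    gates = rng.binomial(1, gate_probs, size=(n_samples, len(gate_probs)))
    out_a = (gates * (xs @ W1_a.T)) @ W2_a.T                             # gated linear network A
    out_b = (gates * (xs @ W1_b.T)) @ W2_b.T                             # gated linear network B
    return np.mean(np.sum((out_a - out_b) ** 2, axis=1))

# Toy usage with random weights standing in for two checkpoints.
d_in, d_hidden, d_out = 5, 16, 3
W1_a, W2_a = rng.normal(size=(d_hidden, d_in)), rng.normal(size=(d_out, d_hidden))
W1_b, W2_b = W1_a + 0.1 * rng.normal(size=W1_a.shape), W2_a.copy()
gate_probs = np.full(d_hidden, 0.5)   # one stored parameter per hidden unit
print(approx_fsd(W1_a, W2_a, W1_b, W2_b, gate_probs, np.zeros(d_in), np.eye(d_in)))
```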
|
Dataset Inference for Self-Supervised Models
Adam Dziedzic,
Haonan Duan,
Muhammad Ahmad Kaleem,
Nikita Dhawan,
Jonas Guan,
Yannis Cattan,
Franziska Boenisch,
Nicolas Papernot
NeurIPS, 2022
arXiv
We introduce a new dataset inference defense for self-supervised models, based on the intuition that an encoder's output representations have higher log-likelihood on the victim's training data than
on test data if the encoder was stolen from the victim, but not if it was trained independently. Our extensive empirical results in the vision
domain demonstrate that dataset inference is a promising direction for defending self-supervised models against model stealing.
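As a minimal sketch of the statistical intuition only (the paper's defense uses richer density models and test statistics), one could embed the victim's training and held-out points with the suspect encoder, fit a simple density to the representations, and test whether training points score significantly higher; `suspect_encoder` below is a hypothetical callable mapping inputs to representation vectors.
```python
import numpy as np
from scipy import stats

def dataset_inference_pvalue(suspect_encoder, victim_train, held_out):
    reps_train = suspect_encoder(victim_train)
    reps_test = suspect_encoder(held_out)
    # Fit a diagonal Gaussian to all representations as a crude density model.
    all_reps = np.vstack([reps_train, reps_test])
    mu, sigma = all_reps.mean(0), all_reps.std(0) + 1e-6
    loglik = lambda r: stats.norm.logpdf(r, mu, sigma).sum(axis=1)
    # One-sided test: a stolen encoder should assign higher likelihood to training points.
    _, p = stats.ttest_ind(loglik(reps_train), loglik(reps_test),
                           alternative="greater", equal_var=False)
    return p  # a small p-value is evidence the encoder depends on the victim's data
```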
|
On the Difficulty of Defending Self-Supervised Learning against Model Extraction
Adam Dziedzic,
Nikita Dhawan,
Muhammad Ahmad Kaleem,
Jonas Guan,
Nicolas Papernot
ICML, 2022
arXiv
Recently, ML-as-a-Service providers have begun offering trained self-supervised models over inference APIs, which transform
user inputs into useful representations for a fee. However, the high cost of training these models and their exposure over APIs
both make black-box extraction a realistic security threat. We explore model stealing by constructing several novel attacks and evaluating
existing classes of defenses.
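A minimal, hypothetical sketch of the simplest kind of attack in this space: query the victim's API for representations of unlabeled public inputs and train a local student encoder to match them directly (here with an MSE objective); `victim_api` stands in for a black-box endpoint.
```python
import torch
import torch.nn as nn

def extract_encoder(victim_api, public_inputs, rep_dim, steps=200, lr=1e-3):
    student = nn.Sequential(nn.Linear(public_inputs.shape[1], 256),
                            nn.ReLU(),
                            nn.Linear(256, rep_dim))
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    with torch.no_grad():
        targets = victim_api(public_inputs)   # in practice, one paid API call per input
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(student(public_inputs), targets)
        loss.backward()
        opt.step()
    return student

# Toy usage: a random linear map plays the victim encoder.
victim = lambda x: x @ torch.randn(20, 16)
stolen = extract_encoder(victim, torch.randn(512, 20), rep_dim=16)
```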
|
ARM: A Meta-Learning Approach for Tackling Group Shift
Marvin Zhang*,
Henrik Marklund*,
Nikita Dhawan*,
Abhishek Gupta,
Sergey Levine,
Chelsea Finn
NeurIPS, 2021
website /
arXiv
Machine learning systems deployed in real-world applications are regularly faced with distribution shift. In this work, we consider the setting where
the training data are structured into groups and test-time shifts correspond to changes in the group distribution. We propose to use ideas from
meta-learning to learn models that are adaptable, and introduce the framework of adaptive risk minimization (ARM), a formalization of this setting.
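A minimal sketch in the spirit of a contextual variant of this idea (shapes and group sampling are hypothetical): a context network summarizes an unlabeled batch drawn from one group, the prediction network conditions on that summary, and meta-training over groups teaches the model to adapt from whatever batch it sees at test time.
```python
import torch
import torch.nn as nn

d_in, d_ctx, n_classes = 10, 4, 3
context_net = nn.Linear(d_in, d_ctx)
predict_net = nn.Linear(d_in + d_ctx, n_classes)
opt = torch.optim.Adam(list(context_net.parameters()) + list(predict_net.parameters()), lr=1e-3)

def arm_step(x_group, y_group):
    """One meta-training step on a batch drawn from a single group."""
    ctx = context_net(x_group).mean(0, keepdim=True)   # batch-level summary
    ctx = ctx.expand(x_group.shape[0], -1)             # broadcast summary to each example
    logits = predict_net(torch.cat([x_group, ctx], dim=1))
    loss = nn.functional.cross_entropy(logits, y_group)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Toy usage: one step on a random batch from one "group".
x, y = torch.randn(32, d_in), torch.randint(0, n_classes, (32,))
print(arm_step(x, y))
```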
|
AVID: Learning Multi-Stage Tasks via Pixel-Level Translation of Human Videos
Laura Smith,
Nikita Dhawan,
Marvin Zhang,
Pieter Abbeel,
Sergey Levine
RSS, 2020
website /
arXiv /
blog
Humans can learn from watching others, imagining how they would perform the task themselves, and then practicing on their own.
Can robots do the same? In this project, we adopt a similar strategy of imagination and practice to solve complex, long-horizon tasks,
such as operating a coffee machine or retrieving objects from a closed drawer.
|
Research Scientist Intern
Meta, December 2023 -- May 2024
Worked on causal effect estimation using natural language and LLMs; hosted by Karen Ullrich.
Student Researcher
Google, April 2023 -- December 2023
Worked on federated learning; hosted by Nicole Mitchell and Karolina Dziugaite.
|
Teaching Assistant, CSC 311: Introduction to Machine Learning
Fall 2021 (University of Toronto)
Teaching Assistant, EECS 126: Probability and Random Processes
Fall 2020, Spring 2020 (UC Berkeley)
Reader, EECS 229A: Information Theory and Coding
Fall 2020 (UC Berkeley)
|