The Linearized Activation Function TRick (LAFTR) enabled an efficient parametric approximation of function space distance (FSD) for linear networks with ReLU activations. We extend this to a more general LInearized Function TRick (LIFTR) that enables data-free FSD estimation for arbitrary architectures, with a particular focus on transformers. On a modular arithmetic continual learning task, a stochastic variant of LIFTR approaches oracle performance while outperforming parameter-space linearization baselines.
We generalize existing worst-case frameworks to estimate the sensitivity of causal estimates to violations of three common assumptions in causal inference. Empirically, worst-case conclusions about sensitivity can rely on unrealistic changes in the data-generating process. To overcome this limitation, we introduce a new criterion, the Bayesian Sensitivity Value (BSV), which computes the expected sensitivity of an estimate to assumption violations under priors constructed from real-world evidence.
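The core idea of replacing a worst-case bound by an expectation under a prior can be sketched in a few lines. The following is an illustrative Monte Carlo sketch, not the paper's actual procedure: `estimate_fn` is a hypothetical function mapping a violation strength to the resulting causal estimate, and the half-normal prior in the usage example stands in for the evidence-based priors the abstract describes.

```python
import numpy as np

def bayesian_sensitivity_value(estimate_fn, prior_sampler, n_draws=10_000, seed=0):
    """Monte Carlo estimate of expected sensitivity: the average shift in the
    causal estimate when the assumption-violation parameter is drawn from a
    prior, rather than taking the worst case over all admissible violations."""
    rng = np.random.default_rng(seed)
    baseline = estimate_fn(0.0)  # estimate under no assumption violation
    gammas = prior_sampler(rng, n_draws)
    shifts = np.abs(np.array([estimate_fn(g) for g in gammas]) - baseline)
    return shifts.mean()

# Toy usage: estimate shifts linearly with violation strength gamma,
# and gamma is given a half-normal prior (both purely illustrative).
est = lambda gamma: 2.0 + 0.5 * gamma
half_normal = lambda rng, n: np.abs(rng.normal(0.0, 1.0, size=n))
bsv = bayesian_sensitivity_value(est, half_normal, n_draws=50_000)
```

A worst-case analysis over the same prior's support would report an unbounded (or support-maximal) shift; the expectation instead weights each violation by how plausible the prior considers it.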
We introduce NATURAL, a novel family of causal effect estimators built with LLMs that operate over datasets of unstructured text. Our estimators use LLM conditional distributions (over variables of interest, given the text data) to assist in the computation of classical estimators of causal effect. NATURAL estimators demonstrate remarkable performance, yielding causal effect estimates that fall within 3 percentage points of their ground truth counterparts, including on real-world Phase 3/4 clinical trials. Our results suggest that unstructured text data is a rich source of causal effect information, and NATURAL is a first step towards an automated pipeline to tap this resource.
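To make the idea concrete, here is a minimal sketch of how an LLM's conditional distribution could feed a classical estimator. The abstract does not specify which estimators NATURAL uses; this example plugs hypothetical LLM-derived probabilities p(treated | text) into a standard inverse-propensity-weighted (IPW) estimate of the average treatment effect. All names and numbers are illustrative.

```python
import numpy as np

def ipw_ate(treatment, outcome, propensity):
    """Inverse-propensity-weighted average treatment effect.
    `propensity` would, in a NATURAL-style pipeline, come from an LLM's
    conditional distribution over treatment status given the raw text;
    here it is just an array of probabilities supplied by the caller."""
    t = np.asarray(treatment, dtype=float)
    y = np.asarray(outcome, dtype=float)
    # Clip to avoid exploding weights from near-0/1 propensities.
    e = np.clip(np.asarray(propensity, dtype=float), 1e-3, 1 - 1e-3)
    return np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))

# Toy usage with hand-written "LLM" propensities:
ate = ipw_ate(treatment=[1, 0, 1, 0], outcome=[1, 0, 1, 0],
              propensity=[0.5, 0.5, 0.5, 0.5])
```

The LLM's role in this sketch is confined to producing the `propensity` array; the downstream estimator is entirely classical, which matches the abstract's framing of LLM conditionals "assisting" standard causal estimators.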
We consider the problem of approximating the function space distance (FSD) over the training set, i.e. the average distance between the outputs of two ReLU neural networks, by approximating the architecture as a linear network with stochastic gating. Despite requiring only one parameter per unit of the network, our parametric approximation is competitive with state-of-the-art nonparametric approximations that have larger memory requirements, when applied to continual learning and influence function estimation.