Classified by Research TopicSorted by DateClassified by Publication Type

Distance Estimation for High-Dimensional Discrete Distributions

Distance Estimation for High-Dimensional Discrete Distributions.
Kuldeep S. Meel, Gunjan Kumar and Yash Pote.
In Proceedings of International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 955–963, May 2025.

Download

[PDF] 

Abstract

Given two distributions $\mathcalP$ and $\mathcalQ$ over a high-dimensional domain $\0,1\^n$, and a parameter $\varepsilon$, the goal of distance estimation is to determine the statistical distance between $\mathcalP$ and $\mathcalQ$, up to an additive tolerance $\pm \varepsilon$. Since exponential lower bounds (in $n$) are known for the problem in the standard sampling model, research has focused on richer query models where one can draw conditional samples. This paper presents the first polynomial query distance estimator in the conditional sampling model ($\mathsfCOND$). We base our algorithm on the relatively weaker subcube conditional sampling ($\mathsfSUBCOND$) oracle, which draws samples from the distribution conditioned on some of the dimensions. $\mathsfSUBCOND$ is a promising model for widespread practical use because it captures the natural behavior of discrete samplers. Our algorithm makes $\tilde\mathcalO(n^3/\varepsilon^5)$ queries to $\mathsfSUBCOND$.

BibTeX

@inproceedings{MKP25,
  title={Distance Estimation for High-Dimensional Discrete Distributions},
  author={Meel, Kuldeep S. and Kumar, Gunjan and Pote, Yash},
  booktitle=AISTATS,
  pages={955--963},
  year={2025},
  month=may,
  bib2html_rescat={Distribution Testing},
  bib2html_pubtype={Refereed Conference},
  bib2html_dl_pdf={https://arxiv.org/pdf/2308.04264},
  abstract={
    Given two distributions $\mathcal{P}$ and $\mathcal{Q}$ over a
    high-dimensional domain $\{0,1\}^n$, and a parameter $\varepsilon$, the goal
    of distance estimation is to determine the statistical distance between
    $\mathcal{P}$ and $\mathcal{Q}$, up to an additive tolerance $\pm
    \varepsilon$. Since exponential lower bounds (in $n$) are known for the
    problem in the standard sampling model, research has focused on richer query
    models where one can draw conditional samples. This paper presents the first
    polynomial query distance estimator in the conditional sampling model
    ($\mathsf{COND}$). We base our algorithm on the relatively weaker
    \textit{subcube conditional} sampling ($\mathsf{SUBCOND}$) oracle, which
    draws samples from the distribution conditioned on some of the dimensions.
    $\mathsf{SUBCOND}$ is a promising model for widespread practical use because
    it captures the natural behavior of discrete samplers. Our algorithm makes
    $\tilde{\mathcal{O}}(n^3/\varepsilon^5)$ queries to $\mathsf{SUBCOND}$.
  },
}

Generated by bib2html.pl (written by Patrick Riley with layout from Sanjit A. Seshia ) on Tue Apr 28, 2026 01:27:21