Classified by Research TopicSorted by DateClassified by Publication Type

Improved Streaming Algorithm for the Klee's Measure Problem and Generalizations

Improved Streaming Algorithm for the Klee's Measure Problem and Generalizations .
Mridul Nandi ⓡ N. V. Vinodchandran ⓡ Arijit Ghosh ⓡ Kuldeep S. Meel ⓡ Soumit Pal ⓡ Chakraborty,Sourav .
In Proceedings of the International Conference on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), August 2024.

Download

[PDF] 

Abstract

Estimating the size of the union of a stream of sets $S_1, S_2, łdots S_M$ where each set is a subset of a known universe $\Omega$ is a fundamental problem in data streaming. This problem naturally generalizes the well-studied $\FO$ estimation problem in the streaming literature, where each set contains a single element from the universe. We consider the general case when the sets $S_i$ can be succinctly represented and allow efficient membership, cardinality, and sampling queries (called a Delphic family of sets). A notable example in this framework is the Klee's Measure Problem (KMP), where every set $S_i$ is an axis-parallel rectangle in $d$-dimensional spaces ($\Omega = [\Delta]^d$ where $[\Delta] := łeft\1, \dots, \Delta \right\$ and $\Delta \in \mathbbN$). Recently, Meel, Chakraborty, and Vinodchandran (PODS-21, PODS-22) designed a streaming algorithm for $(\epsilon,\delta)$-estimation of the size of the union of set streams over Delphic family with space and update time complexity $Ołeft(\fracłog^3 |\Omega|\varepsilon^2\cdot łog \frac1\delta\right)$ and $\widetildeOłeft(\fracłog^4 |\Omega|\varepsilon^2\cdot łog \frac1\delta\right)$, respectively. This work presents a new, sampling-based algorithm for estimating the size of the union of Delphic sets that has space and update time complexity $\widetildeOłeft(\fracłog^2 |\Omega|\varepsilon^2\cdot łog \frac1\delta\right)$. This improves the space complexity bound by a $łog|\Omega|$ factor and update time complexity bound by a $łog^2 |\Omega|$ factor. A critical question is whether quadratic dependence of $łog |\Omega|$ on space and update time complexities is necessary. Specifically, can we design a streaming algorithm for estimating the size of the union of sets over Delphic family with space and complexity linear in $łog |\Omega|$ and update time $\poly(łog |\Omega|)$? While this appears technically challenging, we show that establishing a lower bound of $ømega(łog |\Omega|)$ with $\poly(łog |\Omega|)$ update time is beyond the reach of current techniques. Specifically, we show that under certain hard-to-prove computational complexity hypothesis, there is a streaming algorithm for the problem with optimal space complexity $O(łog |\Omega|)$ update time $\poly(łog(|\Omega|))$. Thus, establishing a space lower bound of $ømega(łog |\Omega|)$ will lead to break-through complexity class separation results.

BibTeX

@inproceedings{MVGMPC24,
  title={
    Improved Streaming Algorithm for the Klee's Measure Problem and
    Generalizations
  },
  author={
    Nandi, Mridul and Vinodchandran, N. V.
    and Ghosh, Arijit
    and Meel, Kuldeep S.
    and Pal, Soumit
    and Chakraborty,Sourav
  },
  booktitle=APPROX,
  abstract={
    Estimating the size of the union of a stream of sets $S_1, S_2, \ldots S_M$
    where each set is a subset of a known universe $\Omega$ is a fundamental
    problem in data streaming. This problem naturally generalizes the
    well-studied $\FO$ estimation problem in the streaming literature, where
    each set contains a single element from the universe. We consider the
    general case when the sets $S_i$ can be succinctly represented and allow
    efficient membership, cardinality, and sampling queries (called a Delphic
    family of sets). A notable example in this framework is the Klee's Measure
    Problem (KMP), where every set $S_i$ is an axis-parallel rectangle in
    $d$-dimensional spaces ($\Omega = [\Delta]^d$ where $[\Delta] := \left\{1,
    \dots, \Delta \right\}$ and $\Delta \in \mathbb{N}$). Recently, Meel,
    Chakraborty, and Vinodchandran (PODS-21, PODS-22) designed a streaming
    algorithm for $(\epsilon,\delta)$-estimation of the size of the union of set
    streams over Delphic family
    with space and update time complexity $O\left(\frac{\log^3
    |\Omega|}{\varepsilon^2}\cdot \log \frac{1}{\delta}\right)$ and
    $\widetilde{O}\left(\frac{\log^{4} |\Omega|}{\varepsilon^2}\cdot \log
    \frac{1}{\delta}\right)$, respectively.
    This work presents a new, sampling-based algorithm for estimating the size
    of the union of Delphic sets that has space and update time complexity
    $\widetilde{O}\left(\frac{\log^2 |\Omega|}{\varepsilon^2}\cdot \log
    \frac{1}{\delta}\right)$. This improves the space complexity bound by a
    $\log|\Omega|$ factor and update time complexity bound by a $\log^2
    |\Omega|$ factor.
    A critical question is whether quadratic dependence of $\log |\Omega|$ on
    space and update time complexities is necessary. Specifically, can we design
    a streaming algorithm for estimating the size of the union of sets over
    Delphic family with space and complexity linear in $\log |\Omega|$ and
    update time $\poly(\log |\Omega|)$? While this appears technically
    challenging, we show that establishing a lower bound of $\omega(\log
    |\Omega|)$ with $\poly(\log |\Omega|)$ update time is beyond the reach of
    current techniques. Specifically, we show that under certain hard-to-prove
    computational complexity hypothesis, there is a streaming algorithm for the
    problem with optimal space complexity $O(\log |\Omega|)$ update time
    $\poly(\log(|\Omega|))$. Thus, establishing a space lower bound of
    $\omega(\log |\Omega|)$ will lead to break-through complexity class
    separation results.
  },
  nameorder={random},
  year={2024},
  month=aug,
  bib2html_rescat={Data Streams},
  bib2html_pubtype={Refereed Conference},
  bib2html_dl_pdf={},
}

Generated by bib2html.pl (written by Patrick Riley with layout from Sanjit A. Seshia ) on Tue Apr 28, 2026 01:27:21