A New Information Complexity Measure for Multi-pass Streaming with Applications

Braverman, Mark; Garg, Sumegha; Li, Qian; Wang, Shuo; Woodruff, David P.; Zhang, Jiapeng

Abstract: We introduce a new notion of information complexity for multi-pass streaming problems and use it to resolve several important questions in data streams.
In the coin problem, one sees a stream of $n$ i.i.d. uniform bits and wants to compute the majority with constant advantage. We show that any constant-pass algorithm must use $\Omega(\log n)$ bits of memory, significantly extending an earlier $\Omega(\log n)$ bit lower bound for single-pass algorithms of Braverman-Garg-Woodruff (FOCS, 2020). This also gives the first $\Omega(\log n)$ bit lower bound for more than one pass for the problem of approximating a counter up to a constant factor in worst-case turnstile streams.
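To make the coin problem setup concrete, the following is a minimal Python sketch of the trivial single-pass baseline that keeps a signed running balance, which fits in $O(\log n)$ bits. The names `coin_stream` and `majority_by_counter` are illustrative assumptions and do not come from the paper; the sketch only illustrates the kind of algorithm the $\Omega(\log n)$ lower bound is measured against, not the paper's proof technique.

```python
import random


def coin_stream(n, rng):
    """Yield n i.i.d. uniform bits (hypothetical helper for illustration)."""
    for _ in range(n):
        yield rng.getrandbits(1)


def majority_by_counter(stream):
    """Return 1 if ones strictly outnumber zeros, else 0 (ties go to 0).

    The only state is a signed balance in [-n, n], so the memory
    footprint is O(log n) bits for a stream of length n.
    """
    balance = 0
    for bit in stream:
        balance += 1 if bit else -1
    return 1 if balance > 0 else 0


if __name__ == "__main__":
    rng = random.Random(0)
    print(majority_by_counter(coin_stream(1001, rng)))
```

The exact counter always answers correctly; the lower bound says that, up to constant factors, no constant-pass algorithm can use asymptotically less memory while still achieving constant advantage.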
In the needle problem, one either sees a stream of $n$ i.i.d. uniform samples from a domain $[t]$, or there is a randomly chosen needle $\alpha \in[t]$ for which each item independently is chosen to equal $\alpha$ with probability $p$, and is otherwise uniformly random in $[t]$. The problem of distinguishing these two cases is central to understanding the space complexity of the frequency moment estimation problem in random order streams. We show tight multi-pass space bounds for this problem for every $p < 1/\sqrt{n \log^3 n}$, resolving an open question of Lovett and Zhang (FOCS, 2023); even for a single pass, our bounds are new. To show optimality, we improve on both the lower and upper bounds of existing results.
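The two input distributions of the needle problem can be sketched as a pair of Python samplers. This is only an illustration of the problem definition stated above: the function names `uniform_stream` and `needle_stream` are assumptions, the needle is assumed to be drawn uniformly from $[t]$, and no distinguishing algorithm from the paper is implemented here.

```python
import random


def uniform_stream(n, t, rng):
    """Null case: n i.i.d. uniform samples from the domain [t] = {1, ..., t}."""
    return [rng.randrange(1, t + 1) for _ in range(n)]


def needle_stream(n, t, p, rng):
    """Planted case: pick a needle alpha in [t] (assumed uniform); each item
    independently equals alpha with probability p and is otherwise
    uniform in [t].

    Returns (alpha, stream) so the planted needle can be inspected.
    """
    alpha = rng.randrange(1, t + 1)
    stream = [
        alpha if rng.random() < p else rng.randrange(1, t + 1)
        for _ in range(n)
    ]
    return alpha, stream


if __name__ == "__main__":
    rng = random.Random(0)
    n, t, p = 1000, 10_000, 0.01
    alpha, planted = needle_stream(n, t, p, rng)
    print(alpha, planted.count(alpha))
```

A streaming algorithm for the needle problem receives one of these two streams and must decide which sampler produced it; the regime studied above is $p < 1/\sqrt{n \log^3 n}$.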
Our information complexity framework significantly extends the toolkit for proving multi-pass streaming lower bounds, and we give a wide range of additional streaming applications of our lower bound techniques, including multi-pass lower bounds for $\ell_p$-norm estimation, $\ell_p$-point query and heavy hitters, and compressed sensing problems.

Comments:	To appear in STOC 2024
Subjects:	Computational Complexity (cs.CC); Data Structures and Algorithms (cs.DS)
Cite as:	arXiv:2403.20283 [cs.CC]
	(or arXiv:2403.20283v1 [cs.CC] for this version)
	https://doi.org/10.48550/arXiv.2403.20283
