-
Unbounded Error Correcting Codes
Authors:
Klim Efremenko,
Or Zamir
Abstract:
Traditional error-correcting codes (ECCs) assume a fixed message length, but many scenarios involve ongoing or indefinite transmissions where the message length is not known in advance. For example, when streaming a video, the user should be able to fix a fraction of errors that occurred before any point in time. We introduce unbounded error-correcting codes (unbounded codes), a natural generalization of ECCs that supports arbitrarily long messages without a predetermined length. An unbounded code with rate $R$ and distance $\varepsilon$ ensures that for every sufficiently large $k$, the message prefix of length $Rk$ can be recovered from the code prefix of length $k$ even if an adversary corrupts up to an $\varepsilon$ fraction of the symbols in this code prefix.
We study unbounded codes over binary alphabets in the regime of small error fraction $\varepsilon$, establishing nearly tight upper and lower bounds on their optimal rate. Our main results show that: (1) The optimal rate of unbounded codes satisfies $R<1-\Omega(\sqrt{\varepsilon})$ and $R>1-O(\sqrt{\varepsilon \log \log(1/\varepsilon)})$. (2) Surprisingly, our construction is inherently non-linear, as we prove that linear unbounded codes achieve a strictly worse rate of $R=1-\Theta(\sqrt{\varepsilon \log(1/\varepsilon)})$. (3) In the setting of random noise, unbounded codes achieve the same optimal rate as standard ECCs, $R=1-\Theta(\varepsilon \log(1/\varepsilon))$.
These results demonstrate fundamental differences between standard and unbounded codes.
Submitted 8 April, 2025; v1 submitted 7 November, 2024;
originally announced November 2024.
-
Sumsets in the Hypercube
Authors:
Noga Alon,
Or Zamir
Abstract:
A subset $S$ of the Boolean hypercube $\mathbb{F}_2^n$ is a sumset if $S = A+A = \{a + b \ | \ a, b\in A\}$ for some $A \subseteq \mathbb{F}_2^n$. We prove that the number of sumsets in $\mathbb{F}_2^n$ is asymptotically $(2^n-1)2^{2^{n-1}}$. Furthermore, we show that the family of sumsets in $\mathbb{F}_2^n$ is almost identical to the family of all subsets of $\mathbb{F}_2^n$ that contain a complete linear subspace of co-dimension $1$.
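The sumset definition above can be illustrated with a minimal sketch. It assumes vectors of $\mathbb{F}_2^n$ are encoded as $n$-bit integers, so that addition is bitwise XOR; the function names `sumset` and `all_sumsets` are hypothetical and are not taken from the paper. The brute-force enumeration is only feasible for very small $n$.

```python
def sumset(A):
    """Return A + A over F_2^n; vectors are n-bit integers and addition is XOR."""
    return {a ^ b for a in A for b in A}


def all_sumsets(n):
    """Enumerate the sumsets of all nonempty A in F_2^n by brute force (tiny n only)."""
    size = 1 << n  # number of vectors in F_2^n
    found = set()
    # Each mask in [1, 2^size) selects one nonempty subset A of F_2^n.
    for mask in range(1, 1 << size):
        A = [v for v in range(size) if (mask >> v) & 1]
        found.add(frozenset(sumset(A)))
    return found


# A = {001, 010, 100} gives A + A = {000, 011, 101, 110}.
print(sorted(sumset({0b001, 0b010, 0b100})))  # [0, 3, 5, 6]
```

Since $a + a = 0$ for every $a$, the sumset of any nonempty $A$ contains the zero vector, which the example output reflects.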
Submitted 16 April, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
Testing Sumsets is Hard
Authors:
Xi Chen,
Shivam Nadimpalli,
Tim Randolph,
Rocco A. Servedio,
Or Zamir
Abstract:
A subset $S$ of the Boolean hypercube $\mathbb{F}_2^n$ is a sumset if $S = \{a + b : a, b\in A\}$ for some $A \subseteq \mathbb{F}_2^n$. Sumsets are central objects of study in additive combinatorics, featuring in several influential results. We prove a lower bound of $\Omega(2^{n/2})$ for the number of queries needed to test whether a Boolean function $f:\mathbb{F}_2^n \to \{0,1\}$ is the indicator function of a sumset. Our lower bound for testing sumsets follows from sharp bounds on the related problem of shift testing, which may be of independent interest. We also give a near-optimal $2^{n/2} \cdot \mathrm{poly}(n)$-query algorithm for a smoothed analysis formulation of the sumset refutation problem.
Submitted 4 February, 2024; v1 submitted 14 January, 2024;
originally announced January 2024.
-
Essentially tight bounds for rainbow cycles in proper edge-colourings
Authors:
Noga Alon,
Matija Bucić,
Lisa Sauermann,
Dmitrii Zakharov,
Or Zamir
Abstract:
An edge-coloured graph is said to be rainbow if no colour appears more than once. Extremal problems involving rainbow objects have been a focus of much research over the last decade as they capture the essence of a number of interesting problems in a variety of areas. A particularly intensively studied question due to Keevash, Mubayi, Sudakov and Verstraëte from 2007 asks for the maximum possible average degree of a properly edge-coloured graph on $n$ vertices without a rainbow cycle. Improving upon a series of earlier bounds, Tomon proved an upper bound of $(\log n)^{2+o(1)}$ for this question. Very recently, Janzer-Sudakov and Kim-Lee-Liu-Tran independently removed the $o(1)$ term in Tomon's bound, showing a bound of $O(\log^2 n)$. We prove an upper bound of $(\log n)^{1+o(1)}$ for this maximum possible average degree when there is no rainbow cycle. Our result is tight up to the $o(1)$ term, and so it essentially resolves this question. In addition, we observe a connection between this problem and several questions in additive number theory, allowing us to extend existing results on these questions for abelian groups to the case of non-abelian groups.
Submitted 26 February, 2025; v1 submitted 8 September, 2023;
originally announced September 2023.
-
Algorithmic Applications of Hypergraph and Partition Containers
Authors:
Or Zamir
Abstract:
We present a general method to convert algorithms into faster algorithms for almost-regular input instances. Informally, an almost-regular input is an input in which the maximum degree is larger than the average degree by at most a constant factor. This family of inputs vastly generalizes several families of inputs for which we commonly have improved algorithms, including bounded-degree inputs and random inputs. It also generalizes families of inputs for which we don't usually have faster algorithms, including regular inputs of arbitrarily high degree and very dense inputs. We apply our method to achieve breakthroughs in exact algorithms for several central NP-complete problems including $k$-SAT, Graph Coloring, and Maximum Independent Set.
Our main tool is the first algorithmic application of the relatively new Hypergraph Container Method (Saxton and Thomason 2015, Balogh, Morris and Samotij 2015). This recent breakthrough, which generalizes an earlier version for graphs (Kleitman and Winston 1982, Sapozhenko 2001), has been used extensively in recent years in extremal combinatorics. An important component of our work is the generalization of (hyper-)graph containers to Partition Containers.
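The informal notion of an almost-regular input stated in this abstract can be sketched for the special case of simple graphs. This is an illustrative assumption: the paper's definition covers more general inputs, and the function name `is_almost_regular`, the adjacency-dictionary representation, and the parameter `c` (the constant factor) are all hypothetical.

```python
def is_almost_regular(adj, c):
    """Check max degree <= c * average degree for a graph given as {vertex: set of neighbours}."""
    degrees = [len(neighbours) for neighbours in adj.values()]
    if not degrees:
        return True  # the empty graph is treated as trivially almost-regular
    average = sum(degrees) / len(degrees)
    return max(degrees) <= c * average


# A 5-cycle is regular, so it passes with c = 1.
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
# A star with 10 leaves has maximum degree 10 but average degree 20/11.
star = {0: set(range(1, 11)), **{i: {0} for i in range(1, 11)}}

print(is_almost_regular(cycle, 1.0))  # True
print(is_almost_regular(star, 2.0))   # False
```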
Submitted 21 November, 2022;
originally announced November 2022.
-
Randomized Dimensionality Reduction for Facility Location and Single-Linkage Clustering
Authors:
Shyam Narayanan,
Sandeep Silwal,
Piotr Indyk,
Or Zamir
Abstract:
Random dimensionality reduction is a versatile tool for speeding up algorithms for high-dimensional problems. We study its application to two clustering problems: the facility location problem, and the single-linkage hierarchical clustering problem, which is equivalent to computing the minimum spanning tree. We show that if we project the input pointset $X$ onto a random $d = O(d_X)$-dimensional subspace (where $d_X$ is the doubling dimension of $X$), then the optimum facility location cost in the projected space approximates the original cost up to a constant factor. We show an analogous statement for minimum spanning tree, but with the dimension $d$ having an extra $\log \log n$ term and the approximation factor being arbitrarily close to $1$. Furthermore, we extend these results to approximating solutions instead of just their costs. Lastly, we provide experimental results to validate the quality of solutions and the speedup due to the dimensionality reduction. Unlike several previous papers studying this approach in the context of $k$-means and $k$-medians, our dimension bound does not depend on the number of clusters but only on the intrinsic dimensionality of $X$.
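As a rough illustration of random dimensionality reduction, the sketch below maps points to $d$ dimensions with a scaled Gaussian random matrix. This is a common stand-in for projecting onto a random subspace and is not claimed to be the exact projection analysed in the paper; the function name `random_projection` and the `seed` parameter are hypothetical, and the sketch says nothing about how $d$ should be chosen relative to the doubling dimension.

```python
import math
import random


def random_projection(points, d, seed=0):
    """Map each point (a list of floats) to d dimensions via a Gaussian random matrix."""
    rng = random.Random(seed)
    m = len(points[0])  # original dimension
    # d x m matrix with i.i.d. N(0, 1/d) entries, so squared norms are preserved in expectation.
    G = [[rng.gauss(0.0, 1.0) / math.sqrt(d) for _ in range(m)] for _ in range(d)]
    return [[sum(g * x for g, x in zip(row, p)) for row in G] for p in points]


points = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
projected = random_projection(points, d=2)
print(len(projected), len(projected[0]))  # 3 2
```

A downstream clustering routine (for example, a minimum spanning tree computation) would then run on `projected` instead of `points`.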
Submitted 5 July, 2021;
originally announced July 2021.