-
Adaptive Robustness of Hypergrid Johnson-Lindenstrauss
Authors:
Andrej Bogdanov,
Alon Rosen,
Neekon Vafa,
Vinod Vaikuntanathan
Abstract:
Johnson and Lindenstrauss (Contemporary Mathematics, 1984) showed that for $n > m$, a scaled random projection $\mathbf{A}$ from $\mathbb{R}^n$ to $\mathbb{R}^m$ is an approximate isometry on any set $S$ of size at most exponential in $m$. If $S$ is larger, however, its points can contract arbitrarily under $\mathbf{A}$. In particular, the hypergrid $([-B, B] \cap \mathbb{Z})^n$ is expected to contain a point that is contracted by a factor of $κ_{\mathsf{stat}} = Θ(B)^{-1/α}$, where $α= m/n$.
We give evidence that finding such a point exhibits a statistical-computational gap precisely up to $κ_{\mathsf{comp}} = \widetilde{Θ}(\sqrt{α}/B)$. On the algorithmic side, we design an online algorithm achieving $κ_{\mathsf{comp}}$, inspired by a discrepancy minimization algorithm of Bansal and Spencer (Random Structures & Algorithms, 2020). On the hardness side, we show evidence via a multiple overlap gap property (mOGP), which in particular captures online algorithms; and a reduction-based lower bound, which shows hardness under standard worst-case lattice assumptions.
As a cryptographic application, we show that the rounded Johnson-Lindenstrauss embedding is a robust property-preserving hash function (Boyle, Lavigne and Vaikuntanathan, TCC 2019) on the hypergrid for the Euclidean metric in the computationally hard regime. Such hash functions compress data while preserving $\ell_2$ distances between inputs up to some distortion factor, with the guarantee that even knowing the hash function, no computationally bounded adversary can find any pair of points that violates the distortion bound.
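The contraction factor discussed in this abstract can be illustrated numerically. The sketch below is not from the paper: it assumes a Gaussian matrix with i.i.d. $N(0, 1/m)$ entries as the scaled random projection, and the names `random_projection` and `contraction` and the parameters are hypothetical. It only shows that typical hypergrid points are nearly preserved; it does not search for the strongly contracted points the paper studies.

```python
import math
import random

def random_projection(m, n, rng):
    """Return an m x n matrix with i.i.d. N(0, 1/m) entries.

    Assumption: a Gaussian matrix stands in for the scaled random projection.
    """
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(m)]

def contraction(A, x):
    """Return ||A x||_2 / ||x||_2 for a nonzero vector x."""
    Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return math.sqrt(sum(v * v for v in Ax)) / math.sqrt(sum(v * v for v in x))

rng = random.Random(0)
n, m, B = 200, 100, 3          # illustrative sizes, alpha = m/n = 0.5
A = random_projection(m, n, rng)

# Sample random points of the hypergrid ([-B, B] ∩ Z)^n and record their ratios.
ratios = []
for _ in range(20):
    x = [rng.randint(-B, B) for _ in range(n)]
    if any(x):                 # skip the all-zero point
        ratios.append(contraction(A, x))

print(min(ratios), max(ratios))
```

Randomly sampled points give ratios close to 1; the points contracted by $κ_{\mathsf{stat}}$ are rare and must be searched for, which is where the computational question arises.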
Submitted 12 April, 2025;
originally announced April 2025.
-
Stochastic modeling of in vitro bactericidal potency
Authors:
Anita Bogdanov,
Péter Kevei,
Máté Szalai,
Dezső Virok
Abstract:
We provide a Galton--Watson model for the growth of a bacterial population in the presence of antibiotics. We assume that bacterial cells either die or duplicate, and the corresponding probabilities depend on the concentration of the antibiotic. Assuming that the mean offspring number is given by $m(c) = 2 / (1 + αc^β)$ for some $α, β$, where $c$ stands for the antibiotic concentration, we obtain a weakly consistent, asymptotically normal estimator for both $(α, β)$ and the minimal inhibitory concentration (MIC), a relevant parameter in pharmacology. We apply our method to real data, in which \emph{Chlamydia trachomatis} bacteria were treated with azithromycin and ciprofloxacin. \emph{Chlamydia} growth was measured using quantitative PCR. The 2-parameter model fits the biological data remarkably well.
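A minimal simulation sketch of the die-or-duplicate model, not taken from the paper. Since each cell has 0 or 2 offspring, a mean of $m(c)$ corresponds to a duplication probability of $m(c)/2 = 1/(1 + αc^β)$. The function names are hypothetical, and reading the concentration at which $m(c) = 1$ as the MIC is an assumption here; the abstract does not state the definition.

```python
import random

def mean_offspring(c, alpha, beta):
    """Mean offspring number m(c) = 2 / (1 + alpha * c**beta) from the abstract."""
    return 2.0 / (1.0 + alpha * c ** beta)

def critical_concentration(alpha, beta):
    """Concentration where m(c) = 1, i.e. alpha * c**beta = 1.

    Assumption: this critical point is what one might take as the MIC.
    """
    return alpha ** (-1.0 / beta)

def simulate(c, alpha, beta, generations, z0, rng):
    """Simulate the population size after some generations.

    Each cell independently duplicates with probability m(c)/2, else dies.
    """
    p = mean_offspring(c, alpha, beta) / 2.0
    z = z0
    for _ in range(generations):
        z = 2 * sum(1 for _ in range(z) if rng.random() < p)
        if z == 0:             # extinction is absorbing
            break
    return z

rng = random.Random(1)
alpha, beta = 0.5, 2.0         # illustrative parameter values
c_star = critical_concentration(alpha, beta)
print(c_star, mean_offspring(c_star, alpha, beta))
print(simulate(0.0, alpha, beta, generations=5, z0=10, rng=rng))
```

With no antibiotic ($c = 0$) every cell duplicates, so the population doubles each generation; above the critical concentration the process is subcritical and dies out with probability 1.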
Submitted 23 April, 2021;
originally announced April 2021.
-
Learning and Testing Variable Partitions
Authors:
Andrej Bogdanov,
Baoxiang Wang
Abstract:
Let $F$ be a multivariate function from a product set $Σ^n$ to an Abelian group $G$. A $k$-partition of $F$ with cost $δ$ is a partition of the set of variables $\mathbf{V}$ into $k$ non-empty subsets $(\mathbf{X}_1, \dots, \mathbf{X}_k)$ such that $F(\mathbf{V})$ is $δ$-close to $F_1(\mathbf{X}_1)+\dots+F_k(\mathbf{X}_k)$ for some $F_1, \dots, F_k$ with respect to a given error metric. We study algorithms for agnostically learning $k$-partitions and testing $k$-partitionability over various groups and error metrics given query access to $F$. In particular, we show that:
$1.$ Given a function that has a $k$-partition of cost $δ$, a partition of cost $\mathcal{O}(k n^2)(δ+ ε)$ can be learned in time $\tilde{\mathcal{O}}(n^2 \mathrm{poly} (1/ε))$ for any $ε> 0$. In contrast, for $k = 2$ and $n = 3$ learning a partition of cost $δ+ ε$ is NP-hard.
$2.$ When $F$ is real-valued and the error metric is the 2-norm, a 2-partition of cost $\sqrt{δ^2 + ε}$ can be learned in time $\tilde{\mathcal{O}}(n^5/ε^2)$.
$3.$ When $F$ is $\mathbb{Z}_q$-valued and the error metric is Hamming weight, $k$-partitionability is testable with one-sided error and $\mathcal{O}(kn^3/ε)$ non-adaptive queries. We also show that even two-sided testers require $Ω(n)$ queries when $k = 2$.
This work was motivated by reinforcement learning control tasks in which the set of control variables can be partitioned. The partitioning reduces the task into multiple lower-dimensional ones that are relatively easier to learn. Our second algorithm empirically increases the scores attained over previous heuristic partitioning methods applied in this context.
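To make the definition of a partition concrete, here is a brute-force sketch for the exact case (cost 0, $k = 2$, values in $\mathbb{Z}_q$). It is not one of the paper's algorithms, which are query-efficient and handle nonzero cost; the function name and the example $F$ are hypothetical. It relies on the fact that $F$ splits as $F_1(\mathbf{X}_1) + F_2(\mathbf{X}_2)$ exactly when $F(x_1, x_2) + F(r_1, r_2) = F(x_1, r_2) + F(r_1, x_2)$ for a fixed reference point $r$ and all $x$.

```python
from itertools import product

def is_exact_2_partition(F, n, sigma, q, block):
    """Exhaustively check whether F(x) = F1(x on block) + F2(x off block) mod q.

    F     : function taking an n-tuple over sigma and returning an integer
    block : indices of the variables in the first part X_1
    Uses the reference-point identity with r = (sigma[0], ..., sigma[0]).
    """
    block = set(block)
    ref = sigma[0]
    r = (ref,) * n
    for x in product(sigma, repeat=n):
        # Keep the block coordinates of x, reset the others to the reference.
        x_on = tuple(x[i] if i in block else ref for i in range(n))
        # Keep the non-block coordinates of x, reset the block to the reference.
        x_off = tuple(ref if i in block else x[i] for i in range(n))
        if (F(x) + F(r) - F(x_on) - F(x_off)) % q != 0:
            return False
    return True

q = 5
sigma = list(range(q))

def F(x):
    # Hypothetical example: x0 and x1 interact, x2 is additive.
    return (x[0] * x[1] + x[2]) % q

print(is_exact_2_partition(F, 3, sigma, q, block=[0, 1]))
print(is_exact_2_partition(F, 3, sigma, q, block=[0]))
```

The check visits all $|Σ|^n$ inputs, so it is only feasible for tiny instances; the bounds quoted in the abstract concern far fewer queries.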
Submitted 29 March, 2020;
originally announced March 2020.
-
The Computational Complexity of Estimating Convergence Time
Authors:
Nayantara Bhatnagar,
Andrej Bogdanov,
Elchanan Mossel
Abstract:
An important problem in the implementation of Markov Chain Monte Carlo algorithms is to determine the convergence time, or the number of iterations before the chain is close to stationarity. For many Markov chains used in practice, this time is not known. Even in cases where the convergence time is known to be polynomial, the theoretical bounds are often too crude to be practical. Thus, practitioners like to carry out some form of statistical analysis in order to assess convergence. This has led to the development of a number of methods, known as convergence diagnostics, which attempt to diagnose whether the Markov chain is far from stationarity. We study the problem of testing convergence in three settings and prove that it is computationally hard in each. First, given a Markov chain that mixes rapidly, it is SZK-hard (hard for Statistical Zero Knowledge) to distinguish whether, starting from a given state, the chain is close to stationarity by time t or far from stationarity at time ct for a constant c; we show that this problem is in AM ∩ coAM. Second, given a Markov chain that mixes rapidly, it is coNP-hard to distinguish whether it is close to stationarity by time t or far from stationarity at time ct for a constant c; this problem is in coAM. Finally, it is PSPACE-complete to distinguish whether the Markov chain is close to stationarity by time t or far from being mixed at time ct for c at least 1.
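The quantity behind "close to stationarity by time t" can be computed directly when the transition matrix is small and explicit. The sketch below is not from the paper and says nothing about the hardness results; it assumes total variation distance as the measure of closeness, and the function name and the two-state chain are hypothetical.

```python
def tv_from_stationarity(P, pi, start, t):
    """Total variation distance between the t-step distribution and pi.

    P     : row-stochastic transition matrix as a list of lists
    pi    : stationary distribution of P (supplied by the caller)
    start : index of the starting state
    """
    k = len(P)
    dist = [0.0] * k
    dist[start] = 1.0
    for _ in range(t):
        # One step of the chain: dist <- dist * P (row vector times matrix).
        dist = [sum(dist[i] * P[i][j] for i in range(k)) for j in range(k)]
    return 0.5 * sum(abs(d - p) for d, p in zip(dist, pi))

# Hypothetical two-state chain with flip probabilities a = 0.1 and b = 0.2.
a, b = 0.1, 0.2
P = [[1 - a, a],
     [b, 1 - b]]
pi = [b / (a + b), a / (a + b)]   # stationary distribution (2/3, 1/3)

for t in (0, 1, 10):
    print(t, tv_from_stationarity(P, pi, start=0, t=t))
```

For this chain the distance from state 0 decays as $(1/3)(1 - a - b)^t$, so it can be checked by hand. Exact computation like this needs the whole matrix, which is the situation convergence diagnostics try to avoid for large state spaces.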
Submitted 1 July, 2010;
originally announced July 2010.