
Showing 1–10 of 10 results for author: Buhai, R

  1. arXiv:2505.17360  [pdf, ps, other]

    cs.CC cs.DS

    The Quasi-Polynomial Low-Degree Conjecture is False

    Authors: Rares-Darius Buhai, Jun-Ting Hsieh, Aayush Jain, Pravesh K. Kothari

    Abstract: There is a growing body of work on proving hardness results for average-case estimation problems by bounding the low-degree advantage (LDA) - a quantitative estimate of the closeness of low-degree moments - between a null distribution and a related planted distribution. Such hardness results are now ubiquitous not only for foundational average-case problems but also central questions in statistics…

    Submitted 22 May, 2025; originally announced May 2025.

  2. arXiv:2505.11093  [pdf, ps, other]

    math.ST cs.DS stat.ML

    Lasso and Partially-Rotated Designs

    Authors: Rares-Darius Buhai

    Abstract: We consider the sparse linear regression model $\mathbf{y} = X \beta + \mathbf{w}$, where $X \in \mathbb{R}^{n \times d}$ is the design, $\beta \in \mathbb{R}^{d}$ is a $k$-sparse secret, and $\mathbf{w} \sim N(0, I_n)$ is the noise. Given input $X$ and $\mathbf{y}$, the goal is to estimate $\beta$. In this setting, the Lasso estimate achieves prediction error $O(k \log d / \gamma n)$, where $\gamma$ is the restricted eige…

    Submitted 16 May, 2025; originally announced May 2025.

    Comments: 19 pages

  3. arXiv:2411.12438  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Dimension Reduction via Sum-of-Squares and Improved Clustering Algorithms for Non-Spherical Mixtures

    Authors: Prashanti Anderson, Mitali Bafna, Rares-Darius Buhai, Pravesh K. Kothari, David Steurer

    Abstract: We develop a new approach for clustering non-spherical (i.e., arbitrary component covariances) Gaussian mixture models via a subroutine, based on the sum-of-squares method, that finds a low-dimensional separation-preserving projection of the input data. Our method gives a non-spherical analog of the classical dimension reduction, based on singular value decomposition, that forms a key component of…

    Submitted 19 November, 2024; originally announced November 2024.

    Comments: 64 pages

  4. arXiv:2407.15792  [pdf, other]

    cs.LG cs.DS stat.ML

    Robust Mixture Learning when Outliers Overwhelm Small Groups

    Authors: Daniil Dmitriev, Rares-Darius Buhai, Stefan Tiegel, Alexander Wolters, Gleb Novikov, Amartya Sanyal, David Steurer, Fanny Yang

    Abstract: We study the problem of estimating the means of well-separated mixtures when an adversary may add arbitrary outliers. While strong guarantees are available when the outlier fraction is significantly smaller than the minimum mixing weight, much less is known when outliers may crowd out low-weight clusters - a setting we refer to as list-decodable mixture learning (LD-ML). In this case, adversarial…

    Submitted 22 July, 2024; originally announced July 2024.

  5. arXiv:2404.14159  [pdf, ps, other]

    cs.DS

    Semirandom Planted Clique and the Restricted Isometry Property

    Authors: Jarosław Błasiok, Rares-Darius Buhai, Pravesh K. Kothari, David Steurer

    Abstract: We give a simple, greedy $O(n^{\omega+0.5})=O(n^{2.872})$-time algorithm to list-decode planted cliques in a semirandom model introduced in [CSV17] (following [FK01]) that succeeds whenever the size of the planted clique is $k\geq O(\sqrt{n} \log^2 n)$. In the model, the edges touching the vertices in the planted $k$-clique are drawn independently with probability $p=1/2$ while the edges not touching t…

    Submitted 9 October, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 22 pages, to appear at FOCS 2024

  6. arXiv:2402.14103  [pdf, ps, other]

    cs.LG cs.CC math.ST stat.ML

    Computational-Statistical Gaps for Improper Learning in Sparse Linear Regression

    Authors: Rares-Darius Buhai, Jingqiu Ding, Stefan Tiegel

    Abstract: We study computational-statistical gaps for improper learning in sparse linear regression. More specifically, given $n$ samples from a $k$-sparse linear model in dimension $d$, we ask what is the minimum sample complexity to efficiently (in time polynomial in $d$, $k$, and $n$) find a potentially dense estimate for the regression vector that achieves non-trivial prediction error on the $n$ samples…

    Submitted 25 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: 24 pages; updated typos, some explanations, and references

  7. arXiv:2212.05619  [pdf, ps, other]

    cs.DS

    Algorithms approaching the threshold for semi-random planted clique

    Authors: Rares-Darius Buhai, Pravesh K. Kothari, David Steurer

    Abstract: We design new polynomial-time algorithms for recovering planted cliques in the semi-random graph model introduced by Feige and Kilian 2001. The previous best algorithms for this model succeed if the planted clique has size at least $n^{2/3}$ in a graph with $n$ vertices (Mehta, Mckenzie, Trevisan 2019 and Charikar, Steinhardt, Valiant 2017). Our algorithms work for planted-clique sizes approaching…

    Submitted 6 June, 2023; v1 submitted 11 December, 2022; originally announced December 2022.

    Comments: 51 pages, the arxiv landing page contains a shortened abstract

    ACM Class: F.2

  8. arXiv:2112.05445  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Beyond Parallel Pancakes: Quasi-Polynomial Time Guarantees for Non-Spherical Gaussian Mixtures

    Authors: Rares-Darius Buhai, David Steurer

    Abstract: We consider mixtures of $k\geq 2$ Gaussian components with unknown means and unknown covariance (identical for all components) that are well-separated, i.e., distinct components have statistical overlap at most $k^{-C}$ for a large enough constant $C\ge 1$. Previous statistical-query [DKS17] and lattice-based [BRST21, GVV22] lower bounds give formal evidence that even distinguishing such mixtures…

    Submitted 7 June, 2023; v1 submitted 10 December, 2021; originally announced December 2021.

    Comments: 67 pages, the arxiv landing page contains a shortened abstract

  9. arXiv:2006.04166  [pdf, other]

    cs.LG cs.DS cs.IT stat.ML

    Learning Restricted Boltzmann Machines with Sparse Latent Variables

    Authors: Guy Bresler, Rares-Darius Buhai

    Abstract: Restricted Boltzmann Machines (RBMs) are a common family of undirected graphical models with latent variables. An RBM is described by a bipartite graph, with all observed variables in one layer and all latent variables in the other. We consider the task of learning an RBM given samples generated according to it. The best algorithms for this task currently have time complexity $\tilde{O}(n^2)$ for…

    Submitted 17 October, 2020; v1 submitted 7 June, 2020; originally announced June 2020.

    Comments: 33 pages, to appear at NeurIPS 2020

  10. arXiv:1907.00030  [pdf, other]

    stat.ML cs.LG

    Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models

    Authors: Rares-Darius Buhai, Yoni Halpern, Yoon Kim, Andrej Risteski, David Sontag

    Abstract: One of the most surprising and exciting discoveries in supervised learning was the benefit of overparameterization (i.e. training a very large model) to improving the optimization landscape of a problem, with minimal effect on statistical performance (i.e. generalization). In contrast, unsupervised settings have been under-explored, despite the fact that it was observed that overparameterization c…

    Submitted 16 July, 2020; v1 submitted 28 June, 2019; originally announced July 2019.

    Comments: 22 pages, to appear at ICML 2020