Showing 1–17 of 17 results for author: Tiegel, S

Searching in archive cs.
  1. arXiv:2503.24321

    cs.DS cs.IT cs.LG stat.ML

    Sample-Optimal Private Regression in Polynomial Time

    Authors: Prashanti Anderson, Ainesh Bakshi, Mahbod Majid, Stefan Tiegel

    Abstract: We consider the task of privately obtaining prediction error guarantees in ordinary least-squares regression problems with Gaussian covariates (with unknown covariance structure). We provide the first sample-optimal polynomial time algorithm for this task under both pure and approximate differential privacy. We show that any improvement to the sample complexity of our algorithm would violate eithe…

    Submitted 31 March, 2025; originally announced March 2025.

  2. arXiv:2503.03923

    cs.DS stat.ML

    Improved Robust Estimation for Erdős-Rényi Graphs: The Sparse Regime and Optimal Breakdown Point

    Authors: Hongjie Chen, Jingqiu Ding, Yiding Hua, Stefan Tiegel

    Abstract: We study the problem of robustly estimating the edge density of Erdős-Rényi random graphs $G(n, d^\circ/n)$ when an adversary can arbitrarily add or remove edges incident to an $η$-fraction of the nodes. We develop the first polynomial-time algorithm for this problem that estimates $d^\circ$ up to an additive error $O([\sqrt{\log(n) / n} + η\sqrt{\log(1/η)} ] \cdot \sqrt{d^\circ} + η\log(1/η))$. O…

    Submitted 5 March, 2025; originally announced March 2025.

  3. arXiv:2412.21203

    cs.DS cs.LG

    SoS Certificates for Sparse Singular Values and Their Applications: Robust Statistics, Subspace Distortion, and More

    Authors: Ilias Diakonikolas, Samuel B. Hopkins, Ankit Pensia, Stefan Tiegel

    Abstract: We study $\textit{sparse singular value certificates}$ for random rectangular matrices. If $M$ is an $n \times d$ matrix with independent Gaussian entries, we give a new family of polynomial-time algorithms which can certify upper bounds on the maximum of $\|M u\|$, where $u$ is a unit vector with at most $ηn$ nonzero entries for a given $η\in (0,1)$. This basic algorithmic primitive lies at the h…

    Submitted 30 December, 2024; originally announced December 2024.

  4. arXiv:2411.12512

    cs.CC cs.CR cs.DM math.ST

    Near-Optimal Time-Sparsity Trade-Offs for Solving Noisy Linear Equations

    Authors: Kiril Bangachev, Guy Bresler, Stefan Tiegel, Vinod Vaikuntanathan

    Abstract: We present a polynomial-time reduction from solving noisy linear equations over $\mathbb{Z}/q\mathbb{Z}$ in dimension $Θ(k\log n/\mathsf{poly}(\log k,\log q,\log\log n))$ with a uniformly random coefficient matrix to noisy linear equations over $\mathbb{Z}/q\mathbb{Z}$ in dimension $n$ where each row of the coefficient matrix has uniformly random support of size $k$. This allows us to deduce the h…

    Submitted 19 November, 2024; originally announced November 2024.

    Comments: Abstract shortened to match arXiv requirements

  5. arXiv:2410.21194

    cs.DS cs.LG math.ST stat.ML

    SoS Certifiability of Subgaussian Distributions and its Algorithmic Applications

    Authors: Ilias Diakonikolas, Samuel B. Hopkins, Ankit Pensia, Stefan Tiegel

    Abstract: We prove that there is a universal constant $C>0$ so that for every $d \in \mathbb N$, every centered subgaussian distribution $\mathcal D$ on $\mathbb R^d$, and every even $p \in \mathbb N$, the $d$-variate polynomial $(Cp)^{p/2} \cdot \|v\|_{2}^p - \mathbb E_{X \sim \mathcal D} \langle v,X\rangle^p$ is a sum of square polynomials. This establishes that every subgaussian distribution is \emph{SoS…

    Submitted 28 October, 2024; originally announced October 2024.

  6. arXiv:2407.15792

    cs.LG cs.DS stat.ML

    Robust Mixture Learning when Outliers Overwhelm Small Groups

    Authors: Daniil Dmitriev, Rares-Darius Buhai, Stefan Tiegel, Alexander Wolters, Gleb Novikov, Amartya Sanyal, David Steurer, Fanny Yang

    Abstract: We study the problem of estimating the means of well-separated mixtures when an adversary may add arbitrary outliers. While strong guarantees are available when the outlier fraction is significantly smaller than the minimum mixing weight, much less is known when outliers may crowd out low-weight clusters - a setting we refer to as list-decodable mixture learning (LD-ML). In this case, adversarial…

    Submitted 22 July, 2024; originally announced July 2024.

  7. arXiv:2406.06106

    cs.LG cs.DS

    Testably Learning Polynomial Threshold Functions

    Authors: Lucas Slot, Stefan Tiegel, Manuel Wiedmer

    Abstract: Rubinfeld & Vasilyan recently introduced the framework of testable learning as an extension of the classical agnostic model. It relaxes distributional assumptions which are difficult to verify by conditions that can be checked efficiently by a tester. The tester has to accept whenever the data truly satisfies the original assumptions, and the learner has to succeed whenever the tester accepts. We…

    Submitted 6 November, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted to NeurIPS 2024. v2: Minor updates of exposition. 53 pages

  8. arXiv:2402.15995

    cs.CC cs.LG math.ST stat.ML

    Improved Hardness Results for Learning Intersections of Halfspaces

    Authors: Stefan Tiegel

    Abstract: We show strong (and surprisingly simple) lower bounds for weakly learning intersections of halfspaces in the improper setting. Strikingly little is known about this problem. For instance, it is not even known if there is a polynomial-time algorithm for learning the intersection of only two halfspaces. On the other hand, lower bounds based on well-established assumptions (such as approximating wors…

    Submitted 25 February, 2024; originally announced February 2024.

  9. arXiv:2402.14103

    cs.LG cs.CC math.ST stat.ML

    Computational-Statistical Gaps for Improper Learning in Sparse Linear Regression

    Authors: Rares-Darius Buhai, Jingqiu Ding, Stefan Tiegel

    Abstract: We study computational-statistical gaps for improper learning in sparse linear regression. More specifically, given $n$ samples from a $k$-sparse linear model in dimension $d$, we ask what is the minimum sample complexity to efficiently (in time polynomial in $d$, $k$, and $n$) find a potentially dense estimate for the regression vector that achieves non-trivial prediction error on the $n$ samples…

    Submitted 25 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: 24 pages; updated typos, some explanations, and references

  10. arXiv:2302.10844

    cs.DS cs.LG stat.ML

    Robust Mean Estimation Without Moments for Symmetric Distributions

    Authors: Gleb Novikov, David Steurer, Stefan Tiegel

    Abstract: We study the problem of robustly estimating the mean or location parameter without moment assumptions. We show that for a large class of symmetric distributions, the same error as in the Gaussian setting can be achieved efficiently. The distributions we study include products of arbitrary symmetric one-dimensional distributions, such as product Cauchy distributions, as well as elliptical distribut…

    Submitted 8 November, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted at NeurIPS 2023

  11. arXiv:2301.04822

    cs.DS cs.CR cs.LG stat.ML

    Private estimation algorithms for stochastic block models and mixture models

    Authors: Hongjie Chen, Vincent Cohen-Addad, Tommaso d'Orsi, Alessandro Epasto, Jacob Imola, David Steurer, Stefan Tiegel

    Abstract: We introduce general tools for designing efficient private estimation algorithms, in the high-dimensional settings, whose statistical guarantees almost match those of the best known non-private algorithms. To illustrate our techniques, we consider two problems: recovery of stochastic block models and learning mixtures of spherical Gaussians. For the former, we present the first efficient $(ε, δ)$-…

    Submitted 15 November, 2023; v1 submitted 11 January, 2023; originally announced January 2023.

  12. arXiv:2207.14030

    cs.LG cs.CC math.ST stat.ML

    Hardness of Agnostically Learning Halfspaces from Worst-Case Lattice Problems

    Authors: Stefan Tiegel

    Abstract: We show hardness of improperly learning halfspaces in the agnostic model, both in the distribution-independent as well as the distribution-specific setting, based on the assumption that worst-case lattice problems, such as GapSVP or SIVP, are hard. In particular, we show that under this assumption there is no efficient algorithm that outputs any binary hypothesis, not necessarily a halfspace, achi…

    Submitted 20 February, 2023; v1 submitted 28 July, 2022; originally announced July 2022.

  13. arXiv:2202.06442

    cs.LG cs.DS

    Fast algorithm for overcomplete order-3 tensor decomposition

    Authors: Jingqiu Ding, Tommaso d'Orsi, Chih-Hung Liu, Stefan Tiegel, David Steurer

    Abstract: We develop the first fast spectral algorithm to decompose a random third-order tensor over $\mathbb{R}^d$ of rank up to $O(d^{3/2}/\text{polylog}(d))$. Our algorithm only involves simple linear algebra operations and can recover all components in time $O(d^{6.05})$ under the current matrix multiplication time. Prior to this work, comparable guarantees could only be achieved via sum-of-squares [M…

    Submitted 28 June, 2022; v1 submitted 13 February, 2022; originally announced February 2022.

    Comments: 59 pages, accepted by COLT 2022

  14. arXiv:2201.09818

    cs.LG cs.CC math.ST stat.ML

    Optimal SQ Lower Bounds for Learning Halfspaces with Massart Noise

    Authors: Rajai Nasser, Stefan Tiegel

    Abstract: We give tight statistical query (SQ) lower bounds for learning halfspaces in the presence of Massart noise. In particular, suppose that all labels are corrupted with probability at most $η$. We show that for arbitrary $η\in [0,1/2]$ every SQ algorithm achieving misclassification error better than $η$ requires queries of superpolynomial accuracy or at least a superpolynomial number of queries. Fu…

    Submitted 24 January, 2022; originally announced January 2022.

  15. arXiv:2111.02966

    cs.LG stat.ML

    Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers

    Authors: Tommaso d'Orsi, Chih-Hung Liu, Rajai Nasser, Gleb Novikov, David Steurer, Stefan Tiegel

    Abstract: We develop machinery to design efficiently computable and consistent estimators, achieving estimation error approaching zero as the number of observations grows, when facing an oblivious adversary that may corrupt responses in all but an $α$ fraction of the samples. As concrete examples, we investigate two problems: sparse regression and principal component analysis (PCA). For sparse regression, w…

    Submitted 4 November, 2021; originally announced November 2021.

    Comments: To appear in NeurIPS 2021

  16. arXiv:2101.01509

    cs.DS cs.LG stat.ML

    SoS Degree Reduction with Applications to Clustering and Robust Moment Estimation

    Authors: David Steurer, Stefan Tiegel

    Abstract: We develop a general framework to significantly reduce the degree of sum-of-squares proofs by introducing new variables. To illustrate the power of this framework, we use it to speed up previous algorithms based on sum-of-squares for two important estimation problems, clustering and robust moment estimation. The resulting algorithms offer the same statistical guarantees as the previous best algori…

    Submitted 5 January, 2021; originally announced January 2021.

    Comments: 32 pages

  17. arXiv:1804.02075

    cs.DS cs.IT cs.LG

    A Framework for Searching in Graphs in the Presence of Errors

    Authors: Dariusz Dereniowski, Stefan Tiegel, Przemysław Uznański, Daniel Wolleb-Graf

    Abstract: We consider the problem of searching for an unknown target vertex $t$ in a (possibly edge-weighted) graph. Each \emph{vertex-query} points to a vertex $v$ and the response either admits that $v$ is the target or provides any neighbor $s \neq v$ that lies on a shortest path from $v$ to $t$. This model has been introduced for trees by Onak and Parys [FOCS 2006] and for general graphs by Emamjomeh-Zadeh et…

    Submitted 5 March, 2020; v1 submitted 5 April, 2018; originally announced April 2018.

    Journal ref: SOSA@SODA 2019: 4:1-4:17