Showing 1–50 of 105 results for author: Diakonikolas, I

  1. arXiv:2504.15251  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    On Learning Parallel Pancakes with Mostly Uniform Weights

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Jasper C. H. Lee, Thanasis Pittas

    Abstract: We study the complexity of learning $k$-mixtures of Gaussians ($k$-GMMs) on $\mathbb{R}^d$. This task is known to have complexity $d^{Ω(k)}$ in full generality. To circumvent this exponential lower bound on the number of components, research has focused on learning families of GMMs satisfying additional structural properties. A natural assumption posits that the component weights are not exponenti…

    Submitted 21 April, 2025; originally announced April 2025.

  2. arXiv:2503.09802  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Batch List-Decodable Linear Regression via Higher Moments

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Sihan Liu, Thanasis Pittas

    Abstract: We study the task of list-decodable linear regression using batches. A batch is called clean if it consists of i.i.d. samples from an unknown linear regression distribution. For a parameter $α\in (0, 1/2)$, an unknown $α$-fraction of the batches are clean and no assumptions are made on the remaining ones. The goal is to output a small list of vectors at least one of which is close to the true regr…

    Submitted 12 March, 2025; originally announced March 2025.

  3. arXiv:2502.14772  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    Efficient Multivariate Robust Mean Estimation Under Mean-Shift Contamination

    Authors: Ilias Diakonikolas, Giannis Iakovidis, Daniel M. Kane, Thanasis Pittas

    Abstract: We study the algorithmic problem of robust mean estimation of an identity covariance Gaussian in the presence of mean-shift contamination. In this contamination model, we are given a set of points in $\mathbb{R}^d$ generated i.i.d. via the following process. For a parameter $α<1/2$, the $i$-th sample $x_i$ is obtained as follows: with probability $1-α$, $x_i$ is drawn from $\mathcal{N}(μ, I)$, whe…

    Submitted 20 February, 2025; originally announced February 2025.

  4. arXiv:2502.09525  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Robust Learning of Multi-index Models via Iterative Subspace Approximation

    Authors: Ilias Diakonikolas, Giannis Iakovidis, Daniel M. Kane, Nikos Zarifis

    Abstract: We study the task of learning Multi-Index Models (MIMs) with label noise under the Gaussian distribution. A $K$-MIM is any function $f$ that only depends on a $K$-dimensional subspace. We focus on well-behaved MIMs with finite ranges that satisfy certain regularity properties. Our main contribution is a general robust learner that is qualitatively optimal in the Statistical Query (SQ) model. Our a…

    Submitted 14 April, 2025; v1 submitted 13 February, 2025; originally announced February 2025.

  5. arXiv:2502.08611  [pdf, other]

    cs.LG math.OC math.ST

    Robustly Learning Monotone Generalized Linear Models via Data Augmentation

    Authors: Nikos Zarifis, Puqian Wang, Ilias Diakonikolas, Jelena Diakonikolas

    Abstract: We study the task of learning Generalized Linear Models (GLMs) in the agnostic model under the Gaussian distribution. We give the first polynomial-time algorithm that achieves a constant-factor approximation for \textit{any} monotone Lipschitz activation. Prior constant-factor GLM learners succeed for a substantially smaller class of activations. Our work resolves a well-known open problem, by dev…

    Submitted 12 February, 2025; originally announced February 2025.

  6. arXiv:2501.09691  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    A Near-optimal Algorithm for Learning Margin Halfspaces with Massart Noise

    Authors: Ilias Diakonikolas, Nikos Zarifis

    Abstract: We study the problem of PAC learning $γ$-margin halfspaces in the presence of Massart noise. Without computational considerations, the sample complexity of this learning problem is known to be $\widetilde{Θ}(1/(γ^2 ε))$. Prior computationally efficient algorithms for the problem incur sample complexity $\tilde{O}(1/(γ^4 ε^3))$ and achieve 0-1 error of $η+ε$, where $η<1/2$ is the upper bound on the n…

    Submitted 16 January, 2025; originally announced January 2025.

  7. arXiv:2501.05425  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    Entangled Mean Estimation in High-Dimensions

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sihan Liu, Thanasis Pittas

    Abstract: We study the task of high-dimensional entangled mean estimation in the subset-of-signals model. Specifically, given $N$ independent random points $x_1,\ldots,x_N$ in $\mathbb{R}^D$ and a parameter $α\in (0, 1)$ such that each $x_i$ is drawn from a Gaussian with mean $μ$ and unknown covariance, and an unknown $α$-fraction of the points have identity-bounded covariances, the goal is to estimate the…

    Submitted 9 January, 2025; originally announced January 2025.

  8. arXiv:2411.15669  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    Implicit High-Order Moment Tensor Estimation and Learning Latent Variable Models

    Authors: Ilias Diakonikolas, Daniel M. Kane

    Abstract: We study the task of learning latent-variable models. A common algorithmic technique for this task is the method of moments. Unfortunately, moment-based approaches are hampered by the fact that the moment tensors of super-constant degree cannot even be written down in polynomial time. Motivated by such learning applications, we develop a general efficient algorithm for {\em implicit moment tensor…

    Submitted 12 April, 2025; v1 submitted 23 November, 2024; originally announced November 2024.

    Comments: Abstract shortened due to arXiv requirements

  9. arXiv:2411.06697  [pdf, ps, other]

    cs.LG cs.DS math.OC stat.ML

    Learning a Single Neuron Robustly to Distributional Shifts and Adversarial Label Noise

    Authors: Shuyao Li, Sushrut Karmalkar, Ilias Diakonikolas, Jelena Diakonikolas

    Abstract: We study the problem of learning a single neuron with respect to the $L_2^2$-loss in the presence of adversarial distribution shifts, where the labels can be arbitrary, and the goal is to find a ``best-fit'' function. More precisely, given training samples from a reference distribution $\mathcal{p}_0$, the goal is to approximate the vector $\mathbf{w}^*$ which minimizes the squared loss with respe…

    Submitted 10 November, 2024; originally announced November 2024.

  10. arXiv:2410.21194  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    SoS Certifiability of Subgaussian Distributions and its Algorithmic Applications

    Authors: Ilias Diakonikolas, Samuel B. Hopkins, Ankit Pensia, Stefan Tiegel

    Abstract: We prove that there is a universal constant $C>0$ so that for every $d \in \mathbb N$, every centered subgaussian distribution $\mathcal D$ on $\mathbb R^d$, and every even $p \in \mathbb N$, the $d$-variate polynomial $(Cp)^{p/2} \cdot \|v\|_{2}^p - \mathbb E_{X \sim \mathcal D} \langle v,X\rangle^p$ is a sum of square polynomials. This establishes that every subgaussian distribution is \emph{SoS…

    Submitted 28 October, 2024; originally announced October 2024.

  11. arXiv:2405.12958  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Online Learning of Halfspaces with Massart Noise

    Authors: Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis

    Abstract: We study the task of online learning in the presence of Massart noise. Instead of assuming that the online adversary chooses an arbitrary sequence of labels, we assume that the context $\mathbf{x}$ is selected adversarially but the label $y$ presented to the learner disagrees with the ground-truth label of $\mathbf{x}$ with unknown probability at most $η$. We study the fundamental class of $γ$-mar…

    Submitted 21 May, 2024; originally announced May 2024.

  12. arXiv:2403.10547  [pdf, ps, other]

    math.OC cs.AI cs.DS cs.LG

    Robust Second-Order Nonconvex Optimization and Its Application to Low Rank Matrix Sensing

    Authors: Shuyao Li, Yu Cheng, Ilias Diakonikolas, Jelena Diakonikolas, Rong Ge, Stephen J. Wright

    Abstract: Finding an approximate second-order stationary point (SOSP) is a well-studied and fundamental problem in stochastic nonconvex optimization with many applications in machine learning. However, this problem is poorly understood in the presence of outliers, limiting the use of existing nonconvex algorithms in adversarial settings. In this paper, we study the problem of finding SOSPs in the strong c…

    Submitted 11 March, 2024; originally announced March 2024.

  13. arXiv:2403.10416  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Robust Sparse Estimation for Gaussians with Optimal Error under Huber Contamination

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas

    Abstract: We study Gaussian sparse estimation tasks in Huber's contamination model with a focus on mean estimation, PCA, and linear regression. For each of these tasks, we give the first sample and computationally efficient robust estimators with optimal error guarantees, within constant factors. All prior efficient algorithms for these tasks incur quantitatively suboptimal error. Concretely, for Gaussian r…

    Submitted 15 March, 2024; originally announced March 2024.

  14. arXiv:2403.04744  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    SQ Lower Bounds for Non-Gaussian Component Analysis with Weaker Assumptions

    Authors: Ilias Diakonikolas, Daniel Kane, Lisheng Ren, Yuxin Sun

    Abstract: We study the complexity of Non-Gaussian Component Analysis (NGCA) in the Statistical Query (SQ) model. Prior work developed a general methodology to prove SQ lower bounds for this task that have been applicable to a wide range of contexts. In particular, it was known that for any univariate distribution $A$ satisfying certain conditions, distinguishing between a standard multivariate Gaussian and…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: Conference version published in NeurIPS 2023

  15. arXiv:2403.02300  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    Statistical Query Lower Bounds for Learning Truncated Gaussians

    Authors: Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas, Nikos Zarifis

    Abstract: We study the problem of estimating the mean of an identity covariance Gaussian in the truncated setting, in the regime when the truncation set comes from a low-complexity family $\mathcal{C}$ of sets. Specifically, for a fixed but unknown truncation set $S \subseteq \mathbb{R}^d$, we are given access to samples from the distribution $\mathcal{N}(\boldsymbol{ μ}, \mathbf{ I})$ truncated to the set…

    Submitted 4 March, 2024; originally announced March 2024.

  16. arXiv:2402.17756  [pdf, other]

    cs.LG cs.DS math.OC math.ST stat.ML

    Robustly Learning Single-Index Models via Alignment Sharpness

    Authors: Nikos Zarifis, Puqian Wang, Ilias Diakonikolas, Jelena Diakonikolas

    Abstract: We study the problem of learning Single-Index Models under the $L_2^2$ loss in the agnostic model. We give an efficient learning algorithm, achieving a constant factor approximation to the optimal loss, that succeeds under a range of distributions (including log-concave distributions) and a broad class of monotone and Lipschitz link functions. This is the first efficient constant factor approximat…

    Submitted 27 February, 2024; originally announced February 2024.

  17. arXiv:2312.16616  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Agnostically Learning Multi-index Models with Queries

    Authors: Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis

    Abstract: We study the power of query access for the task of agnostic learning under the Gaussian distribution. In the agnostic model, no assumptions are made on the labels and the goal is to compute a hypothesis that is competitive with the {\em best-fit} function in a known class, i.e., it achieves error $\mathrm{opt}+ε$, where $\mathrm{opt}$ is the error of the best function in the class. We focus on a g…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: Abstract shortened due to arXiv requirements

  18. arXiv:2312.11769  [pdf, other]

    cs.LG cs.DS cs.IT math.ST stat.ML

    Clustering Mixtures of Bounded Covariance Distributions Under Optimal Separation

    Authors: Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Thanasis Pittas

    Abstract: We study the clustering problem for mixtures of bounded covariance distributions, under a fine-grained separation assumption. Specifically, given samples from a $k$-component mixture distribution $D = \sum_{i =1}^k w_i P_i$, where each $w_i \ge α$ for some known parameter $α$, and each $P_i$ has unknown covariance $Σ_i \preceq σ^2_i \cdot I_d$ for some unknown $σ_i$, the goal is to cluster the sam…

    Submitted 18 December, 2023; originally announced December 2023.

  19. arXiv:2311.13154  [pdf, other]

    cs.DS cs.IT cs.LG math.ST stat.ML

    Testing Closeness of Multivariate Distributions via Ramsey Theory

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sihan Liu

    Abstract: We investigate the statistical task of closeness (or equivalence) testing for multidimensional distributions. Specifically, given sample access to two unknown distributions $\mathbf p, \mathbf q$ on $\mathbb R^d$, we want to distinguish between the case that $\mathbf p=\mathbf q$ versus $\|\mathbf p-\mathbf q\|_{A_k} > ε$, where $\|\mathbf p-\mathbf q\|_{A_k}$ denotes the generalized ${A}_k$ dista…

    Submitted 21 November, 2023; originally announced November 2023.

  20. arXiv:2310.15932  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Online Robust Mean Estimation

    Authors: Daniel M. Kane, Ilias Diakonikolas, Hanshen Xiao, Sihan Liu

    Abstract: We study the problem of high-dimensional robust mean estimation in an online setting. Specifically, we consider a scenario where $n$ sensors are measuring some common, ongoing phenomenon. At each time step $t=1,2,\ldots,T$, the $i^{th}$ sensor reports its readings $x^{(i)}_t$ for that time step. The algorithm must then commit to its estimate $μ_t$ for the true mean value of the process at time…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: To appear in SODA 2024

  21. arXiv:2310.11876  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    SQ Lower Bounds for Learning Mixtures of Linear Classifiers

    Authors: Ilias Diakonikolas, Daniel M. Kane, Yuxin Sun

    Abstract: We study the problem of learning mixtures of linear classifiers under Gaussian covariates. Given sample access to a mixture of $r$ distributions on $\mathbb{R}^n$ of the form $(\mathbf{x},y_{\ell})$, $\ell\in [r]$, where $\mathbf{x}\sim\mathcal{N}(0,\mathbf{I}_n)$ and $y_\ell=\mathrm{sign}(\langle\mathbf{v}_\ell,\mathbf{x}\rangle)$ for an unknown unit vector $\mathbf{v}_\ell$, the goal is to learn…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: To appear in NeurIPS 2023

  22. arXiv:2309.11657  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    Distribution-Independent Regression for Generalized Linear Models with Oblivious Corruptions

    Authors: Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos

    Abstract: We demonstrate the first algorithms for the problem of regression for generalized linear models (GLMs) in the presence of additive oblivious noise. We assume we have sample access to examples $(x, y)$ where $y$ is a noisy measurement of $g(w^* \cdot x)$. In particular, the noisy labels are of the form $y = g(w^* \cdot x) + ξ+ ε$, where $ξ$ is the oblivious noise drawn independently of $x$ \n…

    Submitted 27 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: Published in COLT 2023

  23. arXiv:2308.03142  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Self-Directed Linear Classification

    Authors: Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis

    Abstract: In online classification, a learner is presented with a sequence of examples and aims to predict their labels in an online fashion so as to minimize the total number of mistakes. In the self-directed variant, the learner knows in advance the pool of examples and can adaptively choose the order in which predictions are made. Here we study the power of choosing the prediction order and establish the…

    Submitted 6 August, 2023; originally announced August 2023.

  24. arXiv:2307.12840  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials

    Authors: Ilias Diakonikolas, Daniel M. Kane

    Abstract: We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss. Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/ε)^{O(k)}$, where $ε>0$ is the target accuracy. Prior work had given an algorithm for this problem with complexity…

    Submitted 25 July, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

  25. arXiv:2307.08438  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Near-Optimal Bounds for Learning Gaussian Halfspaces with Random Classification Noise

    Authors: Ilias Diakonikolas, Jelena Diakonikolas, Daniel M. Kane, Puqian Wang, Nikos Zarifis

    Abstract: We study the problem of learning general (i.e., not necessarily homogeneous) halfspaces with Random Classification Noise under the Gaussian distribution. We establish nearly-matching algorithmic and Statistical Query (SQ) lower bound results revealing a surprising information-computation gap for this basic problem. Specifically, the sample complexity of this learning problem is $\widetilde{Θ}(d/ε)$,…

    Submitted 13 July, 2023; originally announced July 2023.

  26. arXiv:2306.16352  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Information-Computation Tradeoffs for Learning Margin Halfspaces with Random Classification Noise

    Authors: Ilias Diakonikolas, Jelena Diakonikolas, Daniel M. Kane, Puqian Wang, Nikos Zarifis

    Abstract: We study the problem of PAC learning $γ$-margin halfspaces with Random Classification Noise. We establish an information-computation tradeoff suggesting an inherent gap between the sample complexity of the problem and the sample complexity of computationally efficient algorithms. Concretely, the sample complexity of the problem is $\widetilde{Θ}(1/(γ^2 ε))$. We start by giving a simple efficient alg…

    Submitted 28 June, 2023; originally announced June 2023.

  27. arXiv:2306.13057  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    SQ Lower Bounds for Learning Bounded Covariance GMMs

    Authors: Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas, Nikos Zarifis

    Abstract: We study the complexity of learning mixtures of separated Gaussians with common unknown bounded covariance matrix. Specifically, we focus on learning Gaussian mixture models (GMMs) on $\mathbb{R}^d$ of the form $P= \sum_{i=1}^k w_i \mathcal{N}(\boldsymbol μ_i,\mathbf Σ_i)$, where $\mathbf Σ_i = \mathbf Σ\preceq \mathbf I$ and $\min_{i \neq j} \| \boldsymbol μ_i - \boldsymbol μ_j\|_2 \geq k^ε$ for…

    Submitted 22 June, 2023; originally announced June 2023.

  28. arXiv:2306.07892  [pdf, other]

    cs.LG cs.DS math.OC math.ST stat.ML

    Robustly Learning a Single Neuron via Sharpness

    Authors: Puqian Wang, Nikos Zarifis, Ilias Diakonikolas, Jelena Diakonikolas

    Abstract: We study the problem of learning a single neuron with respect to the $L_2^2$-loss in the presence of adversarial label noise. We give an efficient algorithm that, for a broad family of activations including ReLUs, approximates the optimal $L_2^2$-error within a constant factor. Our algorithm applies under much milder distributional assumptions compared to prior work. The key ingredient enabling ou…

    Submitted 13 June, 2023; originally announced June 2023.

  29. arXiv:2305.02544  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Nearly-Linear Time and Streaming Algorithms for Outlier-Robust PCA

    Authors: Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas

    Abstract: We study principal component analysis (PCA), where given a dataset in $\mathbb{R}^d$ from a distribution, the task is to find a unit vector $v$ that approximately maximizes the variance of the distribution after being projected along $v$. Despite being a classical task, standard estimators fail drastically if the data contains even a small fraction of outliers, motivating the problem of robust PCA…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: To appear in ICML 2023

  30. arXiv:2305.00966  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    A Spectral Algorithm for List-Decodable Covariance Estimation in Relative Frobenius Norm

    Authors: Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Ankit Pensia, Thanasis Pittas

    Abstract: We study the problem of list-decodable Gaussian covariance estimation. Given a multiset $T$ of $n$ points in $\mathbb R^d$ such that an unknown $α<1/2$ fraction of points in $T$ are i.i.d. samples from an unknown Gaussian $\mathcal{N}(μ, Σ)$, the goal is to output a list of $O(1/α)$ hypotheses at least one of which is close to $Σ$ in relative Frobenius norm. Our main result is a…

    Submitted 1 May, 2023; originally announced May 2023.

  31. arXiv:2212.11221  [pdf, ps, other]

    math.PR cs.DS cs.LG math.ST stat.ML

    A Nearly Tight Bound for Fitting an Ellipsoid to Gaussian Random Points

    Authors: Daniel M. Kane, Ilias Diakonikolas

    Abstract: We prove that, for $c>0$ a sufficiently small universal constant, a random set of $c d^2/\log^4(d)$ independent Gaussian random points in $\mathbb{R}^d$ lies on a common ellipsoid with high probability. This nearly establishes a conjecture of \cite{SaundersonCPW12}, within logarithmic factors. The latter conjecture has attracted significant attention over the past decade, due to its connections…

    Submitted 21 December, 2022; originally announced December 2022.

  32. arXiv:2211.16333  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions

    Authors: Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Ankit Pensia

    Abstract: We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean $μ$ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates $μ$ with high probability. Prior work had obtained…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: To appear in NeurIPS 2022

  33. arXiv:2210.13706  [pdf, ps, other]

    math.ST cs.DS cs.LG stat.ML

    Gaussian Mean Testing Made Simple

    Authors: Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia

    Abstract: We study the following fundamental hypothesis testing problem, which we term Gaussian mean testing. Given i.i.d. samples from a distribution $p$ on $\mathbb{R}^d$, the task is to distinguish, with high probability, between the following cases: (i) $p$ is the standard Gaussian distribution, $\mathcal{N}(0,I_d)$, and (ii) $p$ is a Gaussian $\mathcal{N}(μ,Σ)$ for some unknown covariance $Σ$ and mean…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: To appear in SIAM Symposium on Simplicity in Algorithms (SOSA) 2023

  34. arXiv:2210.09949  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    SQ Lower Bounds for Learning Single Neurons with Massart Noise

    Authors: Ilias Diakonikolas, Daniel M. Kane, Lisheng Ren, Yuxin Sun

    Abstract: We study the problem of PAC learning a single neuron in the presence of Massart noise. Specifically, for a known activation function $f: \mathbb{R} \to \mathbb{R}$, the learner is given access to labeled examples $(\mathbf{x}, y) \in \mathbb{R}^d \times \mathbb{R}$, where the marginal distribution of $\mathbf{x}$ is arbitrary and the corresponding label $y$ is a Massart corruption of…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: To appear in NeurIPS 2022

  35. arXiv:2207.06596  [pdf, other]

    cs.DS cs.LG math.ST

    Near-Optimal Bounds for Testing Histogram Distributions

    Authors: Clément L. Canonne, Ilias Diakonikolas, Daniel M. Kane, Sihan Liu

    Abstract: We investigate the problem of testing whether a discrete probability distribution over an ordered domain is a histogram on a specified number of bins. One of the most common tools for the succinct approximation of data, $k$-histograms over $[n]$, are probability distributions that are piecewise constant over a set of $k$ intervals. The histogram testing problem is the following: Given samples from…

    Submitted 13 July, 2022; originally announced July 2022.

  36. arXiv:2206.08918  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Learning a Single Neuron with Adversarial Label Noise via Gradient Descent

    Authors: Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis

    Abstract: We study the fundamental problem of learning a single neuron, i.e., a function of the form $\mathbf{x}\mapsto σ(\mathbf{w}\cdot\mathbf{x})$ for monotone activations $σ:\mathbb{R}\mapsto\mathbb{R}$, with respect to the $L_2^2$-loss in the presence of adversarial label noise. Specifically, we are given labeled examples from a distribution $D$ on $(\mathbf{x}, y)\in\mathbb{R}^d \times \mathbb{R}$ such…

    Submitted 17 June, 2022; originally announced June 2022.

  37. arXiv:2206.05245  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    List-Decodable Sparse Mean Estimation via Difference-of-Pairs Filtering

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas

    Abstract: We study the problem of list-decodable sparse mean estimation. Specifically, for a parameter $α\in (0, 1/2)$, we are given $m$ points in $\mathbb{R}^n$, $\lfloor αm \rfloor$ of which are i.i.d. samples from a distribution $D$ with unknown $k$-sparse mean $μ$. No assumptions are made on the remaining points, which form the majority of the dataset. The goal is to return a small list of candidates co…

    Submitted 5 July, 2024; v1 submitted 10 June, 2022; originally announced June 2022.

    Comments: Added fact about taking roots in SoS proofs (Fact 2.9)

  38. arXiv:2206.04589  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Optimal SQ Lower Bounds for Robustly Learning Discrete Product Distributions and Ising Models

    Authors: Ilias Diakonikolas, Daniel M. Kane, Yuxin Sun

    Abstract: We establish optimal Statistical Query (SQ) lower bounds for robustly learning certain families of discrete high-dimensional distributions. In particular, we show that no efficient SQ algorithm with access to an $ε$-corrupted binary product distribution can learn its mean within $\ell_2$-error $o(ε\sqrt{\log(1/ε)})$. Similarly, we show that no efficient SQ algorithm with access to an $ε$-corrupted…

    Submitted 9 June, 2022; originally announced June 2022.

    Comments: To appear in COLT 2022

  39. arXiv:2206.03441  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    Robust Sparse Mean Estimation via Sum of Squares

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas

    Abstract: We study the problem of high-dimensional sparse mean estimation in the presence of an $ε$-fraction of adversarial outliers. Prior work obtained sample and computationally efficient algorithms for this task for identity-covariance subgaussian distributions. In this work, we develop the first efficient algorithms for robust sparse mean estimation without a priori knowledge of the covariance. For dis…

    Submitted 5 July, 2024; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: Fixed minor oversight in runtime calculation

  40. arXiv:2204.12399  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    Streaming Algorithms for High-Dimensional Robust Statistics

    Authors: Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas

    Abstract: We study high-dimensional robust statistics tasks in the streaming model. A recent line of work obtained computationally efficient algorithms for a range of high-dimensional robust estimation tasks. Unfortunately, all previous algorithms require storing the entire dataset, incurring memory at least quadratic in the dimension. In this work, we develop the first efficient streaming algorithms for hi…

    Submitted 3 May, 2023; v1 submitted 26 April, 2022; originally announced April 2022.

  41. arXiv:2112.09104  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Non-Gaussian Component Analysis via Lattice Basis Reduction

    Authors: Ilias Diakonikolas, Daniel M. Kane

    Abstract: Non-Gaussian Component Analysis (NGCA) is the following distribution learning problem: Given i.i.d. samples from a distribution on $\mathbb{R}^d$ that is non-Gaussian in a hidden direction $v$ and an independent standard Gaussian in the orthogonal directions, the goal is to approximate the hidden direction $v$. Prior work \cite{DKS17-sq} provided formal evidence for the existence of an information…

    Submitted 16 December, 2021; originally announced December 2021.

  42. arXiv:2109.11515  [pdf, other]

    cs.LG cs.DS math.OC math.ST stat.ML

    Outlier-Robust Sparse Estimation via Non-Convex Optimization

    Authors: Yu Cheng, Ilias Diakonikolas, Rong Ge, Shivam Gupta, Daniel M. Kane, Mahdi Soltanolkotabi

    Abstract: We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints, with a focus on the fundamental tasks of robust sparse mean estimation and robust sparse PCA. We develop novel and simple optimization formulations for these problems such that any approximate stationary point of the associated optimization problem yield…

    Submitted 13 November, 2022; v1 submitted 23 September, 2021; originally announced September 2021.

    Comments: Accepted to Conference on Neural Information Processing Systems (NeurIPS) 2022. (Updated to the NeurIPS'22 version in v2.)

  43. arXiv:2108.08767  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Learning General Halfspaces with General Massart Noise under the Gaussian Distribution

    Authors: Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis

    Abstract: We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under the Gaussian distribution. In the Massart model, an adversary is allowed to flip the label of each point $\mathbf{x}$ with unknown probability $η(\mathbf{x}) \leq η$, for some parameter $η\in [0,1/2]$. The goal is to find a hypothesis with misclassification error of $\mathrm{OPT} + ε$, where $\mathrm{OPT}$ i…

    Submitted 8 November, 2021; v1 submitted 19 August, 2021; originally announced August 2021.

    Comments: Revised presentation

  44. arXiv:2106.09689  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Statistical Query Lower Bounds for List-Decodable Linear Regression

    Authors: Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas, Alistair Stewart

    Abstract: We study the problem of list-decodable linear regression, where an adversary can corrupt a majority of the examples. Specifically, we are given a set $T$ of labeled examples $(x, y) \in \mathbb{R}^d \times \mathbb{R}$ and a parameter $0< α<1/2$ such that an $α$-fraction of the points in $T$ are i.i.d. samples from a linear regression model with Gaussian covariates, and the remaining $(1-α)$-fracti…

    Submitted 17 June, 2021; originally announced June 2021.

  45. arXiv:2102.05629  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Agnostic Proper Learning of Halfspaces under Gaussian Marginals

    Authors: Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis

    Abstract: We study the problem of agnostically learning halfspaces under the Gaussian distribution. Our main result is the {\em first proper} learning algorithm for this problem whose sample complexity and computational complexity qualitatively match those of the best known improper agnostic learner. Building on this result, we also obtain the first proper polynomial-time approximation scheme (PTAS) for agn…

    Submitted 10 February, 2021; originally announced February 2021.

  46. arXiv:2102.04401  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    The Optimality of Polynomial Regression for Agnostic Learning under Gaussian Marginals

    Authors: Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas, Nikos Zarifis

    Abstract: We study the problem of agnostic learning under the Gaussian distribution. We develop a method for finding hard families of examples for a wide class of problems by using LP duality. For Boolean-valued concept classes, we show that the $L^1$-regression algorithm is essentially best possible, and therefore that the computational difficulty of agnostically learning a concept class is closely related…

    Submitted 8 February, 2021; originally announced February 2021.

  47. arXiv:2102.02171  [pdf, ps, other]

    cs.LG cs.DS math.PR math.ST stat.ML

    Outlier-Robust Learning of Ising Models Under Dobrushin's Condition

    Authors: Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart, Yuxin Sun

    Abstract: We study the problem of learning Ising models satisfying Dobrushin's condition in the outlier-robust setting where a constant fraction of the samples are adversarially corrupted. Our main result is to provide the first computationally efficient robust learning algorithm for this problem with near-optimal error guarantees. Our algorithm can be seen as a special case of an algorithm for robustly lea…

    Submitted 3 February, 2021; originally announced February 2021.

  48. arXiv:2012.15802  [pdf, ps, other]

    cs.LG math.ST stat.ML

    The Sample Complexity of Robust Covariance Testing

    Authors: Ilias Diakonikolas, Daniel M. Kane

    Abstract: We study the problem of testing the covariance matrix of a high-dimensional Gaussian in a robust setting, where the input distribution has been corrupted in Huber's contamination model. Specifically, we are given i.i.d. samples from a distribution of the form $Z = (1-ε) X + εB$, where $X$ is a zero-mean and unknown covariance Gaussian $\mathcal{N}(0, Σ)$, $B$ is a fixed but unknown noise distribut…

    Submitted 31 December, 2020; originally announced December 2020.

  49. arXiv:2012.09720  [pdf, ps, other]

    cs.LG cs.CC math.ST stat.ML

    Near-Optimal Statistical Query Hardness of Learning Halfspaces with Massart Noise

    Authors: Ilias Diakonikolas, Daniel M. Kane

    Abstract: We study the problem of PAC learning halfspaces with Massart noise. Given labeled samples $(x, y)$ from a distribution $D$ on $\mathbb{R}^{d} \times \{ \pm 1\}$ such that the marginal $D_x$ on the examples is arbitrary and the label $y$ of example $x$ is generated from the target halfspace corrupted by a Massart adversary with flipping probability $η(x) \leq η\leq 1/2$, the goal is to compute a hy…

    Submitted 8 November, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

    Comments: This version improves on the previous version. It obtains a near-optimal hardness result essentially matching known algorithms.

  50. arXiv:2012.07774  [pdf, ps, other]

    cs.LG cs.CC cs.DS math.AG math.ST

    Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models

    Authors: Ilias Diakonikolas, Daniel M. Kane

    Abstract: Let $V$ be any vector space of multivariate degree-$d$ homogeneous polynomials with co-dimension at most $k$, and $S$ be the set of points where all polynomials in $V$ {\em nearly} vanish. We establish a qualitatively optimal upper bound on the size of $ε$-covers for $S$, in the $\ell_2$-norm. Roughly speaking, we show that there exists an $ε$-cover for $S$ of cardinality…

    Submitted 14 December, 2020; originally announced December 2020.

    Comments: Full version of FOCS'20 paper