
Showing 1–28 of 28 results for author: Neykov, M

  1. arXiv:2503.10794  [pdf, ps, other]

    math.ST

    Nonparametric Exponential Family Regression Under Star-Shaped Constraints

    Authors: Guanghong Yi, Matey Neykov

    Abstract: We study the minimax rate of estimation in nonparametric exponential family regression under star-shaped constraints. Specifically, the parameter space $K$ is a star-shaped set contained within a bounded box $[-M, M]^n$, where $M$ is a known positive constant. Moreover, we assume that the exponential family is nonsingular and that its cumulant function is twice continuously differentiable. Our mai…

    Submitted 13 March, 2025; originally announced March 2025.

    Comments: 24 pages, 0 figures

  2. arXiv:2501.10025  [pdf, ps, other]

    math.ST stat.ML

    Robust density estimation over star-shaped density classes

    Authors: Xiaolong Liu, Matey Neykov

    Abstract: We establish a novel criterion for comparing the performance of two densities, $g_1$ and $g_2$, within the context of corrupted data. Utilizing this criterion, we propose an algorithm to construct a density estimator within a star-shaped density class, $\mathcal{F}$, under conditions of data corruption. We proceed to derive the minimax upper and lower bounds for density estimation across this star…

    Submitted 17 January, 2025; originally announced January 2025.

  3. arXiv:2412.03832  [pdf, ps, other]

    math.ST

    Information theoretic limits of robust sub-Gaussian mean estimation under star-shaped constraints

    Authors: Akshay Prasadan, Matey Neykov

    Abstract: We obtain the minimax rate for a mean location model with a bounded star-shaped set $K \subseteq \mathbb{R}^n$ constraint on the mean, in an adversarially corrupted data setting with Gaussian noise. We assume an unknown fraction $ε\le 1/2-κ$ for some fixed $κ\in(0,1/2]$ of $N$ observations are arbitrarily corrupted. We obtain a minimax risk up to proportionality constants under the squared…

    Submitted 12 June, 2025; v1 submitted 4 December, 2024; originally announced December 2024.

  4. arXiv:2406.05911  [pdf, ps, other]

    math.ST

    Some facts about the optimality of the LSE in the Gaussian sequence model with convex constraint

    Authors: Akshay Prasadan, Matey Neykov

    Abstract: We consider a convex constrained Gaussian sequence model and characterize necessary and sufficient conditions for the least squares estimator (LSE) to be minimax optimal. For a closed convex set $K\subset \mathbb{R}^n$ we observe $Y=μ+ξ$ for $ξ\sim \mathcal{N}(0,σ^2\mathbb{I}_n)$ and $μ\in K$ and aim to estimate $μ$. We characterize the worst case risk of the LSE in multiple ways by analyzing the…

    Submitted 3 February, 2025; v1 submitted 9 June, 2024; originally announced June 2024.

  5. arXiv:2402.18921  [pdf, other]

    math.ST stat.ME stat.ML

    Semi-Supervised U-statistics

    Authors: Ilmun Kim, Larry Wasserman, Sivaraman Balakrishnan, Matey Neykov

    Abstract: Semi-supervised datasets are ubiquitous across diverse domains where obtaining fully labeled data is costly or time-consuming. The prevalence of such datasets has consistently driven the demand for new tools and methods that exploit the potential of unlabeled data. Responding to this demand, we introduce semi-supervised U-statistics enhanced by the abundance of unlabeled data, and investigate thei…

    Submitted 9 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  6. arXiv:2401.07968  [pdf, ps, other]

    math.ST

    Characterizing the minimax rate of nonparametric regression under bounded star-shaped constraints

    Authors: Akshay Prasadan, Matey Neykov

    Abstract: We quantify the minimax rate for a nonparametric regression model over a star-shaped function class $\mathcal{F}$ with bounded diameter. We obtain a minimax rate of ${\varepsilon^{\ast}}^2\wedge\mathrm{diam}(\mathcal{F})^2$ where \[\varepsilon^{\ast} =\sup\{\varepsilon>0:n\varepsilon^2 \le \log M_{\mathcal{F}}^{\operatorname{loc}}(\varepsilon,c)\},\] where…

    Submitted 9 January, 2025; v1 submitted 15 January, 2024; originally announced January 2024.

  7. arXiv:2308.13036  [pdf, other]

    math.ST

    Signal Detection with Quadratically Convex Orthosymmetric Constraints

    Authors: Matey Neykov

    Abstract: This paper is concerned with signal detection in Gaussian noise under quadratically convex orthosymmetric (QCO) constraints. Specifically the null hypothesis assumes no signal, whereas the alternative considers signal which is separated in Euclidean norm from zero, and belongs to the QCO constraint. Our main result establishes the minimax rate of the separation radius between the null and alternat…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: 2 figures

  8. arXiv:2308.08672  [pdf, other]

    math.ST

    Nearly Minimax Optimal Wasserstein Conditional Independence Testing

    Authors: Matey Neykov, Larry Wasserman, Ilmun Kim, Sivaraman Balakrishnan

    Abstract: This paper is concerned with minimax conditional independence testing. In contrast to some previous works on the topic, which use the total variation distance to separate the null from the alternative, here we use the Wasserstein distance. In addition, we impose Wasserstein smoothness conditions which on bounded domains are weaker than the corresponding total variation smoothness imposed, for inst…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: 24 pages, 1 figure, ordering of the last three authors is random

  9. arXiv:2308.05373  [pdf, other]

    math.ST stat.CO stat.ME

    Conditional Independence Testing for Discrete Distributions: Beyond $χ^2$- and $G$-tests

    Authors: Ilmun Kim, Matey Neykov, Sivaraman Balakrishnan, Larry Wasserman

    Abstract: This paper is concerned with the problem of conditional independence testing for discrete data. In recent years, researchers have shed new light on this fundamental problem, emphasizing finite-sample optimality. The non-asymptotic viewpoint adopted in these works has led to novel conditional independence tests that enjoy certain optimality under various regimes. Despite their attractive theoretica…

    Submitted 28 October, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

  10. arXiv:2210.11436  [pdf, other]

    math.ST stat.ML

    Revisiting Le Cam's Equation: Exact Minimax Rates over Convex Density Classes

    Authors: Shamindra Shrotriya, Matey Neykov

    Abstract: We study the classical problem of deriving minimax rates for density estimation over convex density classes. Building on the pioneering work of Le Cam (1973), Birge (1983, 1986), Wong and Shen (1995), Yang and Barron (1999), we determine the exact (up to constants) minimax rate over any convex density class. This work thus extends these known results by demonstrating that the local metric entropy…

    Submitted 23 October, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: Total paper (46 pages, 2 figures): Main paper (17 pages, 2 figures) + Appendix (29 pages). Updated to include proof of adaptivity of estimator

  11. arXiv:2207.07075  [pdf, other]

    math.ST stat.ML

    Adversarial Sign-Corrupted Isotonic Regression

    Authors: Shamindra Shrotriya, Matey Neykov

    Abstract: Classical univariate isotonic regression involves nonparametric estimation under a monotonicity constraint of the true signal. We consider a variation of this generating process, which we term adversarial sign-corrupted isotonic (\texttt{ASCI}) regression. Under this \texttt{ASCI} setting, the adversary has full access to the true isotonic responses, and is free to sign-corrupt them. Estimating th…

    Submitted 14 July, 2022; originally announced July 2022.

    Comments: Total paper (52 pages, 2 figures): Main paper (13 pages, 2 figures) + Appendix (39 pages)

  12. arXiv:2201.07329  [pdf, ps, other]

    math.ST

    On the minimax rate of the Gaussian sequence model under bounded convex constraints

    Authors: Matey Neykov

    Abstract: We determine the exact minimax rate of a Gaussian sequence model under bounded convex constraints, purely in terms of the local geometry of the given constraint set $K$. Our main result shows that the minimax risk (up to constant factors) under the squared $\ell_2$ loss is given by $ε^{*2} \wedge \operatorname{diam}(K)^2$ with \begin{align*} ε^* = \sup \bigg\{ε: \frac{ε^2}{σ^2} \leq \log M^{\ope…

    Submitted 7 November, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: 34 pages; 1 picture; v5: fixed an error in example section 3.3.2

  13. arXiv:2112.11666  [pdf, other]

    math.ST stat.ME

    Local permutation tests for conditional independence

    Authors: Ilmun Kim, Matey Neykov, Sivaraman Balakrishnan, Larry Wasserman

    Abstract: In this paper, we investigate local permutation tests for testing conditional independence between two random vectors $X$ and $Y$ given $Z$. The local permutation test determines the significance of a test statistic by locally shuffling samples which share similar values of the conditioning variables $Z$, and it forms a natural extension of the usual permutation approach for unconditional independ…

    Submitted 6 January, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

    Comments: A few important references (missed before) added

  14. arXiv:2108.07630  [pdf, other]

    math.ST

    Non-Asymptotic Bounds for the $\ell_{\infty}$ Estimator in Linear Regression with Uniform Noise

    Authors: Yufei Yi, Matey Neykov

    Abstract: The Chebyshev or $\ell_{\infty}$ estimator is an unconventional alternative to the ordinary least squares in solving linear regressions. It is defined as the minimizer of the $\ell_{\infty}$ objective function \begin{align*} \hat{\boldsymbolβ} := \arg\min_{\boldsymbolβ} \|\boldsymbol{Y} - \mathbf{X}\boldsymbolβ\|_{\infty}. \end{align*} The asymptotic distribution of the Chebyshev estimator under…

    Submitted 14 March, 2023; v1 submitted 17 August, 2021; originally announced August 2021.

    Comments: 41 pages, 1 figure, 1 table; v3 included an optimal estimator for Gaussian design + a lower bound for the Chebyshev estimator in the Gaussian design setting; to appear in Bernoulli

  15. arXiv:2104.03464  [pdf, ps, other]

    stat.ME math.ST

    A New Perspective on Debiasing Linear Regressions

    Authors: Yufei Yi, Matey Neykov

    Abstract: In this paper, we propose an abstract procedure for debiasing constrained or regularized potentially high-dimensional linear models. It is elementary to show that the proposed procedure can produce $\frac{1}{\sqrt{n}}$-confidence intervals for individual coordinates (or even bounded contrasts) in models with unknown covariance, provided that the covariance has bounded spectrum. While the proof of…

    Submitted 11 January, 2023; v1 submitted 7 April, 2021; originally announced April 2021.

    MSC Class: Primary 62F30; 62J05; 62J07

  16. arXiv:2103.07095  [pdf, ps, other]

    math.ST

    Minimax Optimal Conditional Density Estimation under Total Variation Smoothness

    Authors: Michael Li, Matey Neykov, Sivaraman Balakrishnan

    Abstract: This paper studies the minimax rate of nonparametric conditional density estimation under a weighted absolute value loss function in a multivariate setting. We first demonstrate that conditional density estimation is impossible if one only requires that $p_{X|Z}$ is smooth in $x$ for all values of $z$. This motivates us to consider a sub-class of absolutely continuous distributions, restricting th…

    Submitted 12 March, 2021; originally announced March 2021.

    Comments: 42 pages, 0 figures

  17. arXiv:2005.07587  [pdf, ps, other]

    math.ST stat.CO stat.ME stat.ML

    Non-Sparse PCA in High Dimensions via Cone Projected Power Iteration

    Authors: Yufei Yi, Matey Neykov

    Abstract: In this paper, we propose a cone projected power iteration algorithm to recover the first principal eigenvector from a noisy positive semidefinite matrix. When the true principal eigenvector is assumed to belong to a convex cone, the proposed algorithm is fast and has a tractable error. Specifically, the method achieves polynomial time complexity for certain convex cones equipped with fast project…

    Submitted 28 February, 2021; v1 submitted 15 May, 2020; originally announced May 2020.

  18. arXiv:2003.11744  [pdf, other]

    stat.ME

    Prior Adaptive Semi-supervised Learning with Application to EHR Phenotyping

    Authors: Yichi Zhang, Molei Liu, Matey Neykov, Tianxi Cai

    Abstract: Electronic Health Records (EHR) data, a rich source for biomedical research, have been successfully used to gain novel insight into a wide range of diseases. Despite its potential, EHR is currently underutilized for discovery research due to its major limitation in the lack of precise phenotype information. To overcome such difficulties, recent efforts have been devoted to developing supervised a…

    Submitted 12 September, 2021; v1 submitted 26 March, 2020; originally announced March 2020.

  19. arXiv:2001.03039  [pdf, other]

    math.ST

    Minimax Optimal Conditional Independence Testing

    Authors: Matey Neykov, Sivaraman Balakrishnan, Larry Wasserman

    Abstract: We consider the problem of conditional independence testing of $X$ and $Y$ given $Z$ where $X,Y$ and $Z$ are three real random variables and $Z$ is continuous. We focus on two main cases - when $X$ and $Y$ are both discrete, and when $X$ and $Y$ are both continuous. In view of recent results on conditional independence testing (Shah and Peters, 2018), one cannot hope to design non-trivial tests, w…

    Submitted 1 July, 2021; v1 submitted 9 January, 2020; originally announced January 2020.

    Comments: 92 pages, 1 table, 6 figures. v4 major updates: fixed an error in appendix G -- multivariate Z case

  20. High-Temperature Structure Detection in Ferromagnets

    Authors: Yuan Cao, Matey Neykov, Han Liu

    Abstract: This paper studies structure detection problems in high temperature ferromagnetic (positive interaction only) Ising models. The goal is to distinguish whether the underlying graph is empty, i.e., the model consists of independent Rademacher variables, versus the alternative that the underlying graph contains a subgraph of a certain structure. We give matching upper and lower minimax bounds under w…

    Submitted 12 January, 2021; v1 submitted 21 September, 2018; originally announced September 2018.

    Comments: 51 pages, 4 figures. version 2: a new computational lower bound result is added. version 3: citations are updated

    Journal ref: Information and Inference: A Journal of the IMA (2020)

  21. arXiv:1712.06245  [pdf, other]

    stat.ML cs.LG math.OC

    Misspecified Nonconvex Statistical Optimization for Phase Retrieval

    Authors: Zhuoran Yang, Lin F. Yang, Ethan X. Fang, Tuo Zhao, Zhaoran Wang, Matey Neykov

    Abstract: Existing nonconvex statistical optimization theory and methods crucially rely on the correct specification of the underlying "true" statistical models. To address this issue, we take a first step towards taming model misspecification by studying the high-dimensional sparse phase retrieval problem with misspecified link functions. In particular, we propose a simple variant of the thresholded Wirtin…

    Submitted 17 December, 2017; originally announced December 2017.

    Comments: 56 pages

  22. arXiv:1709.06688  [pdf, other]

    math.ST stat.ML

    Property Testing in High Dimensional Ising models

    Authors: Matey Neykov, Han Liu

    Abstract: This paper explores the information-theoretic limitations of graph property testing in zero-field Ising models. Instead of learning the entire graph structure, sometimes testing a basic graph property such as connectivity, cycle presence or maximum clique size is a more relevant and attainable objective. Since property testing is more fundamental than graph recovery, any necessary conditions for p…

    Submitted 30 July, 2018; v1 submitted 19 September, 2017; originally announced September 2017.

    Comments: 72 pages, 10 figures; revised version

  23. arXiv:1707.09114  [pdf, other]

    math.ST stat.ML

    Adaptive Inferential Method for Monotone Graph Invariants

    Authors: Junwei Lu, Matey Neykov, Han Liu

    Abstract: We consider the problem of undirected graphical model inference. In many applications, instead of perfectly recovering the unknown graph structure, a more realistic goal is to infer some graph invariants (e.g., the maximum degree, the number of connected subgraphs, the number of isolated nodes). In this paper, we propose a new inferential framework for testing nested multiple hypotheses and constr…

    Submitted 28 July, 2017; originally announced July 2017.

  24. arXiv:1701.05230  [pdf, other]

    stat.ME math.ST stat.ML

    Surrogate Aided Unsupervised Recovery of Sparse Signals in Single Index Models for Binary Outcomes

    Authors: Abhishek Chakrabortty, Matey Neykov, Raymond Carroll, Tianxi Cai

    Abstract: We consider the recovery of regression coefficients, denoted by $\boldsymbolβ_0$, for a single index model (SIM) relating a binary outcome $Y$ to a set of possibly high dimensional covariates $\boldsymbol{X}$, based on a large but 'unlabeled' dataset $\mathcal{U}$, with $Y$ never observed. On $\mathcal{U}$, we fully observe $\boldsymbol{X}$ and additionally, a surrogate $S$ which, while not being…

    Submitted 30 June, 2018; v1 submitted 18 January, 2017; originally announced January 2017.

    Comments: 50 pages, 3 tables, 1 figure

    MSC Class: 62J12; 62J07; 62H30; 62G32; 62F10; 62F30

  25. arXiv:1608.03045  [pdf, other]

    math.ST stat.ML

    Combinatorial Inference for Graphical Models

    Authors: Matey Neykov, Junwei Lu, Han Liu

    Abstract: We propose a new family of combinatorial inference problems for graphical models. Unlike classical statistical inference where the main interest is point estimation or parameter testing, combinatorial inference aims at testing the global structure of the underlying graph. Examples include testing the graph connectivity, the presence of a cycle of certain size, or the maximum degree of the graph. T…

    Submitted 12 February, 2018; v1 submitted 10 August, 2016; originally announced August 2016.

    Comments: 78 pages, 18 figures, 2 tables; to appear in the Annals of Statistics

  26. arXiv:1511.08102  [pdf, other]

    math.ST stat.ML

    L1-Regularized Least Squares for Support Recovery of High Dimensional Single Index Models with Gaussian Designs

    Authors: Matey Neykov, Jun S. Liu, Tianxi Cai

    Abstract: It is known that for a certain class of single index models (SIMs) $Y = f(\boldsymbol{X}_{p \times 1}^\intercal\boldsymbolβ_0, \varepsilon)$, support recovery is impossible when $\boldsymbol{X} \sim \mathcal{N}(0, \mathbb{I}_{p \times p})$ and a model complexity adjusted sample size is below a critical threshold. Recently, optimal algorithms based on Sliced Inverse Regression (SIR) were suggested.…

    Submitted 22 June, 2016; v1 submitted 25 November, 2015; originally announced November 2015.

    Comments: 36 pages; 6 figures; typos corrected; clearer notation introduced

  27. arXiv:1511.02270  [pdf, other]

    math.ST stat.ML

    Signed Support Recovery for Single Index Models in High-Dimensions

    Authors: Matey Neykov, Qian Lin, Jun S. Liu

    Abstract: In this paper we study the support recovery problem for single index models $Y=f(\boldsymbol{X}^{\intercal} \boldsymbolβ,\varepsilon)$, where $f$ is an unknown link function, $\boldsymbol{X}\sim N_p(0,\mathbb{I}_{p})$ and $\boldsymbolβ$ is an $s$-sparse unit vector such that $\boldsymbolβ_{i}\in \{\pm\frac{1}{\sqrt{s}},0\}$. In particular, we look into the performance of two computationally inexpe…

    Submitted 22 June, 2016; v1 submitted 6 November, 2015; originally announced November 2015.

    Comments: 38 pages, 7 figures; 1 table; data set analysis added; typos corrected

  28. arXiv:1510.08986  [pdf, other]

    math.ST stat.ME stat.ML

    A Unified Theory of Confidence Regions and Testing for High Dimensional Estimating Equations

    Authors: Matey Neykov, Yang Ning, Jun S. Liu, Han Liu

    Abstract: We propose a new inferential framework for constructing confidence regions and testing hypotheses in statistical models specified by a system of high dimensional estimating equations. We construct an influence function by projecting the fitted estimating equations to a sparse direction obtained by solving a large-scale linear program. Our main theoretical contribution is to establish a unified Z-e…

    Submitted 22 June, 2016; v1 submitted 30 October, 2015; originally announced October 2015.

    Comments: 67 pages, 2 tables, 1 figure