Showing 1–31 of 31 results for author: Kontorovich, A

  1. arXiv:2410.02835

    math.ST cs.LG stat.ME

    The Empirical Mean is Minimax Optimal for Local Glivenko-Cantelli

    Authors: Doron Cohen, Aryeh Kontorovich, Roi Weiss

    Abstract: We revisit the recently introduced Local Glivenko-Cantelli setting, which studies distribution-dependent uniform convergence rates of the Empirical Mean Estimator (EME). In this work, we investigate generalizations of this setting where arbitrary estimators are allowed rather than just the EME. Can a strictly larger class of measures be learned? Can better risk decay rates be obtained? We provide…

    Submitted 28 May, 2025; v1 submitted 2 October, 2024; originally announced October 2024.

  2. arXiv:2407.16642

    math.PR cs.LG math.ST stat.ML

    Sharp bounds on aggregate expert error

    Authors: Aryeh Kontorovich, Ariel Avital

    Abstract: We revisit the classic problem of aggregating binary advice from conditionally independent experts, also known as the Naive Bayes setting. Our quantity of interest is the error probability of the optimal decision rule. In the case of symmetric errors (sensitivity = specificity), reasonably tight bounds on the optimal error probability are known. In the general asymmetric case, we are not aware of…

    Submitted 23 December, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

  3. arXiv:2309.17016

    cs.LG math.ST stat.ML

    Efficient Agnostic Learning with Average Smoothness

    Authors: Steve Hanneke, Aryeh Kontorovich, Guy Kornowski

    Abstract: We study distribution-free nonparametric regression following a notion of average smoothness initiated by Ashlagi et al. (2021), which measures the "effective" smoothness of a function with respect to an arbitrary unknown underlying distribution. While the recent work of Hanneke et al. (2023) established tight uniform convergence bounds for average-smooth functions in the realizable case and provi…

    Submitted 13 February, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: ALT 2024 camera ready version. arXiv admin note: text overlap with arXiv:2302.06005

  4. arXiv:2302.06005

    cs.LG math.ST stat.ML

    Near-optimal learning with average Hölder smoothness

    Authors: Steve Hanneke, Aryeh Kontorovich, Guy Kornowski

    Abstract: We generalize the notion of average Lipschitz smoothness proposed by Ashlagi et al. (COLT 2021) by extending it to Hölder smoothness. This measure of the "effective smoothness" of a function is sensitive to the underlying distribution and can be dramatically smaller than its classic "worst-case" Hölder constant. We consider both the realizable and the agnostic (noisy) regression settings, proving…

    Submitted 30 October, 2023; v1 submitted 12 February, 2023; originally announced February 2023.

    Comments: NeurIPS 2023 camera ready version

  5. arXiv:2212.04216

    cs.LG cs.CR stat.ML

    Differentially-Private Bayes Consistency

    Authors: Olivier Bousquet, Haim Kaplan, Aryeh Kontorovich, Yishay Mansour, Shay Moran, Menachem Sadigurschi, Uri Stemmer

    Abstract: We construct a universally Bayes consistent learning rule that satisfies differential privacy (DP). We first handle the setting of binary classification and then extend our rule to the more general setting of density estimation (with respect to the total variation metric). The existence of a universally consistent DP learner reveals a stark difference with the distribution-free PAC model. Indeed,…

    Submitted 8 December, 2022; originally announced December 2022.

  6. arXiv:2202.03045

    cs.LG stat.ML

    Metric-valued regression

    Authors: Dan Tsir Cohen, Aryeh Kontorovich

    Abstract: We propose an efficient algorithm for learning mappings between two metric spaces, $\mathcal{X}$ and $\mathcal{Y}$. Our procedure is strongly Bayes-consistent whenever $\mathcal{X}$ and $\mathcal{Y}$ are topologically separable and $\mathcal{Y}$ is "bounded in expectation" (our term; the separability assumption can be somewhat weakened). At this level of generality, ours is the first such learnability result for unbounded loss in the agnosti…

    Submitted 7 February, 2022; originally announced February 2022.

  7. arXiv:2111.11971

    math.ST cs.LG stat.ML

    Tree density estimation

    Authors: László Györfi, Aryeh Kontorovich, Roi Weiss

    Abstract: We study the problem of estimating the density $f(\boldsymbol x)$ of a random vector ${\boldsymbol X}$ in $\mathbb R^d$. For a spanning tree $T$ defined on the vertex set $\{1,\dots ,d\}$, the tree density $f_{T}$ is a product of bivariate conditional densities. An optimal spanning tree minimizes the Kullback-Leibler divergence between $f$ and $f_{T}$. From i.i.d. data we identify an optimal tree…

    Submitted 21 September, 2022; v1 submitted 23 November, 2021; originally announced November 2021.

  8. arXiv:2110.04763

    math.FA cs.LG stat.ML

    Fat-Shattering Dimension of $k$-fold Aggregations

    Authors: Idan Attias, Aryeh Kontorovich

    Abstract: We provide estimates on the fat-shattering dimension of aggregation rules of real-valued function classes. The latter consists of all ways of choosing $k$ functions, one from each of the $k$ classes, and computing a pointwise function of them, such as the median, mean, and maximum. The bound is stated in terms of the fat-shattering dimensions of the component classes. For linear and affine functio…

    Submitted 9 September, 2023; v1 submitted 10 October, 2021; originally announced October 2021.

  9. arXiv:2011.04586

    cs.LG stat.ML

    Stable Sample Compression Schemes: New Applications and an Optimal SVM Margin Bound

    Authors: Steve Hanneke, Aryeh Kontorovich

    Abstract: We analyze a family of supervised learning algorithms based on sample compression schemes that are stable, in the sense that removing points from the training set which were not selected for the compression set does not alter the resulting classifier. We use this technique to derive a variety of novel or improved data-dependent generalization bounds for several learning algorithms. In particular,…

    Submitted 9 November, 2020; originally announced November 2020.

  10. arXiv:2003.13561

    cs.LG math.PR stat.ML

    On Biased Random Walks, Corrupted Intervals, and Learning Under Adversarial Design

    Authors: Daniel Berend, Aryeh Kontorovich, Lev Reyzin, Thomas Robinson

    Abstract: We tackle some fundamental problems in probability theory on corrupted random processes on the integer line. We analyze when a biased random walk is expected to reach its bottommost point and when intervals of integer points can be detected under a natural model of noise. We apply these results to problems in learning thresholds and intervals under a new model for learning under adversarial design…

    Submitted 30 March, 2020; originally announced March 2020.

    Comments: 18 pages

  11. arXiv:2002.01999

    cs.LG stat.ML

    Nested Barycentric Coordinate System as an Explicit Feature Map

    Authors: Lee-Ad Gottlieb, Eran Kaufman, Aryeh Kontorovich, Gabriel Nivasch, Ofir Pele

    Abstract: We propose a new embedding method which is particularly well-suited for settings where the sample size greatly exceeds the ambient dimension. Our technique consists of partitioning the space into simplices and then embedding the data points into features corresponding to the simplices' barycentric coordinates. We then train a linear classifier in the rich feature space obtained from the simplices.…

    Submitted 5 February, 2020; originally announced February 2020.

  12. arXiv:2002.01408

    cs.LG stat.ML

    Apportioned Margin Approach for Cost Sensitive Large Margin Classifiers

    Authors: Lee-Ad Gottlieb, Eran Kaufman, Aryeh Kontorovich

    Abstract: We consider the problem of cost sensitive multiclass classification, where we would like to increase the sensitivity of an important class at the expense of a less important one. We adopt an apportioned margin framework to address this problem, which enables an efficient margin shift between classes that share the same boundary. The decision boundary between all pairs of classes divides the…

    Submitted 4 February, 2020; originally announced February 2020.

  13. arXiv:1910.05270

    math.ST cs.LG stat.ML

    Fast and Bayes-consistent nearest neighbors

    Authors: Klim Efremenko, Aryeh Kontorovich, Moshe Noivirt

    Abstract: Research on nearest-neighbor methods tends to focus somewhat dichotomously either on the statistical or the computational aspects -- either on, say, Bayes consistency and rates of convergence or on techniques for speeding up the proximity search. This paper aims at bridging these realms: to reap the advantages of fast evaluation time while maintaining Bayes consistency, and further without sacrifi…

    Submitted 15 April, 2020; v1 submitted 7 October, 2019; originally announced October 2019.

  14. arXiv:1906.09855

    cs.LG math.ST stat.ML

    Universal Bayes consistency in metric spaces

    Authors: Steve Hanneke, Aryeh Kontorovich, Sivan Sabato, Roi Weiss

    Abstract: We extend a recently proposed 1-nearest-neighbor based multiclass learning algorithm and prove that our modification is universally strongly Bayes-consistent in all metric spaces admitting any such learner, making it an "optimistically universal" Bayes-consistent learner. This is the first learning algorithm known to enjoy this property; by comparison, the $k$-NN classifier and its variants are no…

    Submitted 6 January, 2021; v1 submitted 24 June, 2019; originally announced June 2019.

    Comments: To appear in Annals of Statistics

    Journal ref: Annals of Statistics 2021, Vol. 49, No. 4, 2129-2150, August 2021

  15. arXiv:1905.11930

    cs.LG stat.ML

    Efficient Kirszbraun Extension with Applications to Regression

    Authors: Hanan Zaichyk, Armin Biess, Aryeh Kontorovich, Yury Makarychev

    Abstract: We introduce a framework for performing regression between two Hilbert spaces. This is done based on Kirszbraun's extension theorem, to the best of our knowledge, the first application of this technique to supervised learning. We analyze the statistical and computational aspects of this method. We decompose this task into two stages: training (which corresponds operationally to smoothing/regulariz…

    Submitted 8 March, 2022; v1 submitted 28 May, 2019; originally announced May 2019.

  16. arXiv:1902.01224

    math.ST cs.LG stat.ML

    Estimating the Mixing Time of Ergodic Markov Chains

    Authors: Geoffrey Wolfer, Aryeh Kontorovich

    Abstract: We address the problem of estimating the mixing time $t_{\mathsf{mix}}$ of an arbitrary ergodic finite-state Markov chain from a single trajectory of length $m$. The reversible case was addressed by Hsu et al. [2019], who left the general case as an open problem. In the reversible case, the analysis is greatly facilitated by the fact that the Markov operator is self-adjoint, and Weyl's inequality…

    Submitted 16 August, 2022; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: COLT'19 conference manuscript, with minor fixes

  17. arXiv:1902.00080

    stat.ML cs.LG math.ST

    Minimax Testing of Identity to a Reference Ergodic Markov Chain

    Authors: Geoffrey Wolfer, Aryeh Kontorovich

    Abstract: We exhibit an efficient procedure for testing, based on a single long state sequence, whether an unknown Markov chain is identical to or $\varepsilon$-far from a given reference chain. We obtain nearly matching (up to logarithmic factors) upper and lower sample complexity bounds for our notion of distance, which is based on total variation. Perhaps surprisingly, we discover that the sample complex…

    Submitted 24 September, 2019; v1 submitted 31 January, 2019; originally announced February 2019.

    Comments: A previous version of this print contained a mistake in a proof. We have now fixed it

  18. arXiv:1810.02180

    cs.LG stat.ML

    Improved Generalization Bounds for Adversarially Robust Learning

    Authors: Idan Attias, Aryeh Kontorovich, Yishay Mansour

    Abstract: We consider a model of robust learning in an adversarial environment. The learner gets uncorrupted training data with access to possible corruptions that may be affected by the adversary during testing. The learner's goal is to build a robust classifier, which will be tested on future adversarial examples. The adversary is limited to $k$ possible corruptions for each input. We model the learner-ad…

    Submitted 1 July, 2022; v1 submitted 4 October, 2018; originally announced October 2018.

    Comments: JMLR camera ready

  19. arXiv:1810.01864

    cs.LG cs.IT math.ST stat.ML

    Agnostic Sample Compression Schemes for Regression

    Authors: Idan Attias, Steve Hanneke, Aryeh Kontorovich, Menachem Sadigurschi

    Abstract: We obtain the first positive results for bounded sample compression in the agnostic regression setting with the $\ell_p$ loss, where $p\in [1,\infty]$. We construct a generic approximate sample compression scheme for real-valued function classes exhibiting exponential size in the fat-shattering dimension but independent of the sample size. Notably, for linear regression, an approximate compression…

    Submitted 3 February, 2024; v1 submitted 3 October, 2018; originally announced October 2018.

    Comments: New results in this version: (1) Approximate agnostic sample compression scheme for function classes with finite fat-shattering dimension and the $\ell_p$ loss (section 3), (2) Near-optimal approximate compression for linear functions and the $\ell_p$ loss (section 4.1). The results in sections 4.2 and 4.3 appear in the previous version.

  20. arXiv:1809.05014

    stat.ML cs.LG math.ST

    Statistical Estimation of Ergodic Markov Chain Kernel over Discrete State Space

    Authors: Geoffrey Wolfer, Aryeh Kontorovich

    Abstract: We investigate the statistical complexity of estimating the parameters of a discrete-state Markov chain kernel from a single long sequence of state observations. In the finite case, we characterize (modulo logarithmic factors) the minimax sample complexity of estimation with respect to the operator infinity norm, while in the countably infinite case, we analyze the problem with respect to a natura…

    Submitted 13 August, 2020; v1 submitted 13 September, 2018; originally announced September 2018.

    Comments: Journal version of the extended abstract (ALT'19), to appear in Bernoulli 2020+

  21. arXiv:1805.09719

    cs.LG cs.CC cs.CG stat.ML

    Learning convex polyhedra with margin

    Authors: Lee-Ad Gottlieb, Eran Kaufman, Aryeh Kontorovich, Gabriel Nivasch

    Abstract: We present an improved algorithm for quasi-properly learning convex polyhedra in the realizable PAC setting from data with a margin. Our learning algorithm constructs a consistent polyhedron as an intersection of about $t \log t$ halfspaces with constant-size margins in time polynomial in $t$ (where $t$ is the number of halfspaces forming an optimal polyhedron). We also identify distinct gen…

    Submitted 2 November, 2021; v1 submitted 24 May, 2018; originally announced May 2018.

  22. arXiv:1805.08254

    cs.LG stat.ML

    Sample Compression for Real-Valued Learners

    Authors: Steve Hanneke, Aryeh Kontorovich, Menachem Sadigurschi

    Abstract: We give an algorithmically efficient version of the learner-to-compression scheme conversion in Moran and Yehudayoff (2016). In extending this technique to real-valued hypotheses, we also obtain an efficient regression-to-bounded sample compression converter. To our knowledge, this is the first general compressed regression result (regardless of efficiency or boundedness) guaranteeing uniform appr…

    Submitted 21 May, 2018; originally announced May 2018.

  23. arXiv:1805.08140

    cs.LG math.ST stat.ML

    A New Lower Bound for Agnostic Learning with Sample Compression Schemes

    Authors: Steve Hanneke, Aryeh Kontorovich

    Abstract: We establish a tight characterization of the worst-case rates for the excess risk of agnostic learning with sample compression schemes and for uniform convergence for agnostic sample compression schemes. In particular, we find that the optimal rates of convergence for size-$k$ agnostic sample compression schemes are of the form $\sqrt{\frac{k \log(n/k)}{n}}$, which contrasts with agnostic learning…

    Submitted 21 May, 2018; originally announced May 2018.

  24. arXiv:1708.07367

    math.ST cs.LG math.PR stat.ML

    Mixing time estimation in reversible Markov chains from a single sample path

    Authors: Daniel Hsu, Aryeh Kontorovich, David A. Levin, Yuval Peres, Csaba Szepesvári

    Abstract: The spectral gap $γ$ of a finite, ergodic, and reversible Markov chain is an important parameter measuring the asymptotic rate of convergence. In applications, the transition matrix $P$ may be unknown, yet one sample of the chain up to a fixed time $n$ may be observed. We consider here the problem of estimating $γ$ from this data. Let $π$ be the stationary distribution of $P$, and…

    Submitted 24 August, 2017; originally announced August 2017.

    Comments: 34 pages, merges results of arXiv:1506.02903 and arXiv:1612.05330

  25. arXiv:1506.02903

    cs.LG stat.ML

    Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path

    Authors: Daniel Hsu, Aryeh Kontorovich, Csaba Szepesvári

    Abstract: This article provides the first procedure for computing a fully data-dependent interval that traps the mixing time $t_{\text{mix}}$ of a finite reversible ergodic Markov chain at a prescribed confidence level. The interval is computed from a single finite-length sample path from the Markov chain, and does not require the knowledge of any parameters of the chain. This stands in contrast to previous…

    Submitted 2 November, 2015; v1 submitted 9 June, 2015; originally announced June 2015.

    Comments: 28 pages; minor clarification in Appendix A concerning lower bounds

  26. arXiv:1407.0208

    cs.LG stat.ML

    A Bayes consistent 1-NN classifier

    Authors: Aryeh Kontorovich, Roi Weiss

    Abstract: We show that a simple modification of the 1-nearest neighbor classifier yields a strongly Bayes consistent learner. Prior to this work, the only strongly Bayes consistent proximity-based method was the k-nearest neighbor classifier, for k growing appropriately with sample size. We will argue that a margin-regularized 1-NN enjoys considerable statistical and algorithmic advantages over the k-NN cla…

    Submitted 17 August, 2018; v1 submitted 1 July, 2014; originally announced July 2014.

  27. arXiv:1312.0451

    math.PR cs.LG stat.ML

    Consistency of weighted majority votes

    Authors: Daniel Berend, Aryeh Kontorovich

    Abstract: We revisit the classical decision-theoretic problem of weighted expert voting from a statistical learning perspective. In particular, we examine the consistency (both asymptotic and finitary) of the optimal Nitzan-Paroush weighted majority and related rules. In the case of known expert competence levels, we give sharp error estimates for the optimal rule. When the competence levels are unknown, th…

    Submitted 21 January, 2014; v1 submitted 2 December, 2013; originally announced December 2013.

    MSC Class: 60C05; 60F15

  28. arXiv:1309.4859

    stat.ML

    Predictive PAC Learning and Process Decompositions

    Authors: Cosma Rohilla Shalizi, Aryeh Kontorovich

    Abstract: We informally call a stochastic process learnable if it admits a generalization error approaching zero in probability for any concept class with finite VC-dimension (IID processes are the simplest example). A mixture of learnable processes need not be learnable itself, and certainly its generalization error need not decay at the same rate. In this paper, we argue that it is natural in predictive P…

    Submitted 19 September, 2013; originally announced September 2013.

    Comments: 9 pages, accepted in NIPS 2013

    Journal ref: Advances in Neural Information Processing Systems 26 [NIPS 2013], pp.1619--1627

  29. arXiv:1306.2547

    cs.LG cs.DS stat.ML

    Efficient Classification for Metric Data

    Authors: Lee-Ad Gottlieb, Aryeh Kontorovich, Robert Krauthgamer

    Abstract: Recent advances in large-margin classification of data residing in general metric spaces (rather than Hilbert spaces) enable classification under various natural metrics, such as string edit and earthmover distance. A general framework developed for this purpose by von Luxburg and Bousquet [JMLR, 2004] left open the questions of computational efficiency and of providing direct bounds on generaliza…

    Submitted 10 July, 2014; v1 submitted 11 June, 2013; originally announced June 2013.

    Comments: This is the full version of an extended abstract that appeared in Proceedings of the 23rd COLT, 2010

  30. arXiv:1302.6009

    cs.LG math.ST stat.ML

    On learning parametric-output HMMs

    Authors: Aryeh Kontorovich, Boaz Nadler, Roi Weiss

    Abstract: We present a novel approach for learning an HMM whose outputs are distributed according to a parametric family. This is done by decoupling the learning task into two steps: first estimating the output parameters, and then estimating the hidden states transition probabilities. The first step is accomplished by fitting a mixture model to the output stationary distribution. Given the parameters…

    Submitted 25 February, 2013; originally announced February 2013.

  31. arXiv:1302.2752

    cs.LG cs.DS stat.ML

    Adaptive Metric Dimensionality Reduction

    Authors: Lee-Ad Gottlieb, Aryeh Kontorovich, Robert Krauthgamer

    Abstract: We study adaptive data-dependent dimensionality reduction in the context of supervised learning in general metric spaces. Our main statistical contribution is a generalization bound for Lipschitz functions in metric spaces that are doubling, or nearly doubling. On the algorithmic front, we describe an analogue of PCA for metric spaces: namely an efficient procedure that approximates the data's int…

    Submitted 25 March, 2015; v1 submitted 12 February, 2013; originally announced February 2013.