Showing 1–18 of 18 results for author: Tsybakov, A

  1. arXiv:2503.02131

    stat.ML cs.LG

    Gradient-free stochastic optimization for additive models

    Authors: Arya Akhavan, Alexandre B. Tsybakov

    Abstract: We address the problem of zero-order optimization from noisy observations for an objective function satisfying the Polyak-Łojasiewicz or the strong convexity condition. Additionally, we assume that the objective function has an additive structure and satisfies a higher-order smoothness property, characterized by the Hölder family of functions. The additive model for Hölder classes of functions is…

    Submitted 5 April, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

  2. arXiv:2406.05714

    stat.ML cs.LG math.ST

    A conversion theorem and minimax optimality for continuum contextual bandits

    Authors: Arya Akhavan, Karim Lounici, Massimiliano Pontil, Alexandre B. Tsybakov

    Abstract: We study the contextual continuum bandits problem, where the learner sequentially receives a side information vector and has to choose an action in a convex set, minimizing a function associated with the context. The goal is to minimize all the underlying functions for the received contexts, leading to the contextual notion of regret, which is stronger than the standard static regret. Assuming tha…

    Submitted 17 April, 2025; v1 submitted 9 June, 2024; originally announced June 2024.

  3. arXiv:2306.02159

    math.ST stat.ML

    Gradient-free optimization of highly smooth functions: improved analysis and a new algorithm

    Authors: Arya Akhavan, Evgenii Chzhen, Massimiliano Pontil, Alexandre B. Tsybakov

    Abstract: This work studies minimization problems with zero-order noisy oracle information under the assumption that the objective function is highly smooth and possibly satisfies additional properties. We consider two kinds of zero-order projected gradient descent algorithms, which differ in the form of the gradient estimator. The first algorithm uses a gradient estimator based on randomization over the…

    Submitted 3 June, 2023; originally announced June 2023.

  4. arXiv:2211.16457

    math.ST stat.ML

    Estimating the minimizer and the minimum value of a regression function under passive design

    Authors: Arya Akhavan, Davit Gogolashvili, Alexandre B. Tsybakov

    Abstract: We propose a new method for estimating the minimizer $\boldsymbol{x}^*$ and the minimum value $f^*$ of a smooth and strongly convex regression function $f$ from the observations contaminated by random noise. Our estimator $\boldsymbol{z}_n$ of the minimizer $\boldsymbol{x}^*$ is based on a version of the projected gradient descent with the gradient estimated by a regularized local polynomial algor…

    Submitted 8 October, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: 35 pages

    MSC Class: 62G05; 90C25

  5. arXiv:2206.13347

    math.ST cs.LG stat.ML

    Benign overfitting and adaptive nonparametric regression

    Authors: Julien Chhor, Suzanne Sigalla, Alexandre B. Tsybakov

    Abstract: In the nonparametric regression setting, we construct an estimator which is a continuous function interpolating the data points with high probability, while attaining minimax optimal rates under mean squared risk on the scale of Hölder classes adaptively to the unknown smoothness.

    Submitted 27 June, 2022; originally announced June 2022.

  6. arXiv:2205.13910

    math.ST stat.ML

    A gradient estimator via L1-randomization for online zero-order optimization with two point feedback

    Authors: Arya Akhavan, Evgenii Chzhen, Massimiliano Pontil, Alexandre B. Tsybakov

    Abstract: This work studies online zero-order optimization of convex and Lipschitz functions. We present a novel gradient estimator based on two function evaluations and randomization on the $\ell_1$-sphere. Considering different geometries of feasible sets and Lipschitz assumptions, we analyse the online dual averaging algorithm with our estimator in place of the usual gradient. We consider two types of assumpt…

    Submitted 20 September, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

  7. arXiv:2006.07862

    cs.LG math.OC stat.ML

    Exploiting Higher Order Smoothness in Derivative-free Optimization and Continuous Bandits

    Authors: Arya Akhavan, Massimiliano Pontil, Alexandre B. Tsybakov

    Abstract: We study the problem of zero-order optimization of a strongly convex function. The goal is to find the minimizer of the function by a sequential exploration of its values, under measurement noise. We study the impact of higher order smoothness properties of the function on the optimization error and on the cumulative regret. To solve this problem we consider a randomized approximation of the proje…

    Submitted 24 November, 2022; v1 submitted 14 June, 2020; originally announced June 2020.

  8. arXiv:2005.12225

    econ.EM stat.ME

    An alternative to synthetic control for models with many covariates under sparsity

    Authors: Marianne Bléhaut, Xavier D'Haultfoeuille, Jérémy L'Hour, Alexandre B. Tsybakov

    Abstract: The synthetic control method is an econometric tool to evaluate causal effects when only one unit is treated. While initially aimed at evaluating the effect of large-scale macroeconomic changes with very few available control units, it has increasingly been used in place of more well-known microeconometric tools in a broad range of applications, but its properties in this context are unknown. Th…

    Submitted 20 June, 2021; v1 submitted 25 May, 2020; originally announced May 2020.

    Comments: 39 pages, 3 figures

  9. arXiv:1806.09471

    stat.ML cs.LG math.ST

    Does data interpolation contradict statistical optimality?

    Authors: Mikhail Belkin, Alexander Rakhlin, Alexandre B. Tsybakov

    Abstract: We show that learning methods interpolating the training data can achieve optimal rates for the problems of nonparametric regression and prediction with square loss.

    Submitted 25 June, 2018; originally announced June 2018.

  10. arXiv:1710.10870

    math.ST stat.ME

    Sparse covariance matrix estimation in high-dimensional deconvolution

    Authors: Denis Belomestny, Mathias Trabs, Alexandre B. Tsybakov

    Abstract: We study the estimation of the covariance matrix $\Sigma$ of a $p$-dimensional normal random vector based on $n$ independent observations corrupted by additive noise. Only a general nonparametric assumption is imposed on the distribution of the noise without any sparsity constraint on its covariance matrix. In this high-dimensional semiparametric deconvolution problem, we propose spectral thresholding…

    Submitted 26 March, 2018; v1 submitted 30 October, 2017; originally announced October 2017.

    MSC Class: Primary 62H12; secondary 62F12; 62G05

  11. arXiv:1412.7216

    math.ST stat.ML

    An $\{l_1,l_2,l_{\infty}\}$-Regularization Approach to High-Dimensional Errors-in-variables Models

    Authors: Alexandre Belloni, Mathieu Rosenbaum, Alexandre B. Tsybakov

    Abstract: Several new estimation methods have been recently proposed for the linear regression model with observation error in the design. Different assumptions on the data generating process have motivated different estimators and analysis. In particular, the literature considered (1) observation errors in the design uniformly bounded by some $\bar{\delta}$, and (2) zero mean independent observation errors. Unde…

    Submitted 22 December, 2014; originally announced December 2014.

  12. arXiv:1408.0241

    math.ST stat.CO

    Linear and Conic Programming Estimators in High-Dimensional Errors-in-variables Models

    Authors: Alexandre Belloni, Mathieu Rosenbaum, Alexandre Tsybakov

    Abstract: We consider the linear regression model with observation error in the design. In this setting, we allow the number of covariates to be much larger than the sample size. Several new estimation methods have been recently introduced for this model. Indeed, the standard Lasso estimator or Dantzig selector turn out to become unreliable when only noisy regressors are available, which is quite common in…

    Submitted 3 July, 2016; v1 submitted 1 August, 2014; originally announced August 2014.

  13. arXiv:1108.5116

    math.ST stat.ME

    Sparse Estimation by Exponential Weighting

    Authors: Philippe Rigollet, Alexandre B. Tsybakov

    Abstract: Consider a regression model with fixed design and Gaussian noise where the regression function can potentially be well approximated by a function that admits a sparse representation in a given dictionary. This paper resorts to exponential weights to exploit this underlying sparsity by implementing the principle of sparsity pattern aggregation. This model selection take on sparse estimation allows…

    Submitted 7 January, 2013; v1 submitted 25 August, 2011; originally announced August 2011.

    Comments: Published at http://dx.doi.org/10.1214/12-STS393 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-STS-STS393

    Journal ref: Statistical Science 2012, Vol. 27, No. 4, 558-575

  14. arXiv:1011.6256

    math.ST stat.ML

    Nuclear norm penalization and optimal rates for noisy low rank matrix completion

    Authors: Vladimir Koltchinskii, Alexandre B. Tsybakov, Karim Lounici

    Abstract: This paper deals with the trace regression model where $n$ entries or linear combinations of entries of an unknown $m_1\times m_2$ matrix $A_0$ corrupted by noise are observed. We propose a new nuclear norm penalized estimator of $A_0$ and establish a general sharp oracle inequality for this estimator for arbitrary values of $n,m_1,m_2$ under the condition of isometry in expectation. Then this met…

    Submitted 23 March, 2016; v1 submitted 29 November, 2010; originally announced November 2010.

    MSC Class: 62J99; 62H12; 60B20; 60G15

  15. arXiv:0906.2885

    stat.AP stat.ME

    Noisy Independent Factor Analysis Model for Density Estimation and Classification

    Authors: Umberto Amato, Anestis Antoniadis, Alexander Samarov, Alexander Tsybakov

    Abstract: We consider the problem of multivariate density estimation when the unknown density is assumed to follow a particular form of dimensionality reduction, a noisy independent factor analysis (IFA) model. In this model the data are generated by a number of latent independent components having unknown distributions and are observed in Gaussian noise. We do not assume that either the number of compone…

    Submitted 16 June, 2009; originally announced June 2009.

  16. arXiv:0903.1468

    stat.ML math.ST

    Taking Advantage of Sparsity in Multi-Task Learning

    Authors: Karim Lounici, Massimiliano Pontil, Alexandre B. Tsybakov, Sara van de Geer

    Abstract: We study the problem of estimating multiple linear regression equations for the purpose of both prediction and variable selection. Following recent work on multi-task learning (Argyriou et al. [2008]), we assume that the regression vectors share the same sparsity pattern. This means that the set of relevant predictor variables is the same across the different equations. This assumption leads us to…

    Submitted 8 March, 2009; originally announced March 2009.

    Journal ref: 10 pages, 1 figure, Proc. Computational Learning Theory Conference (COLT 2009)

  17. Sparse Regression Learning by Aggregation and Langevin Monte-Carlo

    Authors: Arnak Dalalyan, Alexandre B. Tsybakov

    Abstract: We consider the problem of regression learning for deterministic design and independent random errors. We start by proving a sharp PAC-Bayesian type bound for the exponentially weighted aggregate (EWA) under the expected squared empirical loss. For a broad class of noise distributions the presented bound is valid whenever the temperature parameter $\beta$ of the EWA is larger than or equal to…

    Submitted 16 February, 2010; v1 submitted 6 March, 2009; originally announced March 2009.

    Comments: Short version published in COLT 2009

    Journal ref: Journal of Computer and System Sciences 78 (2012) 1423-1443

  18. arXiv:0901.2044

    math.ST stat.ML

    SPADES and mixture models

    Authors: Florentina Bunea, Alexandre B. Tsybakov, Marten H. Wegkamp, Adrian Barbu

    Abstract: This paper studies sparse density estimation via $\ell_1$ penalization (SPADES). We focus on estimation in high-dimensional mixture models and nonparametric adaptive density estimation. We show, respectively, that SPADES can recover, with high probability, the unknown components of a mixture of probability densities and that it yields minimax adaptive density estimates. These results are based on…

    Submitted 21 October, 2010; v1 submitted 14 January, 2009; originally announced January 2009.

    Comments: Published at http://dx.doi.org/10.1214/09-AOS790 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS790

    Journal ref: Annals of Statistics 2010, Vol. 38, No. 4, 2525-2558