Search | arXiv e-print repository

doi 10.1214/18-EJS1479

A Quasi-Bayesian Perspective to Online Clustering

Authors: Le Li, Benjamin Guedj, Sébastien Loustau

Abstract: When faced with high frequency streams of data, clustering raises theoretical and algorithmic pitfalls. We introduce a new and adaptive online clustering algorithm relying on a quasi-Bayesian approach, with a dynamic (i.e., time-dependent) estimation of the (unknown and changing) number of clusters. We prove that our approach is supported by minimax regret bounds. We also provide an RJMCMC-flavored implementation (called PACBO, see https://cran.r-project.org/web/packages/PACBO/index.html) for which we give a convergence guarantee. Finally, numerical experiments illustrate the potential of our procedure.

Submitted 25 May, 2018; v1 submitted 1 February, 2016; originally announced February 2016.

Journal ref: Electronic Journal of Statistics (2018), vol. 12(2), 3071--3113

arXiv:1401.6882

doi 10.1214/15-AOS1318

Bandwidth selection in kernel empirical risk minimization via the gradient

Authors: Michaël Chichignoud, Sébastien Loustau

Abstract: In this paper, we deal with the data-driven selection of multidimensional and possibly anisotropic bandwidths in the general framework of kernel empirical risk minimization. We propose a universal selection rule, which leads to optimal adaptive results in a large variety of statistical models such as nonparametric robust regression and statistical learning with errors in variables. These results are stated in the context of smooth loss functions, where the gradient of the risk appears as a good criterion to measure the performance of our estimators. The selection rule consists of a comparison of gradient empirical risks. It can be viewed as a nontrivial improvement of the so-called Goldenshluger-Lepski method to nonlinear estimators. Furthermore, one main advantage of our selection rule is the nondependency on the Hessian matrix of the risk, usually involved in standard adaptive procedures.

Submitted 18 August, 2015; v1 submitted 27 January, 2014; originally announced January 2014.

Comments: Published at http://dx.doi.org/10.1214/15-AOS1318 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS1318

Journal ref: Annals of Statistics 2015, Vol. 43, No. 4, 1617-1646

arXiv:1307.3369

Noisy classification with boundary assumptions

Authors: Sébastien Loustau, Clément Marteau

Abstract: We address the problem of classification when data are collected from two samples with measurement errors. This problem turns out to be an inverse problem and requires a specific treatment. In this context, we investigate the minimax rates of convergence using both a margin assumption and a smoothness condition on the boundary of the set associated with the Bayes classifier. We establish lower and upper bounds (based on a deconvolution classifier) on these rates.

Submitted 12 July, 2013; originally announced July 2013.

Comments: arXiv admin note: substantial text overlap with arXiv:1201.3283

arXiv:1306.2194

Adaptive Noisy Clustering

Authors: Michael Chichignoud, Sébastien Loustau

Abstract: The problem of adaptive noisy clustering is investigated. Given a set of noisy observations $Z_i=X_i+ε_i$, $i=1,...,n$, the goal is to design clusters associated with the law of the $X_i$'s, with unknown density $f$ with respect to the Lebesgue measure. Since we observe a corrupted sample, a direct approach such as the popular $k$-means is not suitable in this case. In this paper, we propose a noisy $k$-means minimization, which is based on the $k$-means loss function and a deconvolution estimator of the density $f$. In particular, this approach suffers from the dependence on a bandwidth involved in the deconvolution kernel. Fast rates of convergence for the excess risk are proposed for a particular choice of the bandwidth, which depends on the smoothness of the density $f$. We then turn to the main issue of the paper: the data-driven choice of the bandwidth. We state an adaptive upper bound for a new selection rule, called ERC (Empirical Risk Comparison). This selection rule is based on Lepski's principle, where empirical risks associated with different bandwidths are compared. Finally, we illustrate that this adaptive rule can be used in many statistical problems of $M$-estimation where the empirical risk depends on a nuisance parameter.

Submitted 10 June, 2013; originally announced June 2013.

Comments: 22 pages

arXiv:1305.0630

Anisotropic oracle inequalities in noisy quantization

Authors: Sébastien Loustau

Abstract: The effect of errors in variables in quantization is investigated. We prove general exact and non-exact oracle inequalities with fast rates for an empirical minimization based on a noisy sample $Z_i=X_i+ε_i,i=1,\ldots,n$, where $X_i$ are i.i.d. with density $f$ and $ε_i$ are i.i.d. with density $η$. These rates depend on the geometry of the density $f$ and the asymptotic behaviour of the characteristic function of $η$. This general study can be applied to the problem of $k$-means clustering with noisy data. For this purpose, we introduce a deconvolution $k$-means stochastic minimization which reaches fast rates of convergence under standard Pollard's regularity assumptions.

Submitted 3 May, 2013; originally announced May 2013.

Comments: 30 pages. arXiv admin note: text overlap with arXiv:1205.1417

arXiv:1205.1417

Fast rates for noisy clustering

Authors: Sébastien Loustau

Abstract: The effect of errors in variables in empirical minimization is investigated. Given a loss $l$ and a set of decision rules $\mathcal{G}$, we prove a general upper bound for an empirical minimization based on a deconvolution kernel and a noisy sample $Z_i=X_i+ε_i,i=1,...,n$. We apply this general upper bound to give the rate of convergence for the expected excess risk in noisy clustering. A recent bound from \citet{levrard} proves that this rate is $\mathcal{O}(1/n)$ in the direct case, under Pollard's regularity assumptions. Here the effect of noisy measurements gives a rate of the form $\mathcal{O}(1/n^{\frac{γ}{γ+2β}})$, where $γ$ is the Hölder regularity of the density of $X$ whereas $β$ is the degree of ill-posedness.

Submitted 7 May, 2012; originally announced May 2012.

arXiv:1201.6115

Statistical learning with indirect observations

Authors: Sébastien Loustau

Abstract: Let $(X,Y)\in\mathcal{X}\times \mathcal{Y}$ be a random couple with unknown distribution $P$. Let $\mathcal{G}$ be a class of measurable functions and $\ell$ a loss function. The problem of statistical learning deals with the estimation of the Bayes: $$g^*=\arg\min_{g\in\mathcal{G}}\mathbb{E}_P \ell(g(X),Y). $$ In this paper, we study this problem when we deal with a contaminated sample $(Z_1,Y_1),..., (Z_n,Y_n)$ of i.i.d. indirect observations. Each input $Z_i$, $i=1,...,n$ is distributed from a density $Af$, where $A$ is a known compact linear operator and $f$ is the density of the direct input $X$. We derive fast rates of convergence for empirical risk minimizers based on regularization methods, such as deconvolution kernel density estimators or spectral cut-off. These results are comparable to the existing fast rates in Koltchinskii for the direct case. It gives some insights into the effect of indirect measurements in the presence of fast rates of convergence.

Submitted 10 July, 2012; v1 submitted 30 January, 2012; originally announced January 2012.

arXiv:1201.3283

doi 10.3150/13-BEJ564

Minimax fast rates for discriminant analysis with errors in variables

Authors: Sébastien Loustau, Clément Marteau

Abstract: The effect of measurement errors in discriminant analysis is investigated. Given observations $Z=X+ε$, where $ε$ denotes a random noise, the goal is to predict the density of $X$ among two possible candidates $f$ and $g$. We suppose that we have at our disposal two learning samples. The aim is to approach the best possible decision rule $G^\star$ defined as a minimizer of the Bayes risk. In the noise-free case $(ε=0)$, minimax fast rates of convergence are well-known under the margin assumption in discriminant analysis (see \cite{mammen}) or in the more general classification framework (see \cite{tsybakov2004,AT}). In this paper we intend to establish similar results in the noisy case, i.e. when dealing with errors in variables. We prove minimax lower bounds for this problem and explain how these rates can be attained, using in particular an Empirical Risk Minimizer (ERM) method based on deconvolution kernel estimators.

Submitted 12 May, 2015; v1 submitted 16 January, 2012; originally announced January 2012.

Journal ref: Bernoulli, Bernoulli Society for Mathematical Statistics and Probability, 2015, pp.30

Showing 1–8 of 8 results for author: Loustau, S