-
Adaptive nonparametric estimation of a component density in a two-class mixture model
Authors:
Gaelle Chagny,
Antoine Channarond,
Van Ha Hoang,
Angelina Roche
Abstract:
A two-class mixture model, where the density of one of the components is known, is considered. We address the issue of the nonparametric adaptive estimation of the unknown probability density of the second component. We propose a randomly weighted kernel estimator with a fully data-driven bandwidth selection method, in the spirit of the Goldenshluger and Lepski method. An oracle-type inequality for the pointwise quadratic risk is derived as well as convergence rates over Hölder smoothness classes. The theoretical results are illustrated by numerical simulations.
Submitted 5 February, 2021; v1 submitted 30 July, 2020;
originally announced July 2020.
-
On the Estimation of Latent Distances Using Graph Distances
Authors:
Ery Arias-Castro,
Antoine Channarond,
Bruno Pelletier,
Nicolas Verzelen
Abstract:
We are given the adjacency matrix of a geometric graph and the task of recovering the latent positions. We study one of the most popular approaches, which uses the graph distances, and derive error bounds under various assumptions on the link function. In the simplest case where the link function is proportional to an indicator function, the bound matches an information lower bound that we derive.
Submitted 11 August, 2020; v1 submitted 27 April, 2018;
originally announced April 2018.
-
Fast and Consistent Algorithm for the Latent Block Model
Authors:
Vincent Brault,
Antoine Channarond
Abstract:
The latent block model is used to simultaneously rank the rows and columns of a matrix to reveal a block structure. The algorithms used for estimation are often time-consuming. However, recent work shows that the log-likelihood ratios are equivalent under the complete and observed (with unknown labels) models, and that the posterior distribution of the groups converges, as the size of the data increases, to a Dirac mass located at the actual group configuration. Based on these observations, the Largest Gaps algorithm is proposed in this paper to perform clustering using only the marginals of the matrix, in the case of binary data, when the number of blocks is very small relative to the size of the whole matrix. In addition, a model selection method is incorporated, with a proof of its consistency. Thus, this paper shows that applying complex algorithms to simple configurations (few blocks compared to the size of the matrix, or very contrasting blocks) is unnecessary, since the marginals already give very good parameter and classification estimates.
Submitted 9 March, 2023; v1 submitted 27 October, 2016;
originally announced October 2016.
-
Classification and estimation in the Stochastic Block Model based on the empirical degrees
Authors:
Antoine Channarond,
Jean-Jacques Daudin,
Stéphane Robin
Abstract:
The Stochastic Block Model (Holland et al., 1983) is a mixture model for heterogeneous network data. Unlike in the usual statistical framework, new nodes give additional information about the previous ones in this model. As a result, the distribution of the degrees concentrates at points conditionally on the node class. We show under a mild assumption that classification, estimation and model selection can actually be achieved with no more than the empirical degree data. We provide an algorithm able to process very large networks, and consistent estimators based on it. In particular, we prove a bound on the probability of misclassifying at least one node, including when the number of classes grows.
Submitted 29 October, 2011;
originally announced October 2011.