-
Estimating the conditional density by histogram type estimators and model selection
Authors:
Mathieu Sart
Abstract:
We propose a new procedure for estimating the conditional density from independent and identically distributed data. Our procedure uses the data to select a function from among arbitrary (at most countable) collections of candidates. By using a deterministic Hellinger distance as loss, we prove that the selected function satisfies a non-asymptotic oracle-type inequality under minimal assumptions on the statistical setting. We derive an adaptive piecewise constant estimator on a random partition that achieves the expected rate of convergence over (possibly inhomogeneous and anisotropic) Besov spaces of small regularity. Moreover, we show that this oracle inequality may lead to a general model selection theorem under very mild assumptions on the statistical setting. This theorem guarantees the existence of estimators possessing nice statistical properties under various assumptions on the conditional density (such as smoothness or structural ones).
Submitted 25 October, 2016; v1 submitted 22 December, 2015;
originally announced December 2015.
-
A new method for estimation and model selection: $ρ$-estimation
Authors:
Yannick Baraud,
Lucien Birgé,
Mathieu Sart
Abstract:
The aim of this paper is to present a new estimation procedure that can be applied in many statistical frameworks, including density estimation and regression, and that leads to estimators that are both robust and optimal (or nearly optimal). In density estimation, they asymptotically coincide with the celebrated maximum likelihood estimators, at least when the statistical model is regular enough and contains the true density to estimate. For very general models of densities, including non-compact ones, these estimators are robust with respect to the Hellinger distance and converge at the optimal rate (up to a possible logarithmic factor) in all cases we know. In the regression setting, our approach improves upon classical least squares in many respects. In simple linear regression, for example, it provides an estimator of the coefficients that is both robust to outliers and rate-optimal (or nearly rate-optimal) for a large class of error distributions, including Gaussian, Laplace, Cauchy and uniform, among others.
Submitted 1 June, 2016; v1 submitted 24 March, 2014;
originally announced March 2014.
-
Robust estimation on a parametric model via testing
Authors:
Mathieu Sart
Abstract:
We are interested in the problem of robust parametric estimation of a density from $n$ i.i.d. observations. By using a practice-oriented procedure based on robust tests, we build an estimator for which we establish non-asymptotic risk bounds with respect to the Hellinger distance under mild assumptions on the parametric model. We show that the estimator is robust even for models for which the maximum likelihood method is bound to fail. A numerical simulation illustrates its robustness properties. When the model is true and regular enough, we prove that the estimator is very close to the maximum likelihood one, at least when the number of observations $n$ is large. In particular, it inherits its efficiency. Simulations show that these two estimators are almost equal with large probability, even for small values of $n$ when the model is regular enough and contains the true density.
Submitted 30 March, 2016; v1 submitted 13 August, 2013;
originally announced August 2013.
-
Estimation of the transition density of a Markov chain
Authors:
Mathieu Sart
Abstract:
We present two data-driven procedures to estimate the transition density of a homogeneous Markov chain. The first yields a piecewise constant estimator on a suitable random partition. By using a Hellinger-type loss, we establish non-asymptotic risk bounds for our estimator when the square root of the transition density belongs to possibly inhomogeneous Besov spaces with possibly small regularity index. Some simulations are also provided. The second procedure is of theoretical interest and leads to a general model selection theorem from which we derive rates of convergence over a very wide range of possibly inhomogeneous and anisotropic Besov spaces. We also investigate the rates that can be achieved under structural assumptions on the transition density.
Submitted 18 October, 2012;
originally announced October 2012.
-
Model selection for Poisson processes with covariates
Authors:
Mathieu Sart
Abstract:
We observe $n$ inhomogeneous Poisson processes with covariates and aim at estimating their intensities. We assume that the intensity of each Poisson process is of the form $s(\cdot, x)$, where $x$ is the covariate and $s$ is an unknown function. We propose a model selection approach where the models are used to approximate the multivariate function $s$. We show that our estimator satisfies an oracle-type inequality under very weak assumptions on both the intensities and the models. By using a Hellinger-type loss, we establish non-asymptotic risk bounds and specify them under several kinds of assumptions on the target function $s$, such as being smooth or a product function. Moreover, we show that our estimation procedure is robust with respect to these assumptions.
Submitted 13 June, 2013; v1 submitted 23 December, 2011;
originally announced December 2011.