-
Model-free identification in ill-posed regression
Authors:
Gianluca Finocchio,
Tatyana Krivobokova
Abstract:
The problem of parsimonious parameter identification in possibly high-dimensional linear regression with highly correlated features is addressed. This problem is formalized as the estimation of the best, in a certain sense, linear combinations of the features that are relevant to the response variable. Importantly, the dependence between the features and the response is allowed to be arbitrary. Necessary and sufficient conditions for such parsimonious identification -- referred to as statistical interpretability -- are established for a broad class of linear dimensionality reduction algorithms. Sharp bounds on their estimation errors, with high probability, are derived. To our knowledge, this is the first formal framework that enables the definition and assessment of the interpretability of a broad class of algorithms. The results are specifically applied to methods based on sparse regression, unsupervised projection and sufficient reduction. The implications of employing such methods for prediction problems are discussed in the context of the prolific literature on overparametrized methods in the regime of benign overfitting.
Submitted 2 May, 2025;
originally announced May 2025.
-
An extended latent factor framework for ill-posed linear regression
Authors:
Gianluca Finocchio,
Tatyana Krivobokova
Abstract:
The classical latent factor model for linear regression is extended by assuming that, up to an unknown orthogonal transformation, the features consist of subsets that are relevant and irrelevant for the response. Furthermore, a joint low-dimensionality is imposed only on the relevant features vector and the response variable. This framework allows for a comprehensive study of the partial-least-squares (PLS) algorithm under random design. In particular, a novel perturbation bound for PLS solutions is proven and the high-probability $L^2$-estimation rate for the PLS estimator is obtained. This novel framework also sheds light on the performance of other regularisation methods for ill-posed linear regression that exploit sparsity or unsupervised projection. The theoretical findings are confirmed by numerical studies on both real and simulated data.
Submitted 17 July, 2023;
originally announced July 2023.
-
Posterior contraction for deep Gaussian process priors
Authors:
Gianluca Finocchio,
Johannes Schmidt-Hieber
Abstract:
We study posterior contraction rates for a class of deep Gaussian process priors applied to the nonparametric regression problem under a general composition assumption on the regression function. It is shown that the contraction rates can achieve the minimax convergence rate (up to $\log n$ factors), while being adaptive to the underlying structure and smoothness of the target function. The proposed framework extends the Bayesian nonparametric theory for Gaussian process priors.
Submitted 13 August, 2022; v1 submitted 16 May, 2021;
originally announced May 2021.
-
Robust-to-outliers square-root LASSO, simultaneous inference with a MOM approach
Authors:
G. Finocchio,
A. Derumigny,
K. Proksch
Abstract:
We consider the least-squares regression problem with unknown noise variance, where the observed data points are allowed to be corrupted by outliers. Building on the median-of-means (MOM) method introduced by Lecué and Lerasle (Ann. Statist. 48(2):906-931, April 2020) in the case of known noise variance, we propose a general MOM approach for simultaneous inference of both the regression function and the noise variance, requiring only an upper bound on the noise level. Interestingly, this generalization requires care due to regularity issues that are intrinsic to the underlying convex-concave optimization problem. In the general case where the regression function belongs to a convex class, we show that our simultaneous estimator achieves with high probability the same convergence rates and a similar risk bound as if the noise level were known, as well as convergence rates for the estimated noise standard deviation.
In the high-dimensional sparse linear setting, our estimator yields a robust analog of the square-root LASSO. Under weak moment conditions, it jointly achieves with high probability the minimax rates of estimation $s^{1/p} \sqrt{(1/n) \log(p/s)}$ for the $\ell_p$-norm of the coefficient vector, and the rate $\sqrt{(s/n) \log(p/s)}$ for the estimation of the noise standard deviation. Here $n$ denotes the sample size, $p$ the dimension and $s$ the sparsity level. We finally propose an extension to the case of unknown sparsity level $s$, providing a jointly adaptive estimator $(\widetilde{\beta}, \widetilde{\sigma}, \widetilde{s})$. It simultaneously estimates the coefficient vector, the noise level and the sparsity level, with proven bounds on each of these three components that hold with high probability.
Submitted 18 March, 2021;
originally announced March 2021.
-
Bayesian variance estimation in the Gaussian sequence model with partial information on the means
Authors:
Gianluca Finocchio,
Johannes Schmidt-Hieber
Abstract:
Consider the Gaussian sequence model under the additional assumption that a fixed fraction of the means is known. We study the problem of variance estimation from a frequentist Bayesian perspective. The maximum likelihood estimator (MLE) for $\sigma^2$ is biased and inconsistent. This raises the question of whether the posterior is able to correct the MLE in this case. By developing a new proof strategy that uses refined properties of the posterior distribution, we find that the marginal posterior is inconsistent for any i.i.d. prior on the mean parameters. In particular, no assumption on the decay of the prior needs to be imposed. Surprisingly, we also find that consistency can be retained for a hierarchical prior based on Gaussian mixtures. In this case we also establish a limiting shape result and determine the limit distribution. In contrast to the classical Bernstein-von Mises theorem, the limit is non-Gaussian. We show that the Bayesian analysis leads to new statistical estimators outperforming the correctly calibrated MLE in a numerical simulation study.
Submitted 18 December, 2019; v1 submitted 9 April, 2019;
originally announced April 2019.