Sample-optimal learning of quantum states using gentle measurements

Authors: Cristina Butucea, Jan Johannes, Henning Stein

Abstract: Gentle measurements of quantum states do not entirely collapse the initial state. Instead, they provide a post-measurement state at a prescribed trace distance $α$ from the initial state together with a random variable used for quantum learning of the initial state. We introduce here the class of $α-$locally-gentle measurements ($α-$LGM) on a finite dimensional quantum system which are product measurements on product states and prove a strong quantum Data-Processing Inequality (qDPI) on this class using an improved relation between gentleness and quantum differential privacy. We further show a gentle quantum Neyman-Pearson lemma which implies that our qDPI is asymptotically optimal (for small $α$). This inequality is employed to show that the necessary number of quantum states for prescribed accuracy $ε$ is of order $1/(ε^2 α^2)$ for both quantum tomography and quantum state certification. Finally, we propose an $α-$LGM called quantum Label Switch that attains these bounds. It is a general implementable method to turn any two-outcome measurement into an $α-$LGM.

Submitted 30 May, 2025; originally announced May 2025.

MSC Class: 81P15 (Primary); 68P27 (Secondary)

arXiv:2504.00919 [pdf, ps, other]

Nonparametric spectral density estimation using interactive mechanisms under local differential privacy

Authors: Cristina Butucea, Karolina Klockmann, Tatyana Krivobokova

Abstract: We address the problem of nonparametric estimation of the spectral density for a centered stationary Gaussian time series under local differential privacy constraints. Specifically, we propose new interactive privacy mechanisms for three tasks: estimating a single covariance coefficient, estimating the spectral density at a fixed frequency, and estimating the entire spectral density function. Our approach achieves faster rates through a two-stage process: we apply first the Laplace mechanism to the truncated value and then use the former privatized sample to gain knowledge on the dependence mechanism in the time series. For spectral densities belonging to Hölder and Sobolev smoothness classes, we demonstrate that our estimators improve upon the non-interactive mechanism of Kroll (2024) for small privacy parameter $α$, since the pointwise rates depend on $nα^2$ instead of $nα^4$. Moreover, we show that the rate $(nα^4)^{-1}$ is optimal for estimating a covariance coefficient with non-interactive mechanisms. However, the $L_2$ rate of our interactive estimator is slower than the pointwise rate. We show how to use these estimators to provide a bona-fide locally differentially private covariance matrix estimator.

Submitted 1 April, 2025; originally announced April 2025.

Comments: 47 pages

arXiv:2410.05751 [pdf, ps, other]

Asymptotic Equivalence of Locally Stationary Processes and Bivariate White Noise

Authors: Cristina Butucea, Alexander Meister, Angelika Rohde

Abstract: We consider a general class of statistical experiments, in which an $n$-dimensional centered Gaussian random variable is observed and its covariance matrix is the parameter of interest. The covariance matrix is assumed to be well-approximable in a linear space of lower dimension $K_n$ with eigenvalues uniformly bounded away from zero and infinity. We prove asymptotic equivalence of this experiment and a class of $K_n$-dimensional Gaussian models with informative expectation in Le Cam's sense when $n$ tends to infinity and $K_n$ is allowed to increase moderately in $n$ at a polynomial rate. For this purpose we derive a new localization technique for non-i.i.d. data and a novel high-dimensional Central Limit Law in total variation distance. These results are key ingredients to show asymptotic equivalence between the experiments of locally stationary Gaussian time series and a bivariate Wiener process with the log spectral density as its drift. Therein a novel class of matrices is introduced which generalizes circulant Toeplitz matrices traditionally used for strictly stationary time series.

Submitted 16 January, 2025; v1 submitted 8 October, 2024; originally announced October 2024.

MSC Class: 62B15; 62M10; 60F05

arXiv:2303.04694 [pdf, other]

Two-sided Matrix Regression

Authors: Nayel Bettache, Cristina Butucea

Abstract: The two-sided matrix regression model $Y = A^*X B^* +E$ aims at predicting $Y$ by taking into account both linear links between column features of $X$, via the unknown matrix $B^*$, and also among the row features of $X$, via the matrix $A^*$. We propose low-rank predictors in this high-dimensional matrix regression model via rank-penalized and nuclear norm-penalized least squares. Both criteria are not jointly convex; however, we propose explicit predictors based on SVD and show optimal prediction bounds. We give sufficient conditions for a consistent rank selector. We also propose a fully data-driven rank-adaptive procedure. Simulation results confirm the good prediction and the rank-consistency results under data-driven explicit choices of the tuning parameters and the scaling parameter of the noise.

Submitted 8 March, 2023; originally announced March 2023.

Comments: 21 pages, 2 figures

arXiv:2212.01169 [pdf, ps, other]

Off-the-grid prediction and testing for linear combination of translated features

Authors: Cristina Butucea, Jean-François Delmas, Anne Dutfoy, Clément Hardy

Abstract: We consider a model where a signal (discrete or continuous) is observed with an additive Gaussian noise process. The signal is issued from a linear combination of a finite but increasing number of translated features. The features are continuously parameterized by their location and depend on some scale parameter. First, we extend previous prediction results for off-the-grid estimators by taking into account here that the scale parameter may vary. The prediction bounds are analogous, but we improve the minimal distance between two consecutive features locations in order to achieve these bounds. Next, we propose a goodness-of-fit test for the model and give non-asymptotic upper bounds of the testing risk and of the minimax separation rate between two distinguishable signals. In particular, our test encompasses the signal detection framework. We deduce upper bounds on the minimal energy, expressed as the $\ell_2$-norm of the linear coefficients, to successfully detect a signal in presence of noise. The general model considered in this paper is a non-linear extension of the classical high-dimensional regression model. It turns out that, in this framework, our upper bound on the minimax separation rate matches (up to a logarithmic factor) the lower bound on the minimax separation rate for signal detection in the high-dimensional linear model associated to a fixed dictionary of features.
We also propose a procedure to test whether the features of the observed signal belong to a given finite collection under the assumption that the linear coefficients may vary, but have prescribed signs under the null hypothesis. A non-asymptotic upper bound on the testing risk is given. We illustrate our results on the spikes deconvolution model with Gaussian features on the real line and with the Dirichlet kernel, frequently used in the compressed sensing literature, on the torus.

Submitted 22 July, 2024; v1 submitted 2 December, 2022; originally announced December 2022.

arXiv:2210.16311 [pdf, other]

Simultaneous off-the-grid learning of mixtures issued from a continuous dictionary

Authors: Cristina Butucea, Jean-François Delmas, Anne Dutfoy, Clément Hardy

Abstract: In this paper we observe a set, possibly a continuum, of signals corrupted by noise. Each signal is a finite mixture of an unknown number of features belonging to a continuous dictionary. The continuous dictionary is parametrized by a real non-linear parameter. We shall assume that the signals share an underlying structure by assuming that each signal has its active features included in a finite and sparse set. We formulate a regularized optimization problem to estimate simultaneously the linear coefficients in the mixtures and the non-linear parameters of the features. The optimization problem is composed of a data fidelity term and a $(\ell_1,L^p)$-penalty. We call its solution the Group-Nonlinear-Lasso and provide high probability bounds on the prediction error using certificate functions. Following recent works on the geometry of off-the-grid methods, we show that such functions can be constructed provided the parameters of the active features are pairwise separated by a constant with respect to a Riemannian metric. When the number of signals is finite and the noise is assumed Gaussian, we give refinements of our results for $p=1$ and $p=2$ using tail bounds on suprema of Gaussian and $χ^2$ random processes. When $p=2$, our prediction error reaches the rates obtained by the Group-Lasso estimator in the multi-task linear regression model. Furthermore, for $p=2$ these prediction rates are faster than for $p=1$ when all signals share most of the non-linear parameters.

Submitted 23 February, 2024; v1 submitted 27 October, 2022; originally announced October 2022.

arXiv:2207.00171 [pdf, ps, other]

Off-the-grid learning of mixtures from a continuous dictionary

Authors: Cristina Butucea, Jean-François Delmas, Anne Dutfoy, Clément Hardy

Abstract: We consider a general non-linear model where the signal is a finite mixture of an unknown, possibly increasing, number of features issued from a continuous dictionary parameterized by a real non-linear parameter. The signal is observed with Gaussian (possibly correlated) noise in either a continuous or a discrete setup. We propose an off-the-grid optimization method, that is, a method which does not use any discretization scheme on the parameter space, to estimate both the non-linear parameters of the features and the linear parameters of the mixture. We use recent results on the geometry of off-the-grid methods to give minimal separation on the true underlying non-linear parameters such that interpolating certificate functions can be constructed. Using also tail bounds for suprema of Gaussian processes we bound the prediction error with high probability. Assuming that the certificate functions can be constructed, our prediction error bound is up to $\log$-factors similar to the rates attained by the Lasso predictor in the linear regression model. We also establish convergence rates that quantify with high probability the quality of estimation for both the linear and the non-linear parameters. We develop in full details our main results for two applications: the Gaussian spike deconvolution and the scaled exponential model.

Submitted 9 April, 2025; v1 submitted 29 June, 2022; originally announced July 2022.

arXiv:2112.15042 [pdf, other]

Variable selection, monotone likelihood ratio and group sparsity

Authors: Cristina Butucea, Enno Mammen, Mohamed Ndaoud, Alexandre B. Tsybakov

Abstract: In the pivotal variable selection problem, we derive the exact non-asymptotic minimax selector over the class of all $s$-sparse vectors, which is also the Bayes selector with respect to the uniform prior. While this optimal selector is, in general, not realizable in polynomial time, we show that its tractable counterpart (the scan selector) attains the minimax expected Hamming risk to within factor 2, and is also exact minimax with respect to the probability of wrong recovery. As a consequence, we establish explicit lower bounds under the monotone likelihood ratio property and we obtain a tight characterization of the minimax risk in terms of the best separable selector risk. We apply these general results to derive necessary and sufficient conditions of exact and almost full recovery in the location model with light tail distributions and in the problem of group variable selection under Gaussian noise.

Submitted 30 December, 2021; originally announced December 2021.

arXiv:2107.03940 [pdf, ps, other]

Locally differentially private estimation of nonlinear functionals of discrete distributions

Authors: Cristina Butucea, Yann Issartel

Abstract: We study the problem of estimating non-linear functionals of discrete distributions in the context of local differential privacy. The initial data $x_1,\ldots,x_n \in [K]$ are supposed i.i.d. and distributed according to an unknown discrete distribution $p = (p_1,\ldots,p_K)$. Only $α$-locally differentially private (LDP) samples $z_1,...,z_n$ are publicly available, where the term 'local' means that each $z_i$ is produced using one individual attribute $x_i$. We exhibit privacy mechanisms (PM) that are interactive (i.e. they are allowed to use already published confidential data) or non-interactive. We describe the behavior of the quadratic risk for estimating the power sum functional $F_γ = \sum_{k=1}^K p_k^γ$, $γ>0$ as a function of $K, \, n$ and $α$. In the non-interactive case, we study two plug-in type estimators of $F_γ$, for all $γ>0$, that are similar to the MLE analyzed by Jiao et al. (2017) in the multinomial model. However, due to the privacy constraint the rates we attain are slower and similar to those obtained in the Gaussian model by Collier et al. (2020). In the interactive case, we introduce for all $γ>1$ a two-step procedure which attains the faster parametric rate $(n α^2)^{-1/2}$ when $γ\geq 2$. We give lower bounds results over all $α$-LDP mechanisms and all estimators using the private samples.

Submitted 12 August, 2023; v1 submitted 8 July, 2021; originally announced July 2021.

arXiv:2107.02439 [pdf, ps, other]

Goodness-of-fit testing for Hölder continuous densities under local differential privacy

Authors: Amandine Dubois, Thomas Berrett, Cristina Butucea

Abstract: We address the problem of goodness-of-fit testing for Hölder continuous densities under local differential privacy constraints. We study minimax separation rates when only non-interactive privacy mechanisms are allowed to be used and when both non-interactive and sequentially interactive can be used for privatisation. We propose privacy mechanisms and associated testing procedures whose analysis enables us to obtain upper bounds on the minimax rates. These results are complemented with lower bounds. By comparing these bounds, we show that the proposed privacy mechanisms and tests are optimal up to at most a logarithmic factor for several choices of $f_0$ including densities from uniform, normal, Beta, Cauchy, Pareto, exponential distributions. In particular, we observe that the results are deteriorated in the private setting compared to the non-private one. Moreover, we show that sequentially interactive mechanisms improve upon the results obtained when considering only non-interactive privacy mechanisms.

Submitted 6 July, 2021; originally announced July 2021.

arXiv:2102.06817 [pdf, other]

Fast Non-Asymptotic Testing And Support Recovery For Large Sparse Toeplitz Covariance Matrices

Authors: Nayel Bettache, Cristina Butucea, Marianne Sorba

Abstract: We consider $n$ independent $p$-dimensional Gaussian vectors with covariance matrix having Toeplitz structure. We test that these vectors have independent components against a stationary distribution with sparse Toeplitz covariance matrix, and also select the support of non-zero entries. We assume that the non-zero values can occur in the recent past (time-lag less than $p/2$). We build test procedures that combine a sum-type and a scan-type procedure, but are computationally fast, and show their non-asymptotic behaviour in both one-sided (only positive correlations) and two-sided alternatives, respectively. We also exhibit a selector of significant lags and bound the Hamming-loss risk of the estimated support. These results can be extended to the case of nearly Toeplitz covariance structure and to sub-Gaussian vectors. Numerical results illustrate the excellent behaviour of both test procedures and support selectors: the larger the dimension $p$, the faster the rates.

Submitted 12 February, 2021; originally announced February 2021.

arXiv:2011.14881 [pdf, ps, other]

Phase transitions for support recovery under local differential privacy

Authors: Cristina Butucea, Amandine Dubois, Adrien Saumard

Abstract: We address the problem of variable selection in a high-dimensional but sparse mean model, under the additional constraint that only privatised data are available for inference. The original data are vectors with independent entries having a symmetric, strongly log-concave distribution on $\mathbb{R}$. For this purpose, we adopt a recent generalisation of classical minimax theory to the framework of local $α-$differential privacy. We provide lower and upper bounds on the rate of convergence for the expected Hamming loss over classes of at most $s$-sparse vectors whose non-zero coordinates are separated from $0$ by a constant $a>0$. As corollaries, we derive necessary and sufficient conditions (up to log factors) for exact recovery and for almost full recovery. When we restrict our attention to non-interactive mechanisms that act independently on each coordinate our lower bound shows that, contrary to the non-private setting, both exact and almost full recovery are impossible whatever the value of $a$ in the high-dimensional regime such that $n α^2/ d^2\lesssim 1$. However, in the regime $nα^2/d^2\gg \log(d)$ we can exhibit a critical value $a^*$ (up to a logarithmic factor) such that exact and almost full recovery are possible for all $a\gg a^*$ and impossible for $a\leq a^*$. We show that these results can be improved when allowing for all non-interactive (that act globally on all coordinates) locally $α-$differentially private mechanisms in the sense that phase transitions occur at lower levels.

Submitted 29 June, 2022; v1 submitted 30 November, 2020; originally announced November 2020.

MSC Class: 62G05; 62G20

arXiv:2005.12601 [pdf, ps, other]

Locally private non-asymptotic testing of discrete distributions is faster using interactive mechanisms

Authors: Thomas B. Berrett, Cristina Butucea

Abstract: We find separation rates for testing multinomial or more general discrete distributions under the constraint of local differential privacy. We construct efficient randomized algorithms and test procedures, in both the case where only non-interactive privacy mechanisms are allowed and also in the case where all sequentially interactive privacy mechanisms are allowed. The separation rates are faster in the latter case. We prove general information theoretical bounds that allow us to establish the optimality of our algorithms among all pairs of privacy mechanisms and test procedures, in most usual cases. Considered examples include testing uniform, polynomially and exponentially decreasing distributions.

Submitted 26 May, 2020; originally announced May 2020.

Comments: 31 pages

arXiv:2003.04773 [pdf, other]

Interactive versus non-interactive locally differentially private estimation: Two elbows for the quadratic functional

Authors: Cristina Butucea, Angelika Rohde, Lukas Steinberger

Abstract: Local differential privacy has recently received increasing attention from the statistics community as a valuable tool to protect the privacy of individual data owners without the need of a trusted third party. Similar to the classical notion of randomized response, the idea is that data owners randomize their true information locally and only release the perturbed data. Many different protocols for such local perturbation procedures can be designed. In most estimation problems studied in the literature so far, however, no significant difference in terms of minimax risk between purely non-interactive protocols and protocols that allow for some amount of interaction between individual data providers could be observed. In this paper we show that for estimating the integrated square of a density, sequentially interactive procedures improve substantially over the best possible non-interactive procedure in terms of minimax rate of estimation. In particular, in the non-interactive scenario we identify an elbow in the minimax rate at $s=\frac34$, whereas in the sequentially interactive scenario the elbow is at $s=\frac12$. This is markedly different from both, the case of direct observations, where the elbow is well known to be at $s=\frac14$, as well as from the case where Laplace noise is added to the original data, where an elbow at $s= \frac94$ is obtained.
We also provide adaptive estimators that achieve the optimal rate up to log-factors, we draw connections to non-parametric goodness-of-fit testing and estimation of more general integral functionals and conduct a series of numerical experiments. The fact that a particular locally differentially private, but interactive, mechanism improves over the simple non-interactive one is also of great importance for practical implementations of local differential privacy.

Submitted 1 July, 2022; v1 submitted 10 March, 2020; originally announced March 2020.

arXiv:1912.04629 [pdf, ps, other]

Classification under local differential privacy

Authors: Thomas Berrett, Cristina Butucea

Abstract: We consider the binary classification problem in a setup that preserves the privacy of the original sample. We provide a privacy mechanism that is locally differentially private and then construct a classifier based on the private sample that is universally consistent in Euclidean spaces. Under stronger assumptions, we establish the minimax rates of convergence of the excess risk and see that they are slower than in the case when the original sample is available.

Submitted 10 December, 2019; originally announced December 2019.

Comments: 12 pages

MSC Class: 62G08

arXiv:1903.01927 [pdf, ps, other]

Local differential privacy: Elbow effect in optimal density estimation and adaptation over Besov ellipsoids

Authors: Cristina Butucea, Amandine Dubois, Martin Kroll, Adrien Saumard

Abstract: We address the problem of non-parametric density estimation under the additional constraint that only privatised data are allowed to be published and available for inference. For this purpose, we adopt a recent generalisation of classical minimax theory to the framework of local $α$-differential privacy and provide a lower bound on the rate of convergence over Besov spaces $B^s_{pq}$ under mean integrated $\mathbb L^r$-risk. This lower bound is deteriorated compared to the standard setup without privacy, and reveals a twofold elbow effect. In order to fulfil the privacy requirement, we suggest adding suitably scaled Laplace noise to empirical wavelet coefficients. Upper bounds within (at most) a logarithmic factor are derived under the assumption that $α$ stays bounded as $n$ increases: A linear but non-adaptive wavelet estimator is shown to attain the lower bound whenever $p \geq r$ but provides a slower rate of convergence otherwise. An adaptive non-linear wavelet estimator with appropriately chosen smoothing parameters and thresholding is shown to attain the lower bound within a logarithmic factor for all cases.

Submitted 5 March, 2019; originally announced March 2019.

MSC Class: 62G07 (primary); 62G20 (secondary)

arXiv:1705.03445 [pdf, ps, other]

doi 10.1214/17-AOS1672

Local asymptotic equivalence of pure quantum states ensembles and quantum Gaussian white noise

Authors: Cristina Butucea, Madalin Guta, Michael Nussbaum

Abstract: Quantum technology is increasingly relying on specialised statistical inference methods for analysing quantum measurement data. This motivates the development of "quantum statistics", a field that is shaping up at the overlap of quantum physics and "classical" statistics. One of the less investigated topics to date is that of statistical inference for infinite dimensional quantum systems, which can be seen as quantum counterpart of non-parametric statistics. In this paper we analyse the asymptotic theory of quantum statistical models consisting of ensembles of quantum systems which are identically prepared in a pure state. In the limit of large ensembles we establish the local asymptotic equivalence (LAE) of this i.i.d. model to a quantum Gaussian white noise model. We use the LAE result in order to establish minimax rates for the estimation of pure states belonging to Hermite-Sobolev classes of wave functions. Moreover, for quadratic functional estimation of the same states we note an elbow effect in the rates, whereas for testing a pure state a sharp parametric rate is attained over the nonparametric Hermite-Sobolev class.

Submitted 4 May, 2023; v1 submitted 9 May, 2017; originally announced May 2017.

Journal ref: Ann. Statist. 46(6B): 3676-3706, 2018

arXiv:1604.06304 [pdf, other]

Fast adaptive estimation of log-additive exponential models in Kullback-Leibler divergence

Authors: Cristina Butucea, Jean-François Delmas, Anne Dutfoy, Richard Fischer

Abstract: We study the problem of nonparametric estimation of density functions with a product form on the domain $\triangle=\{( x_1, \ldots, x_d)\in \mathbb{R}^d, 0\leq x_1\leq \dots \leq x_d \leq 1\}$. Such densities appear in the random truncation model as the joint density function of observations. They are also obtained as maximum entropy distributions of order statistics with given marginals. We propose an estimation method based on the approximation of the logarithm of the density by a carefully chosen family of basis functions. We show that the method achieves a fast convergence rate in probability with respect to the Kullback-Leibler divergence for densities whose logarithm belongs to a Sobolev function class with known regularity. In the case when the regularity is unknown, we propose an estimation procedure using convex aggregation of the log-densities to obtain adaptability. The performance of this method is illustrated in a simulation study.

Submitted 21 April, 2016; originally announced April 2016.

Comments: 34 pages, 7 figures

MSC Class: 62G07; 62G05; 62G20

arXiv:1602.04310 [pdf, other]

Adaptive test for large covariance matrices with missing observations

Authors: Cristina Butucea, Rania Zgheib

Abstract: We observe $n$ independent $p$-dimensional Gaussian vectors with missing coordinates, that is, each value (which is assumed standardized) is observed with probability $a>0$. We investigate the problem of minimax nonparametric testing that the high-dimensional covariance matrix $Σ$ of the underlying Gaussian distribution is the identity matrix, using these partially observed vectors. Here, $n$ and $p$ tend to infinity and $a>0$ tends to 0, asymptotically. We assume that $Σ$ belongs to a Sobolev-type ellipsoid with parameter $α>0$. When $α$ is known, we give an asymptotically minimax consistent test procedure and find the minimax separation rates $\tilde \varphi_{n,p}= (a^2n \sqrt{p})^{- \frac{2 α}{4 α+1}}$, under some additional constraints on $n,\, p$ and $a$. We show that, in the particular case of Toeplitz covariance matrices, the minimax separation rates are faster, $\tilde \varphi_{n,p}= (a^2n p)^{- \frac{2 α}{4 α+1}}$. We note how the "missingness" parameter $a$ deteriorates the rates with respect to the case of fully observed vectors ($a=1$). We also propose adaptive test procedures, that are free of the parameter $α$ in some interval, and show that the loss of rate is $(\ln \ln (a^2 n\sqrt{p}))^{α/(4 α+1)}$ and $(\ln \ln (a^2 n p))^{α/(4 α+1)}$ for Toeplitz covariance matrices, respectively.

Submitted 13 February, 2016; originally announced February 2016.

MSC Class: 62G10; 62H15

arXiv:1601.05686 [pdf, ps, other]

Optimal exponential bounds for aggregation of estimators for the Kullback-Leibler loss

Authors: Cristina Butucea, Jean-François Delmas, Anne Dutfoy, Richard Fischer

Abstract: We study the problem of model selection type aggregation with respect to the Kullback-Leibler divergence for various probabilistic models. Rather than considering a convex combination of the initial estimators $f_1, \ldots, f_N$, our aggregation procedures rely on the convex combination of the logarithms of these functions. The first method is designed for probability density estimation as it gives an aggregate estimator that is also a proper density function, whereas the second method concerns spectral density estimation and has no such mass-conserving feature. We select the aggregation weights based on a penalized maximum likelihood criterion. We give sharp oracle inequalities that hold with high probability, with a remainder term that is decomposed into a bias and a variance part. We also show the optimality of the remainder terms by providing the corresponding lower bound results.

Submitted 21 January, 2016; originally announced January 2016.

Comments: 25 pages

MSC Class: 62G07; 62G05; 62M15

arXiv:1512.01832 [pdf, ps, other]

Variable selection with Hamming loss

Authors: Cristina Butucea, Mohamed Ndaoud, Natalia A. Stepanova, Alexandre B. Tsybakov

Abstract: We derive non-asymptotic bounds for the minimax risk of variable selection under expected Hamming loss in the Gaussian mean model in $\mathbb{R}^d$ for classes of $s$-sparse vectors separated from 0 by a constant $a > 0$. In some cases, we get exact expressions for the nonasymptotic minimax risk as a function of $d, s, a$ and find explicitly the minimax selectors. These results are extended to dependent or non-Gaussian observations and to the problem of crowdsourcing. Analogous conclusions are obtained for the probability of wrong recovery of the sparsity pattern. As corollaries, we derive necessary and sufficient conditions for such asymptotic properties as almost full recovery and exact recovery. Moreover, we propose data-driven selectors that provide almost full and exact recovery adaptively to the parameters of the classes.

Submitted 12 October, 2018; v1 submitted 6 December, 2015; originally announced December 2015.

arXiv:1509.02019 [pdf, ps, other]

Maximum entropy distribution of order statistics with given marginals

Authors: Cristina Butucea, Jean-François Delmas, Anne Dutfoy, Richard Fischer

Abstract: We consider distributions of ordered random vectors with given one-dimensional marginal distributions. We give an elementary necessary and sufficient condition for the existence of such a distribution with finite entropy. In this case, we give explicitly the density of the unique distribution which achieves the maximal entropy and compute the value of its entropy. This density is the unique one which has a product form on its support and the given one-dimensional marginals. The proof relies on the study of copulas with given one-dimensional marginal distributions for its order statistics.

Submitted 7 September, 2015; originally announced September 2015.

Comments: 35 pages, overview of the notations at page 33

MSC Class: 62H05; 60E15; 62G30; 94A17

arXiv:1508.06660 [pdf, ps, other]

Adaptive variable selection in nonparametric sparse additive models

Authors: Cristina Butucea, Natalia Stepanova

Abstract: We consider the problem of recovery of an unknown multivariate signal $f$ observed in a $d$-dimensional Gaussian white noise model of intensity $\varepsilon$. We assume that $f$ belongs to a class of smooth functions ${\cal F}^d\subset L_2([0,1]^d)$ and has an additive sparse structure determined by the parameter $s$, the number of non-zero univariate components contributing to $f$. We are interested in the case when $d=d_\varepsilon \to \infty$ as $\varepsilon \to 0$ and the parameter $s$ stays "small" relative to $d$. With these assumptions, the recovery problem at hand becomes that of determining which sparse additive components are non-zero. Attempting to reconstruct most non-zero components of $f$, but not all of them, we arrive at the problem of almost full variable selection in high-dimensional regression. For two different choices of ${\cal F}^d$, we establish conditions under which almost full variable selection is possible, and provide a procedure that gives almost full variable selection. The procedure does the best (in the asymptotically minimax sense) in selecting most non-zero components of $f$. Moreover, it is adaptive in the parameter $s$.

Submitted 26 August, 2015; originally announced August 2015.

MSC Class: 62G08; 62G20

arXiv:1506.01557 [pdf, other]

Sharp minimax tests for large Toeplitz covariance matrices with repeated observations

Authors: Cristina Butucea, Rania Zgheib

Abstract: We observe a sample of $n$ independent $p$-dimensional Gaussian vectors with Toeplitz covariance matrix $ Σ= [σ_{|i-j|}]_{1 \leq i,j \leq p}$ and $σ_0=1$. We consider the problem of testing the hypothesis that $Σ$ is the identity matrix asymptotically when $n \to \infty$ and $p \to \infty$. We suppose that the covariances $σ_k$ decrease either polynomially ($\sum_{k \geq 1} k^{2α} σ^2_{k} \leq L$ for $ α>1/4$ and $L>0$) or exponentially ($\sum_{k \geq 1} e^{2Ak} σ^2_{k} \leq L$ for $ A,L>0$). We consider a test procedure based on a weighted U-statistic of order 2, with optimal weights chosen as solution of an extremal problem. We give the asymptotic normality of the test statistic under the null hypothesis for fixed $n$ and $p \to + \infty$ and the asymptotic behavior of the type I error probability of our test procedure. We also show that the maximal type II error probability either tends to $0$ or is bounded from above. In the latter case, the upper bound is given using the asymptotic normality of our test statistic under alternatives close to the separation boundary. Our assumptions imply mild conditions: $n=o(p^{2α- 1/2})$ (in the polynomial case), $n=o(e^p)$ (in the exponential case). We prove both rate optimality and sharp optimality of our results, for $α>1$ in the polynomial case and for any $A>0$ in the exponential case. A simulation study illustrates the good behavior of our procedure, in particular for small $n$, large $p$.

Submitted 4 June, 2015; originally announced June 2015.

arXiv:1504.08295 [pdf, other]

doi 10.1088/1367-2630/17/11/113050

Spectral thresholding quantum tomography for low rank states

Authors: Cristina Butucea, Madalin Guta, Theodore Kypraios

Abstract: The estimation of high dimensional quantum states is an important statistical problem arising in current quantum technology applications. A key example is the tomography of multiple ions states, employed in the validation of state preparation in ion trap experiments \cite{Haffner2005}. Since full tomography becomes unfeasible even for a small number of ions, there is a need to investigate lower dimensional statistical models which capture prior information about the state, and to devise estimation methods tailored to such models. In this paper we propose several new methods aimed at the efficient estimation of low rank states in multiple ions tomography. All methods consist in first computing the least squares estimator, followed by its truncation to an appropriately chosen smaller rank. The latter is done by setting eigenvalues below a certain "noise level" to zero, while keeping the rest unchanged, or normalising them appropriately. We show that (up to logarithmic factors in the space dimension) the mean square error of the resulting estimators scales as $r\cdot d/N$ where $r$ is the rank, $d=2^k$ is the dimension of the Hilbert space, and $N$ is the number of quantum samples. Furthermore we establish a lower bound for the asymptotic minimax risk which shows that the above scaling is optimal. The performance of the estimators is analysed in an extensive simulation study, with emphasis on the dependence on the state rank, and the number of measurement repetitions. We find that all estimators perform significantly better than the least squares, with the "physical estimator" (which is a bona fide density matrix) slightly outperforming the other estimators.

Submitted 30 April, 2015; originally announced April 2015.

Comments: 35 pages, 19 figures
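
The truncation step described in the abstract above (zero the eigenvalues below a noise level, then renormalise the rest) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `threshold_spectrum` and the parameter `noise_level` are hypothetical, the input is assumed to be an already computed least squares estimate, and the abstract does not say how the noise level is chosen, so it is passed in by hand here.

```python
import numpy as np


def threshold_spectrum(rho_ls, noise_level):
    """Zero eigenvalues below noise_level and rescale the rest to unit trace.

    rho_ls      -- square array, assumed to be a least squares estimate
    noise_level -- hypothetical threshold, chosen by the caller
    Returns the truncated estimate and the number of eigenvalues kept.
    """
    # Symmetrise so that eigh receives an exactly Hermitian matrix.
    herm = (rho_ls + rho_ls.conj().T) / 2
    eigvals, eigvecs = np.linalg.eigh(herm)

    # Keep eigenvalues at or above the threshold, set the others to zero.
    kept = np.where(eigvals >= noise_level, eigvals, 0.0)
    total = kept.sum()
    if total <= 0:
        raise ValueError("noise_level removes every eigenvalue")

    # Normalising variant: rescale the surviving eigenvalues to sum to one.
    kept = kept / total

    # Rebuild the matrix as sum_j kept[j] * v_j v_j^dagger.
    rho_hat = (eigvecs * kept) @ eigvecs.conj().T
    return rho_hat, int(np.count_nonzero(kept))


if __name__ == "__main__":
    # Toy input: a diagonal "estimate" with one slightly negative eigenvalue.
    toy = np.diag([0.9, 0.08, 0.03, -0.01]).astype(complex)
    rho_hat, rank = threshold_spectrum(toy, noise_level=0.05)
    print(rank, np.trace(rho_hat).real)
```

With a non-negative threshold the output has non-negative eigenvalues and unit trace, so it is a valid density matrix; the variant that keeps the surviving eigenvalues unchanged would simply skip the rescaling line.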

arXiv:1409.1429 [pdf, other]

Sharp minimax tests for large covariance matrices and adaptation

Authors: Cristina Butucea, Rania Zgheib

Abstract: We consider the detection problem of correlations in a $p$-dimensional Gaussian vector, when we observe $n$ independent, identically distributed random vectors, for $n$ and $p$ large. We assume that the covariance matrix varies in some ellipsoid with parameter $α>1/2$ and total energy bounded by $L>0$. We propose a test procedure based on a U-statistic of order 2 which is weighted in an optimal way. The weights are the solution of an optimization problem, they are constant on each diagonal and non-null only for the $T$ first diagonals, where $T=o(p)$. We show that this test statistic is asymptotically Gaussian distributed under the null hypothesis and also under the alternative hypothesis for matrices close to the detection boundary. We prove upper bounds for the total error probability of our test procedure, for $α>1/2$ and under the assumption $T=o(p)$ which implies that $n=o(p^{ 2 α})$. We illustrate via a numerical study the behavior of our test procedure. Moreover, we prove lower bounds for the maximal type II error and the total error probabilities. Thus we obtain the asymptotic and the sharp asymptotically minimax separation rate $\tilde{\varphi} = (C(α, L) n^2 p )^{- α/(4 α+ 1)}$, for $α>3/2$ and for $α>1$ together with the additional assumption $p= o(n^{4 α-1})$, respectively. We deduce rate asymptotic minimax results for testing the inverse of the covariance matrix. We construct an adaptive test procedure with respect to the parameter $α$ and show that it attains the rate $\tilde{ψ}= ( n^2 p / \ln\ln(n \sqrt{p}) )^{- α/(4 α+ 1)}$.

Submitted 26 January, 2016; v1 submitted 4 September, 2014; originally announced September 2014.

MSC Class: 62G10; 62H15; 62G20

arXiv:1402.2243 [pdf, other]

Semiparametric topographical mixture models with symmetric errors

Authors: Cristina Butucea, Rodrigue Ngueyep Tzoumpe, Pierre Vandekerkhove

Abstract: Motivated by the analysis of Positron Emission Tomography (PET) imaging data considered in Bowen et al. (2012), we introduce a semiparametric topographical mixture model able to capture the characteristics of dichotomous shifted response-type experiments. We propose a local estimation procedure, based on the symmetry of the local noise, for the proportion and location functions involved in the proposed model. We establish under mild conditions the minimax properties and asymptotic normality of our estimators, while Monte Carlo simulations are conducted to examine their finite sample performance. Finally, a statistical analysis of the PET imaging data in Bowen et al. (2012) illustrates the proposed method.

Submitted 10 February, 2014; originally announced February 2014.

Comments: 19 figures

MSC Class: Primary 62G05; 62G20; secondary 62E10

arXiv:1312.5219 [pdf, ps, other]

Maximum entropy copula with given diagonal section

Authors: Cristina Butucea, Jean-François Delmas, Anne Dutfoy, Richard Fischer

Abstract: We consider copulas with a given diagonal section and compute the explicit density of the unique optimal copula which maximizes the entropy. In this sense, this copula is the least informative among the copulas with a given diagonal section. We give an explicit criterion on the diagonal section for the existence of the optimal copula and give a closed formula for its entropy. We also provide examples for some diagonal sections of usual bivariate copulas and illustrate the differences between them and the maximum entropy copula with the same diagonal section.

Submitted 18 December, 2013; originally announced December 2013.

Comments: 25 pages, 7 figures

MSC Class: 62H05; 60E05

arXiv:1303.5647 [pdf, ps, other]

Sharp Variable Selection of a Sparse Submatrix in a High-Dimensional Noisy Matrix

Authors: Cristina Butucea, Yuri I. Ingster, Irina Suslina

Abstract: We observe a $N\times M$ matrix of independent, identically distributed Gaussian random variables which are centered except for elements of some submatrix of size $n\times m$ where the mean is larger than some $a>0$. The submatrix is sparse in the sense that $n/N$ and $m/M$ tend to 0, whereas $n,\, m, \, N$ and $M$ tend to infinity. We consider the problem of selecting the random variables with significantly large mean values. We give sufficient conditions on $a$ as a function of $n,\, m,\,N$ and $M$ and construct a uniformly consistent procedure in order to do sharp variable selection. We also prove the minimax lower bounds under necessary conditions which are complementary to the previous conditions. The critical values $a^*$ separating the necessary and sufficient conditions are sharp (we show exact constants). We note a gap between the critical values $a^*$ for selection of variables and that of detecting that such a submatrix exists given by Butucea and Ingster (2012). When $a^*$ is in this gap, consistent detection is possible but no consistent selector of the corresponding variables can be found.

Submitted 22 March, 2013; originally announced March 2013.

MSC Class: 62C20; 62G05; 62G20

arXiv:1301.4660 [pdf, ps, other]

Sharp detection of smooth signals in a high-dimensional sparse matrix with indirect observations

Authors: Cristina Butucea, Ghislaine Gayraud

Abstract: We consider a matrix-valued Gaussian sequence model, that is, we observe a sequence of high-dimensional $M \times N$ matrices of heterogeneous Gaussian random variables $x_{ij,k}$ for $i \in\{1,...,M\}$, $j \in \{1,...,N\}$ and $k \in \mathbb{Z}$. The standard deviation of our observations is $\varepsilon k^s$ for some $\varepsilon >0$ and $s \geq 0$. We give sharp rates for the detection of a sparse submatrix of size $m \times n$ with active components. A component $(i,j)$ is said to be active if the sequence $\{x_{ij,k}\}_k$ has mean $\{θ_{ij,k}\}_k$ within a Sobolev ellipsoid of smoothness $τ>0$ and total energy $\sum_k θ^2_{ij,k} $ larger than some $r^2_\varepsilon$. Our rates involve relationships between $m,\, n, \, M$ and $N$ tending to infinity such that $m/M$, $n/N$ and $\varepsilon$ tend to 0, such that a test procedure that we construct has asymptotic minimax risk tending to 0. We prove corresponding lower bounds under additional assumptions on the relative size of the submatrix in the large matrix of observations. Except for these additional conditions our rates are asymptotically sharp. Lower bounds for hypothesis testing problems mean that no test procedure can distinguish between the null hypothesis (no signal) and the alternative, i.e. the minimax risk for testing tends to 1.

Submitted 20 January, 2013; originally announced January 2013.

MSC Class: 62H15; 60G15; 62G10; 62G20; 60C20

arXiv:1206.1711 [pdf, ps, other]

doi 10.1103/PhysRevA.88.032113

Rank penalized estimation of a quantum system

Authors: Pierre Alquier, Cristina Butucea, Mohamed Hebiri, Katia Meziani, Morimae Tomoyuki

Abstract: We introduce a new method to reconstruct the density matrix $ρ$ of a system of $n$-qubits and estimate its rank $d$ from data obtained by quantum state tomography measurements repeated $m$ times. The procedure consists in minimizing the risk of a linear estimator $\hat{ρ}$ of $ρ$ penalized by given rank (from 1 to $2^n$), where $\hat{ρ}$ is previously obtained by the moment method. We obtain simultaneously an estimator of the rank and the resulting density matrix associated to this rank. We establish an upper bound for the error of the penalized estimator, evaluated with the Frobenius norm, which is of order $dn(4/3)^n /m$ and consistency for the estimator of the rank. The proposed methodology is computationally efficient and is illustrated with some example states and real experimental data sets.

Submitted 26 September, 2013; v1 submitted 8 June, 2012; originally announced June 2012.

arXiv:1111.2247 [pdf, other]

Semiparametric mixtures of symmetric distributions

Authors: Cristina Butucea, Pierre Vandekerkhove

Abstract: We consider in this paper the semiparametric mixture of two distributions equal up to a shift parameter. The model is said to be semiparametric in the sense that the mixed distribution is not supposed to belong to a parametric family. In order to ensure the identifiability of the model it is assumed that the mixed distribution is symmetric, the model being then defined by the mixing proportion, two location parameters, and the probability density function of the mixed distribution. We propose a new class of M-estimators of these parameters based on a Fourier approach, and prove that they are square root consistent under mild regularity conditions. Their finite-sample properties are illustrated by a Monte Carlo study and a benchmark real dataset is also studied with our method.

Submitted 9 November, 2011; originally announced November 2011.

arXiv:1109.0898 [pdf, ps, other]

doi 10.3150/12-BEJ470

Detection of a sparse submatrix of a high-dimensional noisy matrix

Authors: Cristina Butucea, Yuri I. Ingster

Abstract: We observe a $N\times M$ matrix $Y_{ij}=s_{ij}+ξ_{ij}$ with $ξ_{ij}\sim {\mathcal {N}}(0,1)$ i.i.d. in $i,j$, and $s_{ij}\in \mathbb {R}$. We test the null hypothesis $s_{ij}=0$ for all $i,j$ against the alternative that there exists some submatrix of size $n\times m$ with significant elements in the sense that $s_{ij}\ge a>0$. We propose a test procedure and compute the asymptotical detection boundary $a$ so that the maximal testing risk tends to 0 as $M\to\infty$, $N\to\infty$, $p=n/N\to0$, $q=m/M\to0$. We prove that this boundary is asymptotically sharp minimax under some additional constraints. Relations with other testing problems are discussed. We propose a testing procedure which adapts to unknown $(n,m)$ within some given set and compute the adaptive sharp rates. The implementation of our test procedure on synthetic data shows excellent behavior for sparse, not necessarily square matrices. We extend our sharp minimax results in different directions: first, to Gaussian matrices with unknown variance, next, to matrices of random variables having a distribution from an exponential family (non-Gaussian) and, finally, to a two-sided alternative for matrices with Gaussian elements.

Submitted 19 December, 2013; v1 submitted 5 September, 2011; originally announced September 2011.

Comments: Published at http://dx.doi.org/10.3150/12-BEJ470 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

Report number: IMS-BEJ-BEJ470

Journal ref: Bernoulli 2013, Vol. 19, No. 5B, 2652-2688

arXiv:1004.2452 [pdf, ps, other]

doi 10.1063/1.3476776

Quantum U-statistics

Authors: Madalin Guta, Cristina Butucea

Abstract: The notion of a $U$-statistic for an $n$-tuple of identical quantum systems is introduced in analogy to the classical (commutative) case: given a selfadjoint `kernel' $K$ acting on $(\mathbb{C}^{d})^{\otimes r}$ with $r<n$, we define the symmetric operator $U_{n}= {n \choose r} \sum_β K^{(β)}$ with $K^{(β)}$ being the kernel acting on the subset $β$ of $\{1,\dots ,n\}$. If the systems are prepared in the i.i.d. state $ρ^{\otimes n}$ it is shown that the sequence of properly normalised $U$-statistics converges in moments to a linear combination of Hermite polynomials in canonical variables of a CCR algebra defined through the Quantum Central Limit Theorem. In the special cases of non-degenerate kernels and kernels of order $2$ it is shown that the convergence holds in the stronger distribution sense. Two types of applications in quantum statistics are described: testing beyond the two simple hypotheses scenario, and quantum metrology with interacting hamiltonians.

Submitted 4 May, 2010; v1 submitted 14 April, 2010; originally announced April 2010.

Comments: 30 pages, added section on quantum metrology

Journal ref: J. Math. Phys. 51, 102202 (2010)

arXiv:0902.2309 [pdf, ps, other]

Quadratic functional estimation in inverse problems

Authors: Cristina Butucea, Katia Méziani

Abstract: We consider in this paper a Gaussian sequence model of observations $Y_i$, $i\geq 1$ having mean (or signal) $θ_i$ and variance $σ_i$ which is growing polynomially like $i^γ$, $γ>0$. This model describes a large panel of inverse problems. We estimate the quadratic functional of the unknown signal $\sum_{i\geq 1}θ_i^2$ when the signal belongs to ellipsoids of both finite smoothness functions (polynomial weights $i^α$, $α>0$) and infinite smoothness (exponential weights $e^{βi^r}$, $β>0$, $0<r \leq 2$). We propose a Pinsker type projection estimator in each case and study its quadratic risk. When the signal is sufficiently smoother than the difficulty of the inverse problem ($α>γ+1/4$ or in the case of exponential weights), we obtain the parametric rate and the efficiency constant associated to it. Moreover, we give upper bounds of the second order term in the risk and conjecture that they are asymptotically sharp minimax. When the signal is finitely smooth with $α\leq γ+1/4$, we compute nonparametric upper bounds of the risk and we presume also that the constant is asymptotically sharp.

Submitted 13 February, 2009; originally announced February 2009.

MSC Class: 62F12; 62G05; 62G10; 62G20

arXiv:0902.1443 [pdf, ps, other]

doi 10.3150/08-BEJ146

Adaptive estimation of linear functionals in the convolution model and applications

Authors: C. Butucea, F. Comte

Abstract: We consider the model $Z_i=X_i+\varepsilon_i$, for i.i.d. $X_i$'s and $\varepsilon_i$'s and independent sequences $(X_i)_{i\in{\mathbb{N}}}$ and $(\varepsilon_i)_{i\in{\mathbb{N}}}$. The density $f_{\varepsilon}$ of $\varepsilon_1$ is assumed to be known, whereas the one of $X_1$, denoted by $g$, is unknown. Our aim is to estimate linear functionals of $g$, $<ψ,g>$ for a known function $ψ$. We propose a general estimator of $<ψ,g>$ and study the rate of convergence of its quadratic risk as a function of the smoothness of $g$, $f_{\varepsilon}$ and $ψ$. Different contexts with dependent data, such as stochastic volatility and AutoRegressive Conditionally Heteroskedastic models, are also considered. An estimator which is adaptive to the smoothness of unknown $g$ is then proposed, following a method studied by Laurent et al. (Preprint (2006)) in the Gaussian white noise model. We give upper bounds and asymptotic lower bounds of the quadratic risk of this estimator. The results are applied to adaptive pointwise deconvolution, in which context losses in the adaptive rates are shown to be optimal in the minimax sense. They are also applied in the context of the stochastic volatility model.

Submitted 9 February, 2009; originally announced February 2009.

Comments: Published at http://dx.doi.org/10.3150/08-BEJ146 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

Report number: IMS-BEJ-BEJ146

Journal ref: Bernoulli 2009, Vol. 15, No. 1, 69-98

arXiv:0804.2434 [pdf, ps, other]

doi 10.1088/0266-5611/25/1/015003

State estimation in quantum homodyne tomography with noisy data

Authors: Jean-Marie Aubry, Cristina Butucea, Katia Méziani

Abstract: In the framework of noisy quantum homodyne tomography with efficiency parameter $0 < η\leq 1$, we propose two estimators of a quantum state whose density matrix elements $ρ_{m,n}$ decrease like $e^{-B(m+n)^{r/ 2}}$, for fixed known $B>0$ and $0<r\leq 2$. The first procedure estimates the matrix coefficients by a projection method on the pattern functions (that we introduce here for $0<η\leq 1/2$), the second procedure is a kernel estimator of the associated Wigner function. We compute the convergence rates of these estimators, in $\mathbb{L}_2$ risk.

Submitted 26 July, 2008; v1 submitted 15 April, 2008; originally announced April 2008.

MSC Class: 62G05; 62G20; 81V80

arXiv:0804.1056 [pdf, ps, other]

doi 10.1214/08-EJS225

Adaptivity in convolution models with partially known noise distribution

Authors: Cristina Butucea, Catherine Matias, Christophe Pouet

Abstract: We consider a semiparametric convolution model. We observe random variables having a distribution given by the convolution of some unknown density $f$ and some partially known noise density $g$. In this work, $g$ is assumed exponentially smooth with stable law having unknown self-similarity index $s$. In order to ensure identifiability of the model, we restrict our attention to polynomially smooth, Sobolev-type densities $f$, with smoothness parameter $β$. In this context, we first provide a consistent estimation procedure for $s$. This estimator is then plugged into three different procedures: estimation of the unknown density $f$, of the functional $\int f^2$ and goodness-of-fit test of the hypothesis $H_0:f=f_0$, where the alternative $H_1$ is expressed with respect to $\mathbb {L}_2$-norm (i.e. has the form $ψ_n^{-2}\|f-f_0\|_2^2\ge \mathcal{C}$). These procedures are adaptive with respect to both $s$ and $β$ and attain the rates which are known optimal for known values of $s$ and $β$. As a by-product, when the noise density is known and exponentially smooth our testing procedure is optimal adaptive for testing Sobolev-type densities. The estimating procedure of $s$ is illustrated on synthetic data.

Submitted 3 October, 2008; v1 submitted 7 April, 2008; originally announced April 2008.

Comments: Published at http://dx.doi.org/10.1214/08-EJS225 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-EJS-EJS_2008_225

MSC Class: 62F12; 62G05 (Primary) 62G10; 62G20 (Secondary)

Journal ref: Electronic Journal of Statistics 2008, Vol. 2, 897-915

arXiv:0711.0807 [pdf, ps, other]

doi 10.1214/07-EJS079

Functional approach for excess mass estimation in the density model

Authors: Cristina Butucea, Mathilde Mougeot, Karine Tribouley

Abstract: We consider a multivariate density model where we estimate the excess mass of the unknown probability density $f$ at a given level $ν>0$ from $n$ i.i.d. observed random variables. This problem has several applications such as multimodality testing, density contour clustering, anomaly detection, classification and so on. For the first time in the literature we estimate the excess mass as an integrated functional of the unknown density $f$. We suggest an estimator and evaluate its rate of convergence, when $f$ belongs to general Besov smoothness classes, for several risk measures. Particular care is devoted to the implementation and numerical study of the procedure. It appears that our procedure improves the plug-in estimator of the excess mass.

Submitted 6 November, 2007; originally announced November 2007.

Comments: Published at http://dx.doi.org/10.1214/07-EJS079 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-EJS-EJS_2007_79

MSC Class: 62G05; 62G20; 62H12 (Primary); 62C20 (Secondary)

Journal ref: Electronic Journal of Statistics 2007, Vol. 1, 449-472

arXiv:math/0701290 [pdf, ps, other]

Adaptive procedures in convolution models with known or partially known noise distribution

Authors: Cristina Butucea, Catherine Matias, Christophe Pouet

Abstract: In a convolution model, we observe random variables whose distribution is the convolution of some unknown density f and some known or partially known noise density g. In this paper, we focus on statistical procedures, which are adaptive with respect to the smoothness parameter tau of unknown density f, and also (in some cases) to some unknown parameter of the noise density g. In a first part, we assume that g is known and polynomially smooth. We provide goodness-of-fit procedures for the test H_0:f=f_0, where the alternative H_1 is expressed with respect to L_2-norm. Our adaptive (w.r.t. tau) procedure behaves differently according to whether f_0 is polynomially or exponentially smooth. A payment for adaptation is noted in both cases and for computing this, we provide a non-uniform Berry-Esseen type theorem for degenerate U-statistics. In the first case we prove that the payment for adaptation is optimal (thus unavoidable). In a second part, we study a wider framework: a semiparametric model, where g is exponentially smooth and stable, and its self-similarity index s is unknown. In order to ensure identifiability, we restrict our attention to polynomially smooth, Sobolev-type densities f. In this context, we provide a consistent estimation procedure for s. This estimator is then plugged into three different procedures: estimation of the unknown density f, of the functional \int f^2 and test of the hypothesis H_0. These procedures are adaptive with respect to both s and tau and attain the rates which are known optimal for known values of s and tau. As a by-product, when the noise is known and exponentially smooth our testing procedure is adaptive for testing Sobolev-type densities.

Submitted 22 March, 2007; v1 submitted 10 January, 2007; originally announced January 2007.

Comments: 35 pages + 8-page appendix

MSC Class: 62F12; 62G05; 62G10; 62G20

arXiv:math/0612361 [pdf, ps, other]

doi 10.1214/009053607000000118

Goodness-of-fit testing and quadratic functional estimation from indirect observations

Authors: Cristina Butucea

Abstract: We consider the convolution model where i.i.d. random variables $X_i$ having unknown density $f$ are observed with additive i.i.d. noise, independent of the $X$'s. We assume that the density $f$ belongs to either a Sobolev class or a class of supersmooth functions. The noise distribution is known and its characteristic function decays either polynomially or exponentially asymptotically. We consider the problem of goodness-of-fit testing in the convolution model. We prove upper bounds for the risk of a test statistic derived from a kernel estimator of the quadratic functional $\int f^2$ based on indirect observations. When the unknown density is sufficiently smoother than the noise density, we prove that this estimator is $n^{-1/2}$ consistent, asymptotically normal and efficient (for the variance we compute). Otherwise, we give nonparametric upper bounds for the risk of the same estimator. We give an approach unifying the proof of nonparametric minimax lower bounds for both problems. We establish them for Sobolev densities and for supersmooth densities less smooth than exponential noise. In the two setups we obtain exact testing constants associated with the asymptotic minimax rates.

Submitted 21 November, 2007; v1 submitted 13 December, 2006; originally announced December 2006.

Comments: Published at http://dx.doi.org/10.1214/009053607000000118 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS0244

MSC Class: 62F12; 62G05; 62G10; 62G20

Journal ref: Annals of Statistics 2007, Vol. 35, No. 5, 1907-1930

arXiv:math/0511105 [pdf, ps, other]

doi 10.1214/07-AIHP107

New $M$-estimators in semi-parametric regression with errors in variables

Authors: Cristina Butucea, Marie-Luce Taupin

Abstract: In the regression model with errors in variables, we observe $n$ i.i.d. copies of $(Y,Z)$ satisfying $Y=f_{θ^0}(X)+ξ$ and $Z=X+ε$ involving independent and unobserved random variables $X,ξ,ε$ plus a regression function $f_{θ^0}$, known up to a finite dimensional $θ^0$. The common densities of the $X_i$'s and of the $ξ_i$'s are unknown, whereas the distribution of $ε$ is completely known. We aim at estimating the parameter $θ^0$ by using the observations $(Y_1,Z_1),...,(Y_n,Z_n)$. We propose an estimation procedure based on the least squares criterion $\tilde{S}_{θ^0,g}(θ)=\mathbb{E}_{θ^0,g}[(Y-f_θ(X))^2w(X)]$ where $w$ is a weight function to be chosen. We propose an estimator and derive an upper bound for its risk that depends on the smoothness of the errors density $p_ε$ and on the smoothness properties of $w(x)f_θ(x)$. Furthermore, we give sufficient conditions that ensure that the parametric rate of convergence is achieved. We provide practical recipes for the choice of $w$ in the case of nonlinear regression functions which are piecewise smooth, allowing a gain in the order of the rate of convergence, up to the parametric rate in some cases. We also consider extensions of the estimation procedure, in particular, when a choice of $w_θ$ depending on $θ$ would be more appropriate.

Submitted 19 June, 2008; v1 submitted 4 November, 2005; originally announced November 2005.

Comments: Published at http://dx.doi.org/10.1214/07-AIHP107 in the Annales de l'Institut Henri Poincaré - Probabilités et Statistiques (http://www.imstat.org/aihp/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AIHP-AIHP107

Journal ref: Annales de l'Institut Henri Poincaré - Probabilités et Statistiques 2008, Vol. 44, No. 3, 393-421

arXiv:math/0504058 [pdf, ps, other]

doi 10.1214/009053606000001488

Minimax and adaptive estimation of the Wigner function in quantum homodyne tomography with noisy data

Authors: Cristina Butucea, Madalin Guţa, Luis Artiles

Abstract: We estimate the quantum state of a light beam from results of quantum homodyne measurements performed on identically prepared quantum systems. The state is represented through the Wigner function, a generalized probability density on $\mathbb{R}^2$ which may take negative values and must respect intrinsic positivity constraints imposed by quantum physics. The effect of the losses due to detection inefficiencies, which are always present in a real experiment, is the addition to the tomographic data of independent Gaussian noise. We construct a kernel estimator for the Wigner function, prove that it is minimax efficient for the pointwise risk over a class of infinitely differentiable functions, and implement it for numerical results. We construct adaptive estimators, that is, which do not depend on the smoothness parameters, and prove that in some setups they attain the minimax rates for the corresponding smoothness class.

Submitted 14 August, 2007; v1 submitted 4 April, 2005; originally announced April 2005.

Comments: Published at http://dx.doi.org/10.1214/009053606000001488 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS0241

MSC Class: 62G05; 62G20; 81V80 (Primary)

Journal ref: Annals of Statistics 2007, Vol. 35, No. 2, 465-494

arXiv:math/0409471 [pdf, ps, other]

Sharp optimality for density deconvolution with dominating bias

Authors: Cristina Butucea, Alexandre B. Tsybakov

Abstract: We consider estimation of the common probability density $f$ of i.i.d. random variables $X_i$ that are observed with an additive i.i.d. noise. We assume that the unknown density $f$ belongs to a class $\mathcal{A}$ of densities whose characteristic function is described by the exponent $\exp(-α|u|^r)$ as $|u|\to \infty$, where $α>0$, $r>0$. The noise density is supposed to be known and such that its characteristic function decays as $\exp(-β|u|^s)$, as $|u| \to \infty$, where $β>0$, $s>0$. Assuming that $r<s$, we suggest a kernel type estimator that is optimal in sharp asymptotical minimax sense on $\mathcal{A}$ simultaneously under the pointwise and the $\mathbb{L}_2$-risks. The variance of the estimators turns out to be asymptotically negligible w.r.t. its squared bias. For $r<s/2$ we construct a sharp adaptive estimator of $f$. We discuss some effects of dominating bias, such as superefficiency of minimax estimators.

Submitted 24 September, 2004; originally announced September 2004.

MSC Class: 62G05; 62G20

Showing 1–44 of 44 results for author: Butucea, C