-
Revisit CP Tensor Decomposition: Statistical Optimality and Fast Convergence
Authors:
Runshi Tang,
Julien Chhor,
Olga Klopp,
Anru R. Zhang
Abstract:
Canonical Polyadic (CP) tensor decomposition is a fundamental technique for analyzing high-dimensional tensor data. While the Alternating Least Squares (ALS) algorithm is widely used for computing CP decomposition due to its simplicity and empirical success, its theoretical foundation, particularly regarding statistical optimality and convergence behavior, remains underdeveloped, especially in noisy, non-orthogonal, and higher-rank settings.
In this work, we revisit CP tensor decomposition from a statistical perspective and provide a comprehensive theoretical analysis of ALS under a signal-plus-noise model. We establish non-asymptotic, minimax-optimal error bounds for tensors of general order, dimensions, and rank, assuming suitable initialization. To enable such initialization, we propose Tucker-based Approximation with Simultaneous Diagonalization (TASD), a robust method that improves stability and accuracy in noisy regimes. Combined with ALS, TASD yields a statistically consistent estimator. We further analyze the convergence dynamics of ALS, identifying a two-phase pattern: initial quadratic convergence followed by linear refinement. Finally, we show that in the rank-one setting, ALS with an appropriately chosen initialization attains optimal error within just one or two iterations.
Submitted 28 May, 2025;
originally announced May 2025.
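For readers unfamiliar with the ALS iteration discussed in the CP decomposition abstract above, the following is a minimal textbook sketch for order-3 tensors. It is not the authors' implementation: the function names (`cp_als`, `khatri_rao`, `unfold`, `cp_reconstruct`) are hypothetical, and the random initialization used here is an assumption standing in for TASD, whose details the abstract does not give.

```python
import numpy as np


def khatri_rao(U, V):
    """Column-wise Kronecker product; output shape is (U.shape[0] * V.shape[0], R)."""
    R = U.shape[1]
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, R)


def unfold(T, mode):
    """Mode-`mode` unfolding, flattening the remaining axes in C order."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)


def cp_reconstruct(factors):
    """Rebuild an order-3 tensor from its three factor matrices."""
    A, B, C = factors
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def cp_als(T, rank, n_iter=50, seed=0):
    """Plain ALS for an order-3 CP model (random init; an assumption, not TASD)."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((n, rank)) for n in T.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # The two factors held fixed, in the same axis order as the unfolding.
            others = [factors[m] for m in range(3) if m != mode]
            KR = khatri_rao(others[0], others[1])
            # Gram matrix of the Khatri-Rao product = Hadamard product of Grams.
            gram = (others[0].T @ others[0]) * (others[1].T @ others[1])
            # Least-squares update of the current factor.
            factors[mode] = unfold(T, mode) @ KR @ np.linalg.pinv(gram)
    return factors
```

Each inner step solves a linear least-squares problem for one factor while the other two are fixed, which is what makes the scheme simple. With a poor starting point the iteration can stall for ranks above one, which is why the quality of the initialization matters.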
-
Optimal community detection in dense bipartite graphs
Authors:
Julien Chhor,
Parker Knight
Abstract:
We consider the problem of detecting a community of densely connected vertices in a high-dimensional bipartite graph of size $n_1 \times n_2$. Under the null hypothesis, the observed graph is drawn from a bipartite Erdős-Rényi distribution with connection probability $p_0$. Under the alternative hypothesis, there exists an unknown bipartite subgraph of size $k_1 \times k_2$ in which edges appear with probability $p_1 = p_0 + δ$ for some $δ > 0$, while all edges outside the subgraph appear with probability $p_0$. Specifically, we provide non-asymptotic upper and lower bounds on the smallest signal strength $δ^*$ that is both necessary and sufficient to ensure the existence of a test with small enough type one and type two errors. We also derive novel minimax-optimal tests achieving these fundamental limits when the underlying graph is sufficiently dense. Our proposed tests involve a combination of hard-thresholded nonlinear statistics of the adjacency matrix, the analysis of which may be of independent interest. In contrast with previous work, our non-asymptotic upper and lower bounds match for any configuration of $n_1, n_2, k_1, k_2$.
Submitted 23 May, 2025;
originally announced May 2025.
-
On the Private Estimation of Smooth Transport Maps
Authors:
Clément Lalanne,
Franck Iutzeler,
Jean-Michel Loubes,
Julien Chhor
Abstract:
Estimating optimal transport maps between two distributions from their respective samples is an important element of many machine learning methods. To do so, rather than extending discrete transport maps, it has been shown that estimating the Brenier potential of the transport problem and obtaining a transport map through its gradient is near minimax optimal for smooth problems. In this paper, we investigate the private estimation of such potentials and transport maps with respect to the distribution samples. We propose a differentially private transport map estimator achieving an $L^2$ error of at most $n^{-1} \vee n^{-\frac{2α}{2α - 2 + d}} \vee (nε)^{-\frac{2α}{2α + d}}$ up to poly-logarithmic terms, where $n$ is the sample size, $ε$ is the desired level of privacy, $α$ is the smoothness of the true transport map, and $d$ is the dimension of the feature space. We also provide a lower bound for the problem.
Submitted 3 February, 2025;
originally announced February 2025.
-
Locally sharp goodness-of-fit testing in sup norm for high-dimensional counts
Authors:
Subhodh Kotekal,
Julien Chhor,
Chao Gao
Abstract:
We consider testing the goodness-of-fit of a distribution against alternatives separated in sup norm. We study the twin settings of Poisson-generated count data with a large number of categories and high-dimensional multinomials. In previous studies of different separation metrics, it has been found that the local minimax separation rate exhibits substantial heterogeneity and is a complicated function of the null distribution; the rate-optimal test requires careful tailoring to the null. In the setting of sup norm, this remains the case and we establish that the local minimax separation rate is determined by the finer decay behavior of the category rates. The upper bound is obtained by a test involving the sample maximum, and the lower bound argument involves reducing the original heteroskedastic null to an auxiliary homoskedastic null determined by the decay of the rates. Further, in a particular asymptotic setup, the sharp constants are identified.
Submitted 13 September, 2024;
originally announced September 2024.
-
Generalized multi-view model: Adaptive density estimation under low-rank constraints
Authors:
Julien Chhor,
Olga Klopp,
Alexandre Tsybakov
Abstract:
We study the problem of bivariate discrete or continuous probability density estimation under low-rank constraints. For discrete distributions, we assume that the two-dimensional array to estimate is a low-rank probability matrix. In the continuous case, we assume that the density with respect to the Lebesgue measure satisfies a generalized multi-view model, meaning that it is $β$-Hölder and can be decomposed as a sum of $K$ components, each of which is a product of one-dimensional functions. In both settings, we propose estimators that achieve, up to logarithmic factors, the minimax optimal convergence rates under such low-rank constraints. In the discrete case, the proposed estimator is adaptive to the rank $K$. In the continuous case, our estimator converges with the $L_1$ rate $\min((K/n)^{β/(2β+1)}, n^{-β/(2β+2)})$ up to logarithmic factors, and it is adaptive to the unknown support as well as to the smoothness $β$ and to the unknown number of separable components $K$. We present efficient algorithms for computing our estimators.
Submitted 22 October, 2024; v1 submitted 26 April, 2024;
originally announced April 2024.
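The $L_1$ rate $\min((K/n)^{β/(2β+1)}, n^{-β/(2β+2)})$ quoted in the multi-view abstract above switches between its two terms at $K = n^{1/(2β+2)}$, which follows from equating them. The sketch below evaluates the rate and that crossover numerically. It ignores the logarithmic factors and constants the abstract leaves unspecified, and the function names are hypothetical.

```python
import math


def multiview_l1_rate(n, K, beta):
    """min((K/n)^(beta/(2beta+1)), n^(-beta/(2beta+2))), without log factors."""
    low_rank_term = (K / n) ** (beta / (2 * beta + 1))
    # Second term: does not depend on K.
    unstructured_term = n ** (-beta / (2 * beta + 2))
    return min(low_rank_term, unstructured_term)


def crossover_rank(n, beta):
    """Value of K at which the two terms of the rate coincide."""
    return n ** (1 / (2 * beta + 2))


if __name__ == "__main__":
    n, beta = 10_000, 1.0
    for K in (1, 2, 5, 10, 50):
        print(K, multiview_l1_rate(n, K, beta), K <= crossover_rank(n, beta))
```

Below the crossover, the low-rank term is the smaller one, so the structural assumption improves on the $K$-free term; above it, the rate no longer depends on $K$.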
-
Sparse Signal Detection in Heteroscedastic Gaussian Sequence Models: Sharp Minimax Rates
Authors:
Julien Chhor,
Rajarshi Mukherjee,
Subhabrata Sen
Abstract:
Given a heterogeneous Gaussian sequence model with unknown mean $θ \in \mathbb R^d$ and known covariance matrix $Σ = \operatorname{diag}(σ_1^2,\dots, σ_d^2)$, we study the signal detection problem against sparse alternatives, for known sparsity $s$. Namely, we characterize how large $ε^* > 0$ should be in order to distinguish with high probability the null hypothesis $θ = 0$ from the alternative composed of $s$-sparse vectors in $\mathbb R^d$, separated from $0$ in $L^t$ norm ($t \in [1,\infty]$) by at least $ε^*$. We find minimax upper and lower bounds on the minimax separation radius $ε^*$ and prove that they always match. We also derive the corresponding minimax tests achieving these bounds. Our results reveal new phase transitions regarding the behavior of $ε^*$ with respect to the level of sparsity, to the $L^t$ metric, and to the heteroscedasticity profile of $Σ$. In the case of the Euclidean (i.e., $L^2$) separation, we bridge the remaining gaps in the literature.
Submitted 1 August, 2023; v1 submitted 15 November, 2022;
originally announced November 2022.
-
Benign overfitting and adaptive nonparametric regression
Authors:
Julien Chhor,
Suzanne Sigalla,
Alexandre B. Tsybakov
Abstract:
In the nonparametric regression setting, we construct an estimator which is a continuous function interpolating the data points with high probability, while attaining minimax optimal rates under mean squared risk on the scale of Hölder classes adaptively to the unknown smoothness.
Submitted 27 June, 2022;
originally announced June 2022.
-
Robust Estimation of Discrete Distributions under Local Differential Privacy
Authors:
Julien Chhor,
Flore Sentenac
Abstract:
Although robust learning and local differential privacy are both widely studied fields of research, combining the two settings is just starting to be explored. We consider the problem of estimating a discrete distribution in total variation from $n$ contaminated data batches under a local differential privacy constraint. A fraction $1-ε$ of the batches contain $k$ i.i.d. samples drawn from a discrete distribution $p$ over $d$ elements. To protect the users' privacy, each of the samples is privatized using an $α$-locally differentially private mechanism. The remaining $εn$ batches are adversarially contaminated. The minimax rate of estimation under contamination alone, with no privacy, is known to be $ε/\sqrt{k}+\sqrt{d/kn}$, up to a $\sqrt{\log(1/ε)}$ factor. Under the privacy constraint alone, the minimax rate of estimation is $\sqrt{d^2/α^2 kn}$. We show that combining the two constraints leads to a minimax estimation rate of $ε\sqrt{d/α^2 k}+\sqrt{d^2/α^2 kn}$ up to a $\sqrt{\log(1/ε)}$ factor, larger than the sum of the two separate rates. We provide a polynomial-time algorithm achieving this bound, as well as a matching information-theoretic lower bound.
Submitted 20 April, 2022; v1 submitted 14 February, 2022;
originally announced February 2022.
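To make the comparison of rates in the robust estimation abstract above concrete, the sketch below evaluates the three expressions at sample parameter values. It reads $\sqrt{d/kn}$ as $\sqrt{d/(kn)}$ and $\sqrt{d^2/α^2 kn}$ as $\sqrt{d^2/(α^2 kn)}$, drops the $\sqrt{\log(1/ε)}$ factors and all constants, and uses hypothetical function names and parameter values chosen only for illustration.

```python
import math


def contamination_only_rate(eps, d, k, n):
    """eps / sqrt(k) + sqrt(d / (k n)): contamination without privacy."""
    return eps / math.sqrt(k) + math.sqrt(d / (k * n))


def privacy_only_rate(alpha, d, k, n):
    """sqrt(d^2 / (alpha^2 k n)): local privacy without contamination."""
    return math.sqrt(d**2 / (alpha**2 * k * n))


def combined_rate(eps, alpha, d, k, n):
    """eps * sqrt(d / (alpha^2 k)) + sqrt(d^2 / (alpha^2 k n)): both constraints."""
    return eps * math.sqrt(d / (alpha**2 * k)) + privacy_only_rate(alpha, d, k, n)


if __name__ == "__main__":
    # Illustrative values only; not taken from the paper.
    eps, alpha, d, k, n = 0.1, 1.0, 100, 10, 10_000
    separate = contamination_only_rate(eps, d, k, n) + privacy_only_rate(alpha, d, k, n)
    print("sum of separate rates:", separate)
    print("combined rate:        ", combined_rate(eps, alpha, d, k, n))
```

The contamination term is where the two constraints interact: it grows from $ε/\sqrt{k}$ to $ε\sqrt{d}/(α\sqrt{k})$ once privacy is imposed, while the privacy term is unchanged.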
-
Goodness-of-Fit Testing for Hölder-Continuous Densities: Sharp Local Minimax Rates
Authors:
Julien Chhor,
Alexandra Carpentier
Abstract:
We consider the goodness-of-fit testing problem for Hölder smooth densities over $\mathbb{R}^d$: given $n$ iid observations with unknown density $p$ and given a known density $p_0$, we investigate how large $ρ$ should be to distinguish, with high probability, the case $p=p_0$ from the composite alternative of all Hölder-smooth densities $p$ such that $\|p-p_0\|_t \geq ρ$ where $t \in [1,2]$. The densities are assumed to be defined over $\mathbb{R}^d$ and to have Hölder smoothness parameter $α>0$. In the present work, we solve the case $α\leq 1$ and handle the case $α>1$ using an additional technical restriction on the densities. We identify matching upper and lower bounds on the local minimax rates of testing, given explicitly in terms of $p_0$. We propose novel test statistics which we believe could be of independent interest. We also establish the first definition of an explicit cutoff $u_B$ allowing us to split $\mathbb{R}^d$ into a bulk part (defined as the subset of $\mathbb{R}^d$ where $p_0$ takes only values greater than or equal to $u_B$) and a tail part (defined as the complement of the bulk), each part involving fundamentally different contributions to the local minimax rates of testing.
Submitted 17 March, 2023; v1 submitted 9 September, 2021;
originally announced September 2021.
-
Sharp Local Minimax Rates for Goodness-of-Fit Testing in multivariate Binomial and Poisson families and in multinomials
Authors:
J. Chhor,
A. Carpentier
Abstract:
We consider the identity testing problem (or goodness-of-fit testing problem) in multivariate binomial families, multivariate Poisson families and multinomial distributions. Given a known distribution $p$ and $n$ iid samples drawn from an unknown distribution $q$, we investigate how large $ρ>0$ should be to distinguish, with high probability, the case $p=q$ from the case $d(p,q) \geq ρ$, where $d$ denotes a specific distance over probability distributions. We answer this question in the case of a family of different distances: $d(p,q) = \|p-q\|_t$ for $t \in [1,2]$ where $\|\cdot\|_t$ is the entrywise $\ell_t$ norm. Besides being locally minimax-optimal (i.e., characterizing the detection threshold as a function of the known matrix $p$), our tests have simple expressions and are easily implementable.
Submitted 23 April, 2022; v1 submitted 26 December, 2020;
originally announced December 2020.