-
The Poisson tensor completion non-parametric differential entropy estimator
Authors:
Daniel M. Dunlavy,
Richard B. Lehoucq,
Carolyn D. Mayer,
Arvind Prasadan
Abstract:
We introduce the Poisson tensor completion (PTC) estimator, a non-parametric differential entropy estimator. The PTC estimator leverages inter-sample relationships to compute a low-rank Poisson tensor decomposition of the frequency histogram. Our crucial observation is that the histogram bins are an instance of a space partitioning of counts and thus can be identified with a spatial Poisson process. The Poisson tensor decomposition leads to a completion of the intensity measure over all bins -- including those containing few to no samples -- and leads to our proposed PTC differential entropy estimator. A Poisson tensor decomposition models the underlying distribution of the count data and guarantees non-negative estimated values and so can be safely used directly in entropy estimation. We believe our estimator is the first tensor-based estimator that exploits the underlying spatial Poisson process related to the histogram explicitly when estimating the probability density with low-rank tensor decompositions or tensor completion. Furthermore, we demonstrate that our PTC estimator is a substantial improvement over standard histogram-based estimators for sub-Gaussian probability distributions because of the concentration of norm phenomenon.
Submitted 8 May, 2025; v1 submitted 8 May, 2025;
originally announced May 2025.
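For context, the standard histogram-based baseline that the abstract above compares against can be sketched as follows. This is not the PTC estimator: it is the plain plug-in estimate $-\sum_i p_i \log(p_i / w)$ over equal-width bins of width $w$, written under the assumption of one-dimensional data. The function name `histogram_entropy`, the bin width, and the sample size are illustrative choices, not taken from the paper.

```python
import math
import random
from collections import Counter


def histogram_entropy(samples, bin_width):
    """Plug-in differential entropy estimate (in nats) from an equal-width histogram."""
    # Assign each sample to the bin [k * bin_width, (k + 1) * bin_width).
    counts = Counter(math.floor(x / bin_width) for x in samples)
    n = len(samples)
    # Density estimate in an occupied bin is count / (n * bin_width).
    # Empty bins contribute nothing to the sum; this is the gap that a
    # completion-based estimator is meant to address.
    return -sum((c / n) * math.log(c / (n * bin_width)) for c in counts.values())


# Illustrative check on standard normal samples (hypothetical settings).
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]
estimate = histogram_entropy(samples, bin_width=0.2)
exact = 0.5 * math.log(2 * math.pi * math.e)  # entropy of N(0, 1) in nats
```

In one dimension with many samples the baseline is close to the exact value; the abstract's claim concerns settings where many bins hold few or no samples.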
-
Information theoretic limits of robust sub-Gaussian mean estimation under star-shaped constraints
Authors:
Akshay Prasadan,
Matey Neykov
Abstract:
We obtain the minimax rate for a mean location model with a bounded star-shaped set $K \subseteq \mathbb{R}^n$ constraint on the mean, in an adversarially corrupted data setting with Gaussian noise. We assume an unknown fraction $ε\le 1/2-κ$ for some fixed $κ\in(0,1/2]$ of $N$ observations are arbitrarily corrupted. We obtain a minimax risk up to proportionality constants under the squared $\ell_2$ loss of $\max(η^{*2},σ^2ε^2)\wedge d^2$ with \[η^* = \sup \bigg\{η\ge 0 : \frac{Nη^2}{σ^2} \leq \log \mathcal{M}_K^{\operatorname{loc}}(η,c)\bigg\},\] where $\log \mathcal{M}_K^{\operatorname{loc}}(η,c)$ denotes the local entropy of the set $K$, $d$ is the diameter of $K$, $σ^2$ is the variance, and $c$ is some sufficiently large absolute constant. A variant of our algorithm achieves the same rate for settings with known or symmetric sub-Gaussian noise, with a smaller breakdown point, still of constant order. We further study the case of unknown sub-Gaussian noise and show that the rate is slightly slower: $\max(η^{*2},σ^2ε^2\log(1/ε))\wedge d^2$. We generalize our results to the case when $K$ is star-shaped but unbounded.
Submitted 12 June, 2025; v1 submitted 4 December, 2024;
originally announced December 2024.
-
Some facts about the optimality of the LSE in the Gaussian sequence model with convex constraint
Authors:
Akshay Prasadan,
Matey Neykov
Abstract:
We consider a convex constrained Gaussian sequence model and characterize necessary and sufficient conditions for the least squares estimator (LSE) to be minimax optimal. For a closed convex set $K\subset \mathbb{R}^n$ we observe $Y=μ+ξ$ for $ξ\sim \mathcal{N}(0,σ^2\mathbb{I}_n)$ and $μ\in K$ and aim to estimate $μ$. We characterize the worst case risk of the LSE in multiple ways by analyzing the behavior of the local Gaussian width on $K$. We demonstrate that optimality is equivalent to a Lipschitz property of the local Gaussian width mapping. We also provide theoretical algorithms that search for the worst case risk. We then provide examples showing optimality or suboptimality of the LSE on various sets, including $\ell_p$ balls for $p\in[1,2]$, pyramids, solids of revolution, and multivariate isotonic regression, among others.
Submitted 3 February, 2025; v1 submitted 9 June, 2024;
originally announced June 2024.
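As a small illustration of the estimator discussed in the abstract above: for a closed convex set $K$, the LSE is the Euclidean projection of the observation $Y$ onto $K$. The sketch below takes $K$ to be an $\ell_2$ ball (the $p=2$ case among the $\ell_p$ balls mentioned), where the projection has a closed form. The function name `lse_l2_ball` and the example inputs are hypothetical, and the sketch says nothing about minimax optimality.

```python
import math


def lse_l2_ball(y, radius=1.0):
    """Least squares estimate of the mean when K is the l2 ball of the given radius.

    For a closed convex K the LSE is the Euclidean projection of y onto K;
    for an l2 ball this is a radial rescaling of y.
    """
    norm = math.sqrt(sum(v * v for v in y))
    if norm <= radius:
        # y already lies in K, so it is its own projection.
        return list(y)
    scale = radius / norm
    return [scale * v for v in y]


# Observation outside the unit ball is pulled back to the boundary.
projected = lse_l2_ball([3.0, 4.0], radius=1.0)
# Observation inside the unit ball is left unchanged.
inside = lse_l2_ball([0.1, -0.2], radius=1.0)
```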
-
The Average Spectrum Norm and Near-Optimal Tensor Completion
Authors:
Oscar López,
Richard Lehoucq,
Carlos Llosa-Vite,
Arvind Prasadan,
Daniel M. Dunlavy
Abstract:
We introduce a new tensor norm, the average spectrum norm, to study sample complexity of tensor completion problems based on the canonical polyadic decomposition (CPD). Properties of the average spectrum norm and its dual norm are investigated, demonstrating their utility for low-rank tensor recovery analysis. Our novel approach significantly reduces the provable sample rate for CPD-based noisy tensor completion, providing the best bounds to date on the number of observed noisy entries required to produce an arbitrarily accurate estimate of an underlying mean value tensor. Under Poisson and Bernoulli multivariate distributions, we show that an $N$-way CPD rank-$R$ parametric tensor $\boldsymbol{\mathscr{M}}\in\mathbb{R}^{I\times \cdots\times I}$ generating noisy observations can be approximated by large likelihood estimators from $\mathcal{O}(IR^2\log^{N+2}(I))$ revealed entries. Furthermore, under nonnegative and orthogonal versions of the CPD we improve the result to depend linearly on the rank, achieving the near-optimal rate $\mathcal{O}(IR\log^{N+2}(I))$.
Submitted 17 June, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Characterizing the minimax rate of nonparametric regression under bounded star-shaped constraints
Authors:
Akshay Prasadan,
Matey Neykov
Abstract:
We quantify the minimax rate for a nonparametric regression model over a star-shaped function class $\mathcal{F}$ with bounded diameter. We obtain a minimax rate of ${\varepsilon^{\ast}}^2\wedge\mathrm{diam}(\mathcal{F})^2$ where \[\varepsilon^{\ast} =\sup\{\varepsilon\ge 0:n\varepsilon^2 \le \log M_{\mathcal{F}}^{\operatorname{loc}}(\varepsilon,c)\},\] where $\log M_{\mathcal{F}}^{\operatorname{loc}}(\cdot, c)$ is the local metric entropy of $\mathcal{F}$, $c$ is some absolute constant scaling down the entropy radius, and our loss function is the squared population $L_2$ distance over our input space $\mathcal{X}$. In contrast to classical works on the topic [cf. Yang and Barron, 1999], our results do not require functions in $\mathcal{F}$ to be uniformly bounded in sup-norm. In fact, we propose a condition that simultaneously generalizes boundedness in sup-norm and the so-called $L$-sub-Gaussian assumption that appears in the prior literature. In addition, we prove that our estimator is adaptive to the true point in the convex-constrained case, and to the best of our knowledge this is the first such estimator in this general setting. This work builds on the Gaussian sequence framework of Neykov [2022] using a similar algorithmic scheme to achieve the minimax rate. Our algorithmic rate also applies with sub-Gaussian noise. We illustrate the utility of this theory with examples including multivariate monotone functions, linear functionals over ellipsoids, and Lipschitz classes.
Submitted 27 June, 2025; v1 submitted 15 January, 2024;
originally announced January 2024.
-
Sparse Equisigned PCA: Algorithms and Performance Bounds in the Noisy Rank-1 Setting
Authors:
Arvind Prasadan,
Raj Rao Nadakuditi,
Debashis Paul
Abstract:
Singular value decomposition (SVD) based principal component analysis (PCA) breaks down in the high-dimensional and limited sample size regime below a certain critical eigen-SNR that depends on the dimensionality of the system and the number of samples. Below this critical eigen-SNR, the estimates returned by the SVD are asymptotically uncorrelated with the latent principal components. We consider a setting where the left singular vector of the underlying rank one signal matrix is assumed to be sparse and the right singular vector is assumed to be equisigned, that is, having either only nonnegative or only nonpositive entries. We consider six different algorithms for estimating the sparse principal component based on different statistical criteria and prove that by exploiting sparsity, we recover consistent estimates in the low eigen-SNR regime where the SVD fails. Our analysis reveals conditions under which a coordinate selection scheme based on a \textit{sum-type decision statistic} outperforms schemes that utilize the $\ell_1$ and $\ell_2$ norm-based statistics. We derive lower bounds on the size of detectable coordinates of the principal left singular vector and utilize these lower bounds to derive lower bounds on the worst-case risk. Finally, we verify our findings with numerical simulations and illustrate the performance with a video data example, where the interest is in identifying objects.
Submitted 16 December, 2019; v1 submitted 22 May, 2019;
originally announced May 2019.
-
Time Series Source Separation using Dynamic Mode Decomposition
Authors:
Arvind Prasadan,
Raj Rao Nadakuditi
Abstract:
The Dynamic Mode Decomposition (DMD) extracted dynamic modes are the non-orthogonal eigenvectors of the matrix that best approximates the one-step temporal evolution of the multivariate samples. In the context of dynamical system analysis, the extracted dynamic modes are a generalization of global stability modes. We apply DMD to a data matrix whose rows are linearly independent, additive mixtures of latent time series. We show that when the latent time series are uncorrelated at a lag of one time-step then, in the large sample limit, the recovered dynamic modes will approximate, up to a column-wise normalization, the columns of the mixing matrix. Thus, DMD is a time series blind source separation algorithm in disguise, but is different from closely related second order algorithms such as the Second-Order Blind Identification (SOBI) method and the Algorithm for Multiple Unknown Signals Extraction (AMUSE). All can unmix mixed stationary, ergodic Gaussian time series in a way that kurtosis-based Independent Components Analysis (ICA) fundamentally cannot. We use our insights on single lag DMD to develop a higher-lag extension, analyze the finite sample performance with and without randomly missing data, and identify settings where the higher lag variant can outperform the conventional single lag variant. We validate our results with numerical simulations, and highlight how DMD can be used in change point detection.
Submitted 5 March, 2020; v1 submitted 4 March, 2019;
originally announced March 2019.
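The single-lag DMD described in the abstract above can be sketched with NumPy as follows. The sketch fits the least-squares one-step propagator $A = X_1 X_0^{+}$ and compares its eigenvectors with the columns of the mixing matrix. The sources here are deterministic cosines at two different frequencies, chosen only because their lag-one cross-correlation is close to zero; that choice, the mixing matrix `M`, and the helper name `dmd_modes` are assumptions for illustration and are not from the paper.

```python
import numpy as np


def dmd_modes(X):
    """Eigen-decompose the least-squares one-step propagator of the columns of X."""
    X0, X1 = X[:, :-1], X[:, 1:]      # snapshots and their one-step shifts
    A = X1 @ np.linalg.pinv(X0)       # minimizes ||X1 - A X0||_F
    eigvals, eigvecs = np.linalg.eig(A)
    return eigvals, eigvecs


# Two latent series that are nearly uncorrelated with each other at lag one.
t = np.arange(2000)
S = np.vstack([np.cos(0.3 * t), np.cos(1.1 * t)])

# Hypothetical mixing matrix; each row of X is an additive mixture of the sources.
M = np.array([[1.0, 0.5],
              [0.2, 1.0]])
X = M @ S

eigvals, modes = dmd_modes(X)

# Compare directions only: modes are defined up to column-wise scaling and sign.
M_unit = M / np.linalg.norm(M, axis=0)
modes_real = np.real(modes)
modes_unit = modes_real / np.linalg.norm(modes_real, axis=0)
alignment = np.abs(M_unit.T @ modes_unit).max(axis=1)
```

Each entry of `alignment` is the absolute cosine between a mixing-matrix column and its best-matching dynamic mode; values near one indicate that the modes recover the mixing directions up to scaling.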