
Showing 1–22 of 22 results for author: Zhang, A R

Searching in archive cs.
  1. arXiv:2411.15660  [pdf, other]

    math.ST cs.IT stat.ML

    Federated PCA and Estimation for Spiked Covariance Matrices: Optimal Rates and Efficient Algorithm

    Authors: Jingyang Li, T. Tony Cai, Dong Xia, Anru R. Zhang

    Abstract: Federated Learning (FL) has gained significant recent attention in machine learning for its enhanced privacy and data security, making it indispensable in fields such as healthcare, finance, and personalized services. This paper investigates federated PCA and estimation for spiked covariance matrices under distributed differential privacy constraints. We establish minimax rates of convergence, wit…

    Submitted 23 November, 2024; originally announced November 2024.

  2. arXiv:2410.14046  [pdf, other]

    stat.ML cs.LG math.NA stat.CO stat.ME

    Tensor Decomposition with Unaligned Observations

    Authors: Runshi Tang, Tamara Kolda, Anru R. Zhang

    Abstract: This paper presents a canonical polyadic (CP) tensor decomposition that addresses unaligned observations. The mode with unaligned observations is represented using functions in a reproducing kernel Hilbert space (RKHS). We introduce a versatile loss function that effectively accounts for various types of data, including binary, integer-valued, and positive-valued types. Additionally, we propose an…

    Submitted 17 October, 2024; originally announced October 2024.

  3. arXiv:2408.05677  [pdf, other]

    math.NA cs.LG

    Tensor Decomposition Meets RKHS: Efficient Algorithms for Smooth and Misaligned Data

    Authors: Brett W. Larsen, Tamara G. Kolda, Anru R. Zhang, Alex H. Williams

    Abstract: The canonical polyadic (CP) tensor decomposition decomposes a multidimensional data array into a sum of outer products of finite-dimensional vectors. Instead, we can replace some or all of the vectors with continuous functions (infinite-dimensional vectors) from a reproducing kernel Hilbert space (RKHS). We refer to tensors with some infinite-dimensional modes as quasitensors, and the approach of…

    Submitted 10 August, 2024; originally announced August 2024.

  4. arXiv:2310.15290  [pdf, other]

    cs.LG

    Reliable Generation of Privacy-preserving Synthetic Electronic Health Record Time Series via Diffusion Models

    Authors: Muhang Tian, Bernie Chen, Allan Guo, Shiyi Jiang, Anru R. Zhang

    Abstract: Electronic Health Records (EHRs) are rich sources of patient-level data, offering valuable resources for medical data analysis. However, privacy concerns often restrict access to EHRs, hindering downstream analysis. Current EHR de-identification methods are flawed and can lead to potential privacy leakage. Additionally, existing publicly available EHR databases are limited, preventing the advancem…

    Submitted 2 December, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

  5. arXiv:2307.00575  [pdf, other]

    stat.ME cs.LG math.NA math.ST

    Mode-wise Principal Subspace Pursuit and Matrix Spiked Covariance Model

    Authors: Runshi Tang, Ming Yuan, Anru R. Zhang

    Abstract: This paper introduces a novel framework called Mode-wise Principal Subspace Pursuit (MOP-UP) to extract hidden variations in both the row and column dimensions for matrix data. To enhance the understanding of the framework, we introduce a class of matrix-variate spiked covariance models that serve as inspiration for the development of the MOP-UP algorithm. The MOP-UP algorithm consists of two step…

    Submitted 4 August, 2024; v1 submitted 2 July, 2023; originally announced July 2023.

    Comments: Journal of the Royal Statistical Society, Series B, to appear

  6. arXiv:2303.05024  [pdf, other]

    math.ST cs.LG cs.SI stat.ML

    Phase transition for detecting a small community in a large network

    Authors: Jiashun Jin, Zheng Tracy Ke, Paxton Turner, Anru R. Zhang

    Abstract: How to detect a small community in a large network is an interesting problem, including clique detection as a special case, where a naive degree-based $χ^2$-test was shown to be powerful in the presence of an Erdős-Rényi background. Using Sinkhorn's theorem, we show that the signal captured by the $χ^2$-test may be a modeling artifact, and it may disappear once we replace the Erdős-Rényi model by…

    Submitted 8 March, 2023; originally announced March 2023.

  7. arXiv:2209.12715  [pdf, other]

    cs.CV cs.LG stat.AP stat.ML

    Enhancing convolutional neural network generalizability via low-rank weight approximation

    Authors: Chenyin Gao, Shu Yang, Anru R. Zhang

    Abstract: Noise is ubiquitous during image acquisition. Sufficient denoising is often an important first step for image processing. In recent decades, deep neural networks (DNNs) have been widely used for image denoising. Most DNN-based image denoising methods require a large-scale dataset or focus on supervised settings, in which single/pairs of clean images or a set of noisy images are required. This pose…

    Submitted 1 August, 2024; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: accepted by IET Image Processing

  8. arXiv:2209.11215  [pdf, ps, other]

    cs.LG math.ST

    Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions

    Authors: Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, Anru R. Zhang

    Abstract: We provide theoretical convergence guarantees for score-based generative models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which constitute the backbone of large-scale real-world generative models such as DALL$\cdot$E 2. Our main result is that, assuming accurate score estimates, such SGMs can efficiently sample from essentially any realistic data distribution. In contrast to…

    Submitted 15 April, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

    Comments: 29 pages

  9. arXiv:2206.08756  [pdf, other]

    math.ST cs.LG math.OC stat.ME stat.ML

    Tensor-on-Tensor Regression: Riemannian Optimization, Over-parameterization, Statistical-computational Gap, and Their Interplay

    Authors: Yuetian Luo, Anru R. Zhang

    Abstract: We study the tensor-on-tensor regression, where the goal is to connect tensor responses to tensor covariates with a low Tucker rank parameter tensor/matrix without the prior knowledge of its intrinsic rank. We propose the Riemannian gradient descent (RGD) and Riemannian Gauss-Newton (RGN) methods and cope with the challenge of unknown rank by studying the effect of rank over-parameterization. We p…

    Submitted 15 January, 2024; v1 submitted 17 June, 2022; originally announced June 2022.

  10. arXiv:2204.04209  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Learning Polynomial Transformations

    Authors: Sitan Chen, Jerry Li, Yuanzhi Li, Anru R. Zhang

    Abstract: We consider the problem of learning high dimensional polynomial transformations of Gaussians. Given samples of the form $p(x)$, where $x\sim N(0, \mathrm{Id}_r)$ is hidden and $p: \mathbb{R}^r \to \mathbb{R}^d$ is a function where every output coordinate is a low-degree polynomial, the goal is to learn the distribution over $p(x)$. This problem is natural in its own right, but is also an important…

    Submitted 8 April, 2022; originally announced April 2022.

    Comments: 121 pages, comments welcome

  11. arXiv:2110.12121  [pdf, ps, other]

    math.OC cs.IT cs.LG eess.SP

    On Geometric Connections of Embedded and Quotient Geometries in Riemannian Fixed-rank Matrix Optimization

    Authors: Yuetian Luo, Xudong Li, Anru R. Zhang

    Abstract: In this paper, we propose a general procedure for establishing the geometric landscape connections of a Riemannian optimization problem under the embedded and quotient geometries. By applying the general procedure to the fixed-rank positive semidefinite (PSD) and general matrix optimization, we establish an exact Riemannian gradient connection under two geometries at every point on the manifold an…

    Submitted 10 April, 2023; v1 submitted 22 October, 2021; originally announced October 2021.

  12. arXiv:2108.01772  [pdf, other]

    math.OC cs.IT cs.LG eess.SP stat.ML

    Nonconvex Factorization and Manifold Formulations are Almost Equivalent in Low-rank Matrix Optimization

    Authors: Yuetian Luo, Xudong Li, Anru R. Zhang

    Abstract: In this paper, we consider the geometric landscape connection of the widely studied manifold and factorization formulations in low-rank positive semidefinite (PSD) and general matrix optimization. We establish a sandwich relation on the spectrum of Riemannian and Euclidean Hessians at first-order stationary points (FOSPs). As a result of that, we obtain an equivalence on the set of FOSPs, second-o…

    Submitted 12 August, 2024; v1 submitted 3 August, 2021; originally announced August 2021.

  13. arXiv:2104.12031  [pdf, other]

    stat.ML cs.LG math.NA math.OC stat.ME

    Low-rank Tensor Estimation via Riemannian Gauss-Newton: Statistical Optimality and Second-Order Convergence

    Authors: Yuetian Luo, Anru R. Zhang

    Abstract: In this paper, we consider the estimation of a low Tucker rank tensor from a number of noisy linear measurements. The general problem covers many specific examples arising from applications, including tensor regression, tensor completion, and tensor PCA/SVD. We consider an efficient Riemannian Gauss-Newton (RGN) method for low Tucker rank tensor estimation. Different from the generic (super)linear…

    Submitted 8 July, 2023; v1 submitted 24 April, 2021; originally announced April 2021.

  14. arXiv:2012.14844  [pdf, other]

    math.ST cs.LG stat.ME stat.ML

    Inference for Low-rank Tensors -- No Need to Debias

    Authors: Dong Xia, Anru R. Zhang, Yuchen Zhou

    Abstract: In this paper, we consider the statistical inference for several low-rank tensor models. Specifically, in the Tucker low-rank tensor PCA or regression model, provided with any estimates achieving some attainable error rate, we develop the data-driven confidence regions for the singular subspace of the parameter tensor based on the asymptotic distribution of an updated estimate by two-iteration alt…

    Submitted 29 October, 2021; v1 submitted 29 December, 2020; originally announced December 2020.

    Comments: to appear at the Annals of Statistics

  15. arXiv:2011.08360  [pdf, other]

    math.OC cs.LG math.NA stat.CO stat.ML

    Recursive Importance Sketching for Rank Constrained Least Squares: Algorithms and High-order Convergence

    Authors: Yuetian Luo, Wen Huang, Xudong Li, Anru R. Zhang

    Abstract: In this paper, we propose Recursive Importance Sketching algorithm for Rank constrained least squares Optimization (RISRO). The key step of RISRO is recursive importance sketching, a new sketching framework based on deterministically designed recursive projections, which significantly differs from the randomi…

    Submitted 4 December, 2022; v1 submitted 16 November, 2020; originally announced November 2020.

    Comments: Accepted by Operations Research

  16. arXiv:2010.02482  [pdf, other]

    math.ST cs.LG math.NA stat.CO stat.ME

    Optimal High-order Tensor SVD via Tensor-Train Orthogonal Iteration

    Authors: Yuchen Zhou, Anru R. Zhang, Lili Zheng, Yazhen Wang

    Abstract: This paper studies a general framework for high-order tensor SVD. We propose a new computationally efficient algorithm, tensor-train orthogonal iteration (TTOI), that aims to estimate the low tensor-train rank structure from the noisy high-order tensor observation. The proposed TTOI consists of initialization via TT-SVD (Oseledets, 2011) and new iterative backward/forward updates. We develop the g…

    Submitted 24 January, 2022; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: to appear in IEEE Transactions on Information Theory

  17. arXiv:2009.05870  [pdf, ps, other]

    stat.ML cs.CC cs.LG math.CO

    Open Problem: Average-Case Hardness of Hypergraphic Planted Clique Detection

    Authors: Yuetian Luo, Anru R. Zhang

    Abstract: We note the significance of hypergraphic planted clique (HPC) detection in the investigation of computational hardness for a range of tensor problems. We ask if more evidence for the computational hardness of HPC detection can be developed. In particular, we conjecture if it is possible to establish the equivalence of the computational hardness between HPC and PC detection.

    Submitted 12 September, 2020; originally announced September 2020.

    Comments: Published at Proceedings of Conference on Learning Theory, 2020

  18. arXiv:2008.02437  [pdf, other]

    math.ST cs.LG math.NA stat.ML

    A Sharp Blockwise Tensor Perturbation Bound for Orthogonal Iteration

    Authors: Yuetian Luo, Garvesh Raskutti, Ming Yuan, Anru R. Zhang

    Abstract: In this paper, we develop novel perturbation bounds for the high-order orthogonal iteration (HOOI) [DLDMV00b]. Under mild regularity conditions, we establish blockwise tensor perturbation bounds for HOOI with guarantees for both tensor reconstruction in Hilbert-Schmidt norm $\|\widehat{\boldsymbol{\mathcal{T}}} - \boldsymbol{\mathcal{T}}\|_{\mathrm{HS}}$ and mode-$k$ singular subspace estimation in Schatten-$q$ norm…

    Submitted 5 June, 2021; v1 submitted 5 August, 2020; originally announced August 2020.

  19. arXiv:2005.10743  [pdf, other]

    math.ST cs.CC cs.LG stat.ME stat.ML

    Tensor Clustering with Planted Structures: Statistical Optimality and Computational Limits

    Authors: Yuetian Luo, Anru R. Zhang

    Abstract: This paper studies the statistical and computational limits of high-order clustering with planted structures. We focus on two clustering models, constant high-order clustering (CHC) and rank-one higher-order clustering (ROHC), and study the methods and theory for testing whether a cluster exists (detection) and identifying the support of cluster (recovery). Specifically, we identify the sharp bo…

    Submitted 2 October, 2023; v1 submitted 21 May, 2020; originally announced May 2020.

    Comments: Done a few clarifications and added low-degree polynomial based evidence for HPDS recovery conjecture 2

  20. arXiv:2002.11255  [pdf, other]

    math.ST cs.LG stat.ME stat.ML

    An Optimal Statistical and Computational Framework for Generalized Tensor Estimation

    Authors: Rungang Han, Rebecca Willett, Anru R. Zhang

    Abstract: This paper describes a flexible framework for generalized low-rank tensor estimation problems that includes many important instances arising from applications in computational imaging, genomics, and network analysis. The proposed estimator consists of finding a low-rank tensor fit to the data under generalized parametric models. To overcome the difficulty of non-convexity in these problems, we int…

    Submitted 4 February, 2021; v1 submitted 25 February, 2020; originally announced February 2020.

  21. arXiv:1909.09851  [pdf, other]

    math.ST cs.LG stat.ML

    Sparse Group Lasso: Optimal Sample Complexity, Convergence Rate, and Statistical Inference

    Authors: T. Tony Cai, Anru R. Zhang, Yuchen Zhou

    Abstract: We study sparse group Lasso for high-dimensional double sparse linear regression, where the parameter of interest is simultaneously element-wise and group-wise sparse. This problem is an important instance of the simultaneously structured model -- an actively studied topic in statistics and machine learning. In the noiseless case, matching upper and lower bounds on sample complexity are establishe…

    Submitted 6 May, 2022; v1 submitted 21 September, 2019; originally announced September 2019.

    Comments: IEEE Transactions on Information Theory, to appear

  22. arXiv:1810.09006  [pdf, ps, other]

    math.PR cs.LG math.ST

    On the Non-asymptotic and Sharp Lower Tail Bounds of Random Variables

    Authors: Anru R. Zhang, Yuchen Zhou

    Abstract: The non-asymptotic tail bounds of random variables play crucial roles in probability, statistics, and machine learning. Despite much success in developing upper bounds on tail probability in literature, the lower bounds on tail probabilities are relatively fewer. In this paper, we introduce systematic and user-friendly schemes for developing non-asymptotic lower bounds of tail probabilities. In ad…

    Submitted 4 September, 2020; v1 submitted 21 October, 2018; originally announced October 2018.