Showing 1–15 of 15 results for author: Zhang, R Y

  1. arXiv:2505.03717  [pdf, other]

    math.OC cs.LG stat.ML

    Nonnegative Low-rank Matrix Recovery Can Have Spurious Local Minima

    Authors: Richard Y. Zhang

    Abstract: The classical low-rank matrix recovery problem is well-known to exhibit \emph{benign nonconvexity} under the restricted isometry property (RIP): local optimization is guaranteed to converge to the global optimum, where the ground truth is recovered. We investigate whether benign nonconvexity continues to hold when the factor matrices are constrained to be elementwise nonnegative -- a common practi…

    Submitted 6 May, 2025; originally announced May 2025.

  2. arXiv:2504.09708  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Preconditioned Gradient Descent for Over-Parameterized Nonconvex Matrix Factorization

    Authors: Gavin Zhang, Salar Fattahi, Richard Y. Zhang

    Abstract: In practical instances of nonconvex matrix factorization, the rank of the true solution $r^{\star}$ is often unknown, so the rank $r$ of the model can be overspecified as $r>r^{\star}$. This over-parameterized regime of matrix factorization significantly slows down the convergence of local search algorithms, from a linear rate with $r=r^{\star}$ to a sublinear rate when $r>r^{\star}$. We propose a…

    Submitted 13 April, 2025; originally announced April 2025.

    Comments: NeurIPS 2021. See also https://proceedings.neurips.cc/paper/2021/hash/2f2cd5c753d3cee48e47dbb5bbaed331-Abstract.html

  3. arXiv:2305.18436  [pdf, other]

    stat.ML cs.LG math.OC

    Statistically Optimal K-means Clustering via Nonnegative Low-rank Semidefinite Programming

    Authors: Yubo Zhuang, Xiaohui Chen, Yun Yang, Richard Y. Zhang

    Abstract: $K$-means clustering is a widely used machine learning method for identifying patterns in large datasets. Recently, semidefinite programming (SDP) relaxations have been proposed for solving the $K…

    Submitted 13 April, 2024; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted to ICLR 2024

  4. arXiv:2305.17224  [pdf, other]

    math.OC cs.LG stat.ML

    Fast and Accurate Estimation of Low-Rank Matrices from Noisy Measurements via Preconditioned Non-Convex Gradient Descent

    Authors: Gavin Zhang, Hong-Ming Chiu, Richard Y. Zhang

    Abstract: Non-convex gradient descent is a common approach for estimating a low-rank $n\times n$ ground truth matrix from noisy measurements, because it has per-iteration costs as low as $O(n)$ time, and is in theory capable of converging to a minimax optimal estimate. However, the practitioner is often constrained to just tens to hundreds of iterations, and the slow and/or inconsistent convergence of non-c…

    Submitted 27 February, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

  5. arXiv:2211.17244  [pdf, other]

    cs.LG math.OC stat.ML

    Tight Certification of Adversarially Trained Neural Networks via Nonconvex Low-Rank Semidefinite Relaxations

    Authors: Hong-Ming Chiu, Richard Y. Zhang

    Abstract: Adversarial training is well-known to produce high-quality neural network models that are empirically robust against adversarial perturbations. Nevertheless, once a model has been adversarially trained, one often desires a certification that the model is truly robust against all future attacks. Unfortunately, when faced with adversarially trained models, all existing approaches have significant tr…

    Submitted 14 June, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: ICML 2023

  6. arXiv:2208.11246  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Accelerating SGD for Highly Ill-Conditioned Huge-Scale Online Matrix Completion

    Authors: Gavin Zhang, Hong-Ming Chiu, Richard Y. Zhang

    Abstract: The matrix completion problem seeks to recover a $d\times d$ ground truth matrix of low rank $r\ll d$ from observations of its individual elements. Real-world matrix completion is often a huge-scale optimization problem, with $d$ so large that even the simplest full-dimension vector operations with $O(d)$ time complexity become prohibitively expensive. Stochastic gradient descent (SGD) is one of t…

    Submitted 22 October, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

    Comments: NeurIPS 2022

  7. arXiv:2207.01789  [pdf, other]

    math.OC cs.LG stat.ML

    Improved Global Guarantees for the Nonconvex Burer--Monteiro Factorization via Rank Overparameterization

    Authors: Richard Y. Zhang

    Abstract: We consider minimizing a twice-differentiable, $L$-smooth, and $μ$-strongly convex objective $φ$ over an $n\times n$ positive semidefinite matrix $M\succeq0$, under the assumption that the minimizer $M^{\star}$ has low rank $r^{\star}\ll n$. Following the Burer--Monteiro approach, we instead minimize the nonconvex objective $f(X)=φ(XX^{T})$ over a factor matrix $X$ of size $n\times r$. This substa…

    Submitted 8 July, 2024; v1 submitted 4 July, 2022; originally announced July 2022.

    Journal ref: Mathematical Programming, 2024

  8. arXiv:2206.03345  [pdf, other]

    math.OC cs.LG stat.ML

    Preconditioned Gradient Descent for Overparameterized Nonconvex Burer--Monteiro Factorization with Global Optimality Certification

    Authors: Gavin Zhang, Salar Fattahi, Richard Y. Zhang

    Abstract: We consider using gradient descent to minimize the nonconvex function $f(X)=φ(XX^{T})$ over an $n\times r$ factor matrix $X$, in which $φ$ is an underlying smooth convex cost function defined over $n\times n$ matrices. While only a second-order stationary point $X$ can be provably found in reasonable time, if $X$ is additionally rank deficient, then its rank deficiency certifies it as being global…

    Submitted 21 April, 2025; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: v2: accepted at JMLR. v3: minor correction in proof of Lemma 27

  9. arXiv:2104.10790  [pdf, other]

    math.OC cs.LG stat.ML

    Sharp Global Guarantees for Nonconvex Low-rank Recovery in the Noisy Overparameterized Regime

    Authors: Richard Y. Zhang

    Abstract: Recent work established that rank overparameterization eliminates spurious local minima in nonconvex low-rank matrix recovery under the restricted isometry property (RIP). But this does not fully explain the practical success of overparameterization, because real algorithms can still become trapped at nonstrict saddle points (approximate second-order points with arbitrarily small negative curvatur…

    Submitted 6 May, 2025; v1 submitted 21 April, 2021; originally announced April 2021.

    Comments: v2 corrects minor typos; v3 complete overhaul with new results on minimax-optimal recovery under noise and asymmetric factorization

  10. arXiv:2006.06915  [pdf, other]

    cs.LG stat.ML

    How Many Samples is a Good Initial Point Worth in Low-rank Matrix Recovery?

    Authors: Gavin Zhang, Richard Y. Zhang

    Abstract: Given a sufficiently large amount of labeled data, the non-convex low-rank matrix recovery problem contains no spurious local minima, so a local optimization algorithm is guaranteed to converge to a global minimum starting from any initial guess. However, the actual amount of data needed by this theoretical guarantee is very pessimistic, as it must prevent spurious local minima from existing anywh…

    Submitted 12 November, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: 16 pages, 2 figures

  11. arXiv:2006.06759  [pdf, other]

    math.OC cs.LG stat.ML

    On the Tightness of Semidefinite Relaxations for Certifying Robustness to Adversarial Examples

    Authors: Richard Y. Zhang

    Abstract: The robustness of a neural network to adversarial examples can be provably certified by solving a convex relaxation. If the relaxation is loose, however, then the resulting certificate can be too conservative to be practically useful. Recently, a less conservative robustness certificate was proposed, based on a semidefinite programming (SDP) relaxation of the ReLU activation function. In this pape…

    Submitted 26 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020

  12. arXiv:1901.01631  [pdf, other]

    cs.LG math.OC stat.ML

    Sharp Restricted Isometry Bounds for the Inexistence of Spurious Local Minima in Nonconvex Matrix Recovery

    Authors: Richard Y. Zhang, Somayeh Sojoudi, Javad Lavaei

    Abstract: Nonconvex matrix recovery is known to contain no spurious local minima under a restricted isometry property (RIP) with a sufficiently small RIP constant $δ$. If $δ$ is too large, however, then counterexamples containing spurious local minima are known to exist. In this paper, we introduce a proof technique that is capable of establishing sharp thresholds on $δ$ to guarantee the inexistence of spur…

    Submitted 13 June, 2019; v1 submitted 6 January, 2019; originally announced January 2019.

    Comments: v2: fixed several typos; v3: accepted at JMLR

    Journal ref: Journal of Machine Learning Research 20 (114): 1-34, 2019

  13. arXiv:1805.10251  [pdf, other]

    cs.LG math.OC stat.ML

    How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery?

    Authors: Richard Y. Zhang, Cédric Josz, Somayeh Sojoudi, Javad Lavaei

    Abstract: When the linear measurements of an instance of low-rank matrix recovery satisfy a restricted isometry property (RIP)---i.e. they are approximately norm-preserving---the problem is known to contain no spurious local minima, so exact recovery is guaranteed. In this paper, we show that moderate RIP is not enough to eliminate spurious local minima, so existing results can only hold for near-perfect RI…

    Submitted 30 October, 2018; v1 submitted 25 May, 2018; originally announced May 2018.

    Comments: 32nd Conference on Neural Information Processing Systems (NIPS 2018)

  14. arXiv:1802.04911  [pdf, ps, other]

    stat.ML cs.LG math.OC stat.CO

    Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion

    Authors: Richard Y. Zhang, Salar Fattahi, Somayeh Sojoudi

    Abstract: The sparse inverse covariance estimation problem is commonly solved using an $\ell_{1}$-regularized Gaussian maximum likelihood estimator known as "graphical lasso", but its computational cost becomes prohibitive for large data sets. A recent line of results showed--under mild assumptions--that the graphical lasso estimator can be retrieved by soft-thresholding the sample covariance matrix and sol…

    Submitted 6 June, 2018; v1 submitted 13 February, 2018; originally announced February 2018.

    Comments: 35-th International Conference on Machine Learning (ICML 2018)

  15. arXiv:1711.09131  [pdf, ps, other]

    stat.ML stat.CO

    Sparse Inverse Covariance Estimation for Chordal Structures

    Authors: Salar Fattahi, Richard Y. Zhang, Somayeh Sojoudi

    Abstract: In this paper, we consider the Graphical Lasso (GL), a popular optimization problem for learning the sparse representations of high-dimensional datasets, which is well-known to be computationally expensive for large-scale problems. Recently, we have shown that the sparsity pattern of the optimal solution of GL is equivalent to the one obtained from simply thresholding the sample covariance matrix,…

    Submitted 24 November, 2017; originally announced November 2017.