-
Sensor network localization has a benign landscape after low-dimensional relaxation
Authors:
Christopher Criscitiello,
Andrew D. McRae,
Quentin Rebjock,
Nicolas Boumal
Abstract:
We consider the sensor network localization problem, also known as multidimensional scaling or Euclidean distance matrix completion. Given a ground truth configuration of $n$ points in $\mathbb{R}^\ell$, we observe a subset of the pairwise distances and aim to recover the underlying configuration (up to rigid transformations). We show with a simple counterexample that the associated optimization problem is nonconvex and may admit spurious local minimizers, even when all distances are known. Yet, inspired by numerical experiments, we argue that all second-order critical points become global minimizers when the problem is relaxed by optimizing over configurations in dimension $k > \ell$. Specifically, we show this for two settings, both when all pairwise distances are known: (1) for arbitrary ground truth points and $k = O(\sqrt{\ell n})$, and (2) for isotropic random ground truth points and $k = O(\ell + \log n)$. To prove these results, we identify and exploit key properties of the linear map which sends inner products to squared distances.
Submitted 21 July, 2025;
originally announced July 2025.
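The abstract above refers to the linear map that sends inner products to squared distances. A minimal sketch of that map follows, assuming the standard identity $\|x_i - x_j\|^2 = G_{ii} + G_{jj} - 2G_{ij}$ for the Gram matrix $G_{ij} = \langle x_i, x_j \rangle$. The function names (`gram`, `gram_to_sq_dists`, `sq_dists_direct`) are hypothetical and chosen for illustration only; they are not taken from the paper.

```python
import random


def gram(X):
    """Gram matrix of inner products: G[i][j] = <x_i, x_j>."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def gram_to_sq_dists(G):
    """Linear map G -> D with D[i][j] = G[i][i] + G[j][j] - 2 * G[i][j]."""
    n = len(G)
    return [[G[i][i] + G[j][j] - 2.0 * G[i][j] for j in range(n)]
            for i in range(n)]


def sq_dists_direct(X):
    """Squared Euclidean distances computed directly from the points."""
    return [[sum((a - b) ** 2 for a, b in zip(xi, xj)) for xj in X]
            for xi in X]


# Toy instance (assumed sizes): n = 5 points in dimension 2.
random.seed(0)
n_points, dim = 5, 2
X = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_points)]

D_from_gram = gram_to_sq_dists(gram(X))
D_direct = sq_dists_direct(X)

# Largest entrywise disagreement between the two computations.
max_gap = max(abs(D_from_gram[i][j] - D_direct[i][j])
              for i in range(n_points) for j in range(n_points))
```

Because the map is linear in $G$, the distance data depend on the configuration only through its Gram matrix, which is why the problem is invariant under rigid transformations fixing the origin.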
-
Phase retrieval and matrix sensing via benign and overparametrized nonconvex optimization
Authors:
Andrew D. McRae
Abstract:
We study a nonconvex optimization algorithmic approach to phase retrieval and the more general problem of semidefinite low-rank matrix sensing. Specifically, we analyze the nonconvex landscape of a quartic Burer-Monteiro factored least-squares optimization problem. We develop a new analysis framework, taking advantage of the semidefinite problem structure, to understand the properties of second-order critical points -- specifically, whether they (approximately) recover the ground truth matrix. We show that it can be helpful to (mildly) overparametrize the problem, that is, to optimize over matrices of higher rank than the ground truth. We then apply this framework to several well-studied problem instances: in addition to recovering existing state-of-the-art phase retrieval landscape guarantees (without overparametrization), we show that overparametrizing by a factor at most logarithmic in the dimension allows recovery with optimal statistical sample complexity and error for the problems of (1) phase retrieval with sub-Gaussian measurements and (2) more general semidefinite matrix sensing with rank-1 Gaussian measurements. Previously, such statistical results had been shown only for estimators based on semidefinite programming. More generally, our analysis is partially based on the powerful method of convex dual certificates, suggesting that it could be applied to a much wider class of problems.
Submitted 12 July, 2025; v1 submitted 5 May, 2025;
originally announced May 2025.
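For orientation, one common way to write a quartic Burer-Monteiro factored least-squares problem of the kind the abstract above describes is sketched here. The symbols ($A_i$, $a_i$, $b_i$, $Y$, $d$, $p$, $m$), the real-valued setting, and the absence of a normalizing constant are assumptions for illustration; the paper's exact formulation may differ.

```latex
% Semidefinite low-rank matrix sensing, factored as X = Y Y^T
% (assumed notation: m measurements, Y has p columns, p >= rank of ground truth)
\min_{Y \in \mathbb{R}^{d \times p}} \;
  \sum_{i=1}^{m} \bigl( \langle A_i, Y Y^\top \rangle - b_i \bigr)^2

% Phase retrieval as the special case A_i = a_i a_i^T, where
% <a_i a_i^T, Y Y^T> = || Y^T a_i ||^2
\min_{Y \in \mathbb{R}^{d \times p}} \;
  \sum_{i=1}^{m} \bigl( \| Y^\top a_i \|_2^2 - b_i \bigr)^2
```

The objective is a degree-four polynomial in the entries of $Y$, and taking $p$ larger than the ground truth rank corresponds to the overparametrization the abstract mentions.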
-
Benign landscapes for synchronization on spheres via normalized Laplacian matrices
Authors:
Andrew D. McRae
Abstract:
We study the nonconvex optimization landscapes of synchronization problems on spheres. First, we present new results for the statistical problem of synchronization over the two-element group $\mathbf{Z}_2$. We consider the nonconvex least-squares problem with $\mathbf{Z}_2 = \{\pm 1\}$ relaxed to the unit sphere in $\mathbf{R}^r$ for $r \geq 2$; for several popular models, including graph clustering under the binary stochastic block model, we show that, for any $r \geq 2$, every second-order critical point recovers the ground truth in the asymptotic regimes where exact recovery is information-theoretically possible. Such statistical optimality via spherical relaxations had previously only been shown for (potentially arbitrarily) larger relaxation dimension $r$. Second, we consider the global synchronization of networks of coupled oscillators under the (homogeneous) Kuramoto model. We prove new and optimal asymptotic results for random signed networks on an Erdős--Rényi graph, and we give new and simple proofs for several existing state-of-the-art results. Our key tool is a deterministic landscape condition that extends a recent result of Rakoto Endor and Waldspurger. This result says that, if a certain problem-dependent Laplacian matrix has small enough condition number, the nonconvex landscape is benign. Our extension allows the condition number to include an arbitrary diagonal preconditioner, which gives tighter results for many problems. We show that, for the synchronization of Kuramoto oscillator networks on nearest-neighbor circulant graphs as studied by Wiley, Strogatz, and Girvan, this condition is optimal. We also prove a natural complex extension that may be of interest for synchronization on the special orthogonal group $\operatorname{SO}(2)$.
Submitted 24 March, 2025;
originally announced March 2025.
-
Nonconvex landscapes for $\mathbf{Z}_2$ synchronization and graph clustering are benign near exact recovery thresholds
Authors:
Andrew D. McRae,
Pedro Abdalla,
Afonso S. Bandeira,
Nicolas Boumal
Abstract:
We study the optimization landscape of a smooth nonconvex program arising from synchronization over the two-element group $\mathbf{Z}_2$, that is, recovering $z_1, \dots, z_n \in \{\pm 1\}$ from (noisy) relative measurements $R_{ij} \approx z_i z_j$. Starting from a max-cut--like combinatorial problem, for integer parameter $r \geq 2$, the nonconvex problem we study can be viewed both as a rank-$r$ Burer--Monteiro factorization of the standard max-cut semidefinite relaxation and as a relaxation of $\{ \pm 1 \}$ to the unit sphere in $\mathbf{R}^r$. First, we present deterministic, non-asymptotic conditions on the measurement graph and noise under which every second-order critical point of the nonconvex problem yields exact recovery of the ground truth. Then, via probabilistic analysis, we obtain asymptotic guarantees for three benchmark problems: (1) synchronization with a complete graph and Gaussian noise, (2) synchronization with an Erdős--Rényi random graph and Bernoulli noise, and (3) graph clustering under the binary symmetric stochastic block model. In each case, we have, asymptotically as the problem size goes to infinity, a benign nonconvex landscape near a previously-established optimal threshold for exact recovery; we can approach this threshold to arbitrary precision with large enough (but finite) rank parameter $r$. In addition, our results are robust to monotone adversaries.
Submitted 18 July, 2024;
originally announced July 2024.
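The relaxation of $\{\pm 1\}$ to the unit sphere in $\mathbf{R}^r$ described in the abstract above can be illustrated on a toy instance. The sketch below assumes a noiseless complete measurement graph ($R_{ij} = z_i z_j$) and uses plain projected gradient ascent on $\sum_{i,j} R_{ij} \langle y_i, y_j \rangle$; the paper's results concern second-order critical points rather than any particular algorithm, so the solver, step size, iteration count, and rounding rule here are all illustrative assumptions.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def ascent_step(Y, R, step):
    """One projected gradient ascent step on sum_ij R[i][j] * <Y[i], Y[j]>.

    Each row of Y is moved along its partial gradient and then projected
    back onto the unit sphere in R^r.
    """
    n, r = len(Y), len(Y[0])
    new_Y = []
    for i in range(n):
        grad_i = [sum(R[i][j] * Y[j][c] for j in range(n)) for c in range(r)]
        new_Y.append(normalize([Y[i][c] + step * grad_i[c] for c in range(r)]))
    return new_Y


def round_to_signs(Y):
    """Round to +/-1 using the sign of the inner product with the first row."""
    ref = Y[0]
    return [1 if sum(a * b for a, b in zip(ref, y)) >= 0 else -1 for y in Y]


random.seed(1)
n, r = 10, 2
z = [random.choice([-1, 1]) for _ in range(n)]          # ground truth signs
R = [[z[i] * z[j] for j in range(n)] for i in range(n)]  # noiseless measurements

# Random initialization on the product of unit spheres.
Y = [normalize([random.gauss(0.0, 1.0) for _ in range(r)]) for _ in range(n)]
for _ in range(500):
    Y = ascent_step(Y, R, step=1.0 / n)

# Recovery is only possible up to a global sign, fixed here by the first entry.
z_hat = round_to_signs(Y)
z_ref = [z[0] * zi for zi in z]
```

In this toy setting `z_hat` is compared against `z_ref`, the ground truth with its global sign fixed by the first entry, since $z$ and $-z$ produce the same measurements.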
-
Synchronization on circles and spheres with nonlinear interactions
Authors:
Christopher Criscitiello,
Quentin Rebjock,
Andrew D. McRae,
Nicolas Boumal
Abstract:
We consider the dynamics of $n$ points on a sphere in $\mathbb{R}^d$ ($d \geq 2$) which attract each other according to a function $\varphi$ of their inner products. When $\varphi$ is linear ($\varphi(t) = t$), the points converge to a common value (i.e., synchronize) in various connectivity scenarios: this is part of classical work on Kuramoto oscillator networks. When $\varphi$ is exponential ($\varphi(t) = e^{\beta t}$), these dynamics correspond to a limit of how idealized transformers process data, as described by Geshkovski et al. (2024). Accordingly, they ask whether synchronization occurs for exponential $\varphi$.
In the context of consensus for multi-agent control, Markdahl et al. (2018) show that for $d \geq 3$ (spheres), if the interaction graph is connected and $\varphi$ is increasing and convex, then the system synchronizes. What is the situation on circles ($d=2$)? First, we show that $\varphi$ being increasing and convex is no longer sufficient. Then we identify a new condition (that the Taylor coefficients of $\varphi'$ are decreasing) under which we do have synchronization on the circle. In so doing, we provide some answers to the open problems posed by Geshkovski et al. (2024).
Submitted 28 May, 2024;
originally announced May 2024.
-
Low solution rank of the matrix LASSO under RIP with consequences for rank-constrained algorithms
Authors:
Andrew D. McRae
Abstract:
We show that solutions to the popular convex matrix LASSO problem (nuclear-norm--penalized linear least-squares) have low rank under similar assumptions as required by classical low-rank matrix sensing error bounds. Although the purpose of the nuclear norm penalty is to promote low solution rank, a proof has not yet (to our knowledge) been provided outside very specific circumstances. Furthermore, we show that this result has significant theoretical consequences for nonconvex rank-constrained optimization approaches. Specifically, we show that if (a) the ground truth matrix has low rank, (b) the (linear) measurement operator has the matrix restricted isometry property (RIP), and (c) the measurement error is small enough relative to the nuclear norm penalty, then the (unique) LASSO solution has rank (approximately) bounded by that of the ground truth. From this, we show (a) that a low-rank--projected proximal gradient descent algorithm will converge linearly to the LASSO solution from any initialization, and (b) that the nonconvex landscape of the low-rank Burer-Monteiro--factored problem formulation is benign in the sense that all second-order critical points are globally optimal and yield the LASSO solution.
Submitted 22 April, 2025; v1 submitted 19 April, 2024;
originally announced April 2024.
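As a reference point for the abstract above, nuclear-norm-penalized linear least squares is commonly written as follows. The notation ($\mathcal{A}$, $y$, $\lambda$, $U$, $V$, $r$) and the factor $\tfrac{1}{2}$ are assumptions for illustration, and the paper's exact scaling may differ. The second display uses the standard variational form of the nuclear norm, $\|X\|_* = \min_{X = UV^\top} \tfrac{1}{2}(\|U\|_F^2 + \|V\|_F^2)$, to sketch what a Burer-Monteiro factored version can look like.

```latex
% Matrix LASSO (assumed notation: linear measurement operator \mathcal{A},
% observations y, penalty parameter \lambda > 0)
\min_{X \in \mathbb{R}^{d_1 \times d_2}} \;
  \frac{1}{2} \bigl\| \mathcal{A}(X) - y \bigr\|_2^2 + \lambda \| X \|_*

% A rank-r factored counterpart, with X = U V^T
\min_{U \in \mathbb{R}^{d_1 \times r},\; V \in \mathbb{R}^{d_2 \times r}} \;
  \frac{1}{2} \bigl\| \mathcal{A}(U V^\top) - y \bigr\|_2^2
  + \frac{\lambda}{2} \bigl( \| U \|_F^2 + \| V \|_F^2 \bigr)
```

When $r$ is at least the rank of a solution of the convex problem, the two problems have the same optimal value, which is why a bound on the LASSO solution rank is relevant to the factored formulation.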
-
Benign landscapes of low-dimensional relaxations for orthogonal synchronization on general graphs
Authors:
Andrew D. McRae,
Nicolas Boumal
Abstract:
Orthogonal group synchronization is the problem of estimating $n$ elements $Z_1, \ldots, Z_n$ from the $r \times r$ orthogonal group given some relative measurements $R_{ij} \approx Z_i^{}Z_j^{-1}$. The least-squares formulation is nonconvex. To avoid its local minima, a Shor-type convex relaxation squares the dimension of the optimization problem from $O(n)$ to $O(n^2)$. Alternatively, Burer--Monteiro-type nonconvex relaxations have generic landscape guarantees at dimension $O(n^{3/2})$. For smaller relaxations, the problem structure matters. It has been observed in the robotics literature that, for SLAM problems, it seems sufficient to increase the dimension by a small constant multiple over the original. We partially explain this. This also has implications for Kuramoto oscillators.
Specifically, we minimize the least-squares cost function in terms of estimators $Y_1, \ldots, Y_n$. For $p \geq r$, each $Y_i$ is relaxed to the Stiefel manifold $\mathrm{St}(r, p)$ of $r \times p$ matrices with orthonormal rows. The available measurements implicitly define a (connected) graph $G$ on $n$ vertices. In the noiseless case, we show that, for all connected graphs $G$, second-order critical points are globally optimal as soon as $p \geq r+2$. (This implies that Kuramoto oscillators on $\mathrm{St}(r, p)$ synchronize for all $p \geq r + 2$.) This result is the best possible for general graphs; the previous best known result requires $2p \geq 3(r + 1)$. For $p > r + 2$, our result is robust to modest amounts of noise (depending on $p$ and $G$). Our proof uses a novel randomized choice of tangent direction to prove (near-)optimality of second-order critical points. Finally, we partially extend our noiseless landscape results to the complex case (unitary group); we show that there are no spurious local minima when $2p \geq 3r$.
Submitted 8 February, 2024; v1 submitted 6 July, 2023;
originally announced July 2023.
-
Harmless interpolation in regression and classification with structured features
Authors:
Andrew D. McRae,
Santhosh Karnik,
Mark A. Davenport,
Vidya Muthukumar
Abstract:
Overparametrized neural networks tend to perfectly fit noisy training data yet generalize well on test data. Inspired by this empirical observation, recent work has sought to understand this phenomenon of benign overfitting or harmless interpolation in the much simpler linear model. Previous theoretical work critically assumes that either the data features are statistically independent or the input data is high-dimensional; this precludes general nonparametric settings with structured feature maps. In this paper, we present a general and flexible framework for upper bounding regression and classification risk in a reproducing kernel Hilbert space. A key contribution is that our framework describes precise sufficient conditions on the data Gram matrix under which harmless interpolation occurs. Our results recover prior independent-features results (with a much simpler analysis), but they furthermore show that harmless interpolation can occur in more general settings such as features that are a bounded orthonormal system. Furthermore, our results show an asymptotic separation between classification and regression performance in a manner that was previously only shown for Gaussian features.
Submitted 21 February, 2022; v1 submitted 9 November, 2021;
originally announced November 2021.
-
Optimal convex lifted sparse phase retrieval and PCA with an atomic matrix norm regularizer
Authors:
Andrew D. McRae,
Justin Romberg,
Mark A. Davenport
Abstract:
We present novel analysis and algorithms for solving sparse phase retrieval and sparse principal component analysis (PCA) with convex lifted matrix formulations. The key innovation is a new mixed atomic matrix norm that, when used as regularization, promotes low-rank matrices with sparse factors. We show that convex programs with this atomic norm as a regularizer provide near-optimal sample complexity and error rate guarantees for sparse phase retrieval and sparse PCA. While we do not know how to solve the convex programs exactly with an efficient algorithm, for the phase retrieval case we carefully analyze the program and its dual and thereby derive a practical heuristic algorithm. We show empirically that this practical algorithm performs similarly to existing state-of-the-art algorithms.
Submitted 26 September, 2022; v1 submitted 8 November, 2021;
originally announced November 2021.
-
Low-rank matrix completion and denoising under Poisson noise
Authors:
Andrew D. McRae,
Mark A. Davenport
Abstract:
This paper considers the problem of estimating a low-rank matrix from the observation of all or a subset of its entries in the presence of Poisson noise. When we observe all entries, this is a problem of matrix denoising; when we observe only a subset of the entries, this is a problem of matrix completion. In both cases, we exploit an assumption that the underlying matrix is low-rank. Specifically, we analyze several estimators, including a constrained nuclear-norm minimization program, nuclear-norm regularized least squares, and a nonconvex constrained low-rank optimization problem. We show that for all three estimators, with high probability, we have an upper error bound (in the Frobenius norm error metric) that depends on the matrix rank, the fraction of the elements observed, and maximal row and column sums of the true matrix. We furthermore show that the above results are minimax optimal (within a universal constant) in classes of matrices with low rank and bounded row and column sums. We also extend these results to handle the case of matrix multinomial denoising and completion.
Submitted 30 April, 2020; v1 submitted 11 July, 2019;
originally announced July 2019.