-
Sensor network localization has a benign landscape after low-dimensional relaxation
Authors:
Christopher Criscitiello,
Andrew D. McRae,
Quentin Rebjock,
Nicolas Boumal
Abstract:
We consider the sensor network localization problem, also known as multidimensional scaling or Euclidean distance matrix completion. Given a ground truth configuration of $n$ points in $\mathbb{R}^\ell$, we observe a subset of the pairwise distances and aim to recover the underlying configuration (up to rigid transformations). We show with a simple counterexample that the associated optimization problem is nonconvex and may admit spurious local minimizers, even when all distances are known. Yet, inspired by numerical experiments, we argue that all second-order critical points become global minimizers when the problem is relaxed by optimizing over configurations in dimension $k > \ell$. Specifically, we show this for two settings, both when all pairwise distances are known: (1) for arbitrary ground truth points with $k = O(\sqrt{\ell n})$, and (2) for isotropic random ground truth points with $k = O(\ell + \log n)$. To prove these results, we identify and exploit key properties of the linear map which sends inner products to squared distances.
Submitted 21 July, 2025;
originally announced July 2025.
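A minimal sketch of the objects named in the abstract above, in pure Python. The linear map from a Gram matrix $G$ of inner products to squared distances is $G_{ii} + G_{jj} - 2G_{ij}$. The quartic least-squares cost, the zero-padding lift to dimension $k$, and all function names are illustrative assumptions; the abstract does not state the exact objective used in the paper.

```python
import random

def gram(points):
    """Gram matrix of inner products, G[i][j] = <x_i, x_j>."""
    return [[sum(a * b for a, b in zip(p, q)) for q in points] for p in points]

def gram_to_sqdist(G):
    """Linear map sending inner products to squared distances:
    D[i][j] = G[i][i] + G[j][j] - 2 G[i][j]."""
    n = len(G)
    return [[G[i][i] + G[j][j] - 2.0 * G[i][j] for j in range(n)] for i in range(n)]

def cost(points, target_sqdist):
    """Assumed cost: sum over pairs i < j of (||y_i - y_j||^2 - D_ij)^2,
    with all pairwise distances known."""
    D = gram_to_sqdist(gram(points))
    n = len(points)
    return sum((D[i][j] - target_sqdist[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

random.seed(0)
n, ell, k = 6, 2, 4  # hypothetical sizes, with k > ell

# Ground truth configuration in R^ell and its squared distances.
truth = [[random.gauss(0.0, 1.0) for _ in range(ell)] for _ in range(n)]
target = gram_to_sqdist(gram(truth))

# Relaxation: optimize over configurations in R^k. Zero-padding the ground
# truth gives a point of the relaxed problem with the same distances.
lifted_truth = [p + [0.0] * (k - ell) for p in truth]
relaxed_cost_at_truth = cost(lifted_truth, target)

# A random configuration in R^k generally has positive cost.
guess = [[random.gauss(0.0, 1.0) for _ in range(k)] for _ in range(n)]
random_cost = cost(guess, target)
```

The lifted ground truth remains a global minimizer of the relaxed cost, which is why enlarging the search dimension does not change the optimal value.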
-
Horospherically Convex Optimization on Hadamard Manifolds Part I: Analysis and Algorithms
Authors:
Christopher Criscitiello,
Jungbin Kim
Abstract:
Geodesic convexity (g-convexity) is a natural generalization of convexity to Riemannian manifolds. However, g-convexity lacks many desirable properties satisfied by Euclidean convexity. For instance, the natural notions of half-spaces and affine functions are themselves not g-convex. Moreover, recent studies have shown that the oracle complexity of geodesically convex optimization necessarily depends on the curvature of the manifold (Criscitiello and Boumal, 2022; Criscitiello and Boumal, 2023; Hamilton and Moitra, 2021), a computational bottleneck for several problems, e.g., tensor scaling. Recently, Lewis et al. (2024) addressed this challenge by proving curvature-independent convergence of subgradient descent, assuming horospherical convexity of the objective's sublevel sets. Using a similar idea, we introduce a generalization of convex functions to Hadamard manifolds, utilizing horoballs and Busemann functions as building blocks (as proxies for half-spaces and affine functions). We refer to this new notion as horospherical convexity (h-convexity). We provide algorithms for both nonsmooth and smooth h-convex optimization, which have curvature-independent guarantees exactly matching those from Euclidean space; this includes generalizations of subgradient descent and Nesterov's accelerated method. Motivated by applications, we extend these algorithms and their convergence rates to minimizing a sum of horospherically convex functions, assuming access to a weighted-Fréchet-mean oracle.
Submitted 22 May, 2025;
originally announced May 2025.
-
Synchronization on circles and spheres with nonlinear interactions
Authors:
Christopher Criscitiello,
Quentin Rebjock,
Andrew D. McRae,
Nicolas Boumal
Abstract:
We consider the dynamics of $n$ points on a sphere in $\mathbb{R}^d$ ($d \geq 2$) which attract each other according to a function $\varphi$ of their inner products. When $\varphi$ is linear ($\varphi(t) = t$), the points converge to a common value (i.e., synchronize) in various connectivity scenarios: this is part of classical work on Kuramoto oscillator networks. When $\varphi$ is exponential ($\varphi(t) = e^{\beta t}$), these dynamics correspond to a limit of how idealized transformers process data, as described by Geshkovski et al. (2024). Accordingly, they ask whether synchronization occurs for exponential $\varphi$.
In the context of consensus for multi-agent control, Markdahl et al. (2018) show that for $d \geq 3$ (spheres), if the interaction graph is connected and $\varphi$ is increasing and convex, then the system synchronizes. What is the situation on circles ($d=2$)? First, we show that $\varphi$ being increasing and convex is no longer sufficient. Then we identify a new condition (that the Taylor coefficients of $\varphi'$ are decreasing) under which we do have synchronization on the circle. In so doing, we provide some answers to the open problems posed by Geshkovski et al. (2024).
Submitted 28 May, 2024;
originally announced May 2024.
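The circle case ($d = 2$) in the abstract above can be simulated with a few lines of pure Python. This is only a sketch under stated assumptions: points are parametrized by angles $\theta_i$, so inner products are $\cos(\theta_i - \theta_j)$; the interaction graph is complete; the dynamics are taken to be gradient ascent on $\sum_{i<j} \varphi(\cos(\theta_i - \theta_j))$, discretized with explicit Euler steps. The step size, initial angles and function names are hypothetical, and the paper's exact normalization may differ.

```python
import math

def euler_step(thetas, dphi, dt):
    """One explicit Euler step of
    d(theta_i)/dt = sum_j phi'(cos(theta_i - theta_j)) * sin(theta_j - theta_i)."""
    new_thetas = []
    for i, ti in enumerate(thetas):
        velocity = sum(
            dphi(math.cos(ti - tj)) * math.sin(tj - ti)
            for j, tj in enumerate(thetas)
            if j != i
        )
        new_thetas.append(ti + dt * velocity)
    return new_thetas

def spread(thetas):
    """Width of the arc containing all points. Only meaningful while the
    angles stay within an interval shorter than pi, as they do here."""
    return max(thetas) - min(thetas)

beta = 1.0

def dphi(t):
    """Derivative of the exponential interaction phi(t) = exp(beta * t)."""
    return beta * math.exp(beta * t)

# Hypothetical initial angles, all inside a half-circle.
thetas = [0.0, 0.4, 0.9, 1.5, 2.0]
initial_spread = spread(thetas)

for _ in range(2000):
    thetas = euler_step(thetas, dphi, dt=0.01)

final_spread = spread(thetas)
```

Starting inside a half-circle is the easy regime; the questions discussed in the abstract concern arbitrary initial configurations, which this sketch does not probe.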
-
Open Problem: Polynomial linearly-convergent method for geodesically convex optimization?
Authors:
Christopher Criscitiello,
David Martínez-Rubio,
Nicolas Boumal
Abstract:
Let $f \colon \mathcal{M} \to \mathbb{R}$ be a Lipschitz and geodesically convex function defined on a $d$-dimensional Riemannian manifold $\mathcal{M}$. Does there exist a first-order deterministic algorithm which (a) uses at most $O(\mathrm{poly}(d) \log(\varepsilon^{-1}))$ subgradient queries to find a point with target accuracy $\varepsilon$, and (b) requires only $O(\mathrm{poly}(d))$ arithmetic operations per query? In convex optimization, the classical ellipsoid method achieves this. After detailing related work, we provide an ellipsoid-like algorithm with query complexity $O(d^2 \log^2(\varepsilon^{-1}))$ and per-query complexity $O(d^2)$ for the limited case where $\mathcal{M}$ has constant curvature (hemisphere or hyperbolic space). We then detail possible approaches and corresponding obstacles for designing an ellipsoid-like method for general Riemannian manifolds.
Submitted 24 July, 2023;
originally announced July 2023.
-
Curvature and complexity: Better lower bounds for geodesically convex optimization
Authors:
Christopher Criscitiello,
Nicolas Boumal
Abstract:
We study the query complexity of geodesically convex (g-convex) optimization on a manifold. To isolate the effect of that manifold's curvature, we primarily focus on hyperbolic spaces. In a variety of settings (smooth or not; strongly g-convex or not; high- or low-dimensional), known upper bounds worsen with curvature. It is natural to ask whether this is warranted, or an artifact.
For many such settings, we propose a first set of lower bounds which indeed confirm that (negative) curvature is detrimental to complexity. To do so, we build on recent lower bounds (Hamilton and Moitra, 2021; Criscitiello and Boumal, 2022) for the particular case of smooth, strongly g-convex optimization. Using a number of techniques, we also secure lower bounds which capture dependence on condition number and optimality gap, which was not previously the case.
We suspect these bounds are not optimal. We conjecture optimal ones, and support them with a matching lower bound for a class of algorithms which includes subgradient descent, and a lower bound for a related game. Lastly, to pinpoint the difficulty of proving lower bounds, we study how negative curvature influences (and sometimes obstructs) interpolation with g-convex functions.
Submitted 24 July, 2023; v1 submitted 5 June, 2023;
originally announced June 2023.
-
Accelerated Methods for Riemannian Min-Max Optimization Ensuring Bounded Geometric Penalties
Authors:
David Martínez-Rubio,
Christophe Roux,
Christopher Criscitiello,
Sebastian Pokutta
Abstract:
In this work, we study optimization problems of the form $\min_x \max_y f(x, y)$, where $f(x, y)$ is defined on a product Riemannian manifold $\mathcal{M} \times \mathcal{N}$ and is $\mu_x$-strongly geodesically convex (g-convex) in $x$ and $\mu_y$-strongly g-concave in $y$, for $\mu_x, \mu_y \geq 0$. We design accelerated methods when $f$ is $(L_x, L_y, L_{xy})$-smooth and $\mathcal{M}$, $\mathcal{N}$ are Hadamard. To that aim we introduce new g-convex optimization results, of independent interest: we show global linear convergence for metric-projected Riemannian gradient descent and improve existing accelerated methods by reducing geometric constants. Additionally, we complete the analysis of two previous works applying to the Riemannian min-max case by removing an assumption about iterates staying in a pre-specified compact set.
Submitted 30 October, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Negative curvature obstructs acceleration for strongly geodesically convex optimization, even with exact first-order oracles
Authors:
Christopher Criscitiello,
Nicolas Boumal
Abstract:
Hamilton and Moitra (2021) showed that, in certain regimes, it is not possible to accelerate Riemannian gradient descent in the hyperbolic plane if we restrict ourselves to algorithms which make queries in a (large) bounded domain and which receive gradients and function values corrupted by a (small) amount of noise. We show that acceleration remains unachievable for any deterministic algorithm which receives exact gradient and function-value information (unbounded queries, no noise). Our results hold for the classes of strongly and nonstrongly geodesically convex functions, and for a large class of Hadamard manifolds including hyperbolic spaces and the symmetric space $\mathrm{SL}(n) / \mathrm{SO}(n)$ of positive definite $n \times n$ matrices of determinant one. This cements a surprising gap between the complexity of convex optimization and geodesically convex optimization: for hyperbolic spaces, Riemannian gradient descent is optimal on the class of smooth and strongly geodesically convex functions, in the regime where the condition number scales with the radius of the optimization domain. The key idea for proving the lower bound consists of perturbing the hard functions of Hamilton and Moitra (2021) with sums of bump functions chosen by a resisting oracle.
Submitted 8 June, 2023; v1 submitted 25 November, 2021;
originally announced November 2021.
-
An accelerated first-order method for non-convex optimization on manifolds
Authors:
Christopher Criscitiello,
Nicolas Boumal
Abstract:
We describe the first gradient methods on Riemannian manifolds to achieve accelerated rates in the non-convex case. Under Lipschitz assumptions on the Riemannian gradient and Hessian of the cost function, these methods find approximate first-order critical points faster than regular gradient descent. A randomized version also finds approximate second-order critical points. Both the algorithms and their analyses build extensively on existing work in the Euclidean case. The basic operation consists in running the Euclidean accelerated gradient descent method (appropriately safe-guarded against non-convexity) in the current tangent space, then moving back to the manifold and repeating. This requires lifting the cost function from the manifold to the tangent space, which can be done for example through the Riemannian exponential map. For this approach to succeed, the lifted cost function (called the pullback) must retain certain Lipschitz properties. As a contribution of independent interest, we prove precise claims to that effect, with explicit constants. Those claims are affected by the Riemannian curvature of the manifold, which in turn affects the worst-case complexity bounds for our optimization algorithms.
Submitted 25 November, 2021; v1 submitted 5 August, 2020;
originally announced August 2020.
-
Efficiently escaping saddle points on manifolds
Authors:
Chris Criscitiello,
Nicolas Boumal
Abstract:
Smooth, non-convex optimization problems on Riemannian manifolds occur in machine learning as a result of orthonormality, rank or positivity constraints. First- and second-order necessary optimality conditions state that the Riemannian gradient must be zero, and the Riemannian Hessian must be positive semidefinite. Generalizing Jin et al.'s recent work on perturbed gradient descent (PGD) for optimization on linear spaces [How to Escape Saddle Points Efficiently (2017), Stochastic Gradient Descent Escapes Saddle Points Efficiently (2019)], we propose a version of perturbed Riemannian gradient descent (PRGD) to show that necessary optimality conditions can be met approximately with high probability, without evaluating the Hessian. Specifically, for an arbitrary Riemannian manifold $\mathcal{M}$ of dimension $d$, a sufficiently smooth (possibly non-convex) objective function $f$, and under weak conditions on the retraction chosen to move on the manifold, with high probability, our version of PRGD produces a point with gradient smaller than $\varepsilon$ and Hessian within $\sqrt{\varepsilon}$ of being positive semidefinite in $O((\log{d})^4 / \varepsilon^{2})$ gradient queries. This matches the complexity of PGD in the Euclidean case. Crucially, the dependence on dimension is low. This matters for large-scale applications including PCA and low-rank matrix completion, which both admit natural formulations on manifolds. The key technical idea is to generalize PRGD with a distinction between two types of gradient steps: "steps on the manifold" and "perturbed steps in a tangent space of the manifold." Ultimately, this distinction makes it possible to extend Jin et al.'s analysis seamlessly.
Submitted 22 October, 2019; v1 submitted 10 June, 2019;
originally announced June 2019.
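A toy illustration of the perturb-then-descend idea from the abstract above, in pure Python. It is not the paper's PRGD: it applies a single random tangent-space perturbation at a known saddle point and then runs plain Riemannian gradient descent, with none of the step-counting or termination logic of the analyzed algorithm. The PCA-like cost $f(x) = x^\top A x$ on the unit sphere, the matrix $A = \mathrm{diag}(1, 2, 3)$, the normalization retraction, the step size and the perturbation radius are all assumptions chosen for the example.

```python
import math
import random

A_DIAG = [1.0, 2.0, 3.0]  # hypothetical A = diag(1, 2, 3)

def cost(x):
    """f(x) = x^T A x restricted to the unit sphere."""
    return sum(a * xi * xi for a, xi in zip(A_DIAG, x))

def riemannian_grad(x):
    """Projection of the Euclidean gradient 2Ax onto the tangent space at x:
    grad f(x) = 2 (A x - (x^T A x) x)."""
    fx = cost(x)
    return [2.0 * (a * xi - fx * xi) for a, xi in zip(A_DIAG, x)]

def retract(x, v):
    """Retraction by normalization: move to x + v, then rescale to unit norm."""
    y = [xi + vi for xi, vi in zip(x, v)]
    norm = math.sqrt(sum(yi * yi for yi in y))
    return [yi / norm for yi in y]

def tangent_perturbation(x, radius, rng):
    """Random tangent vector at x with the given norm."""
    g = [rng.gauss(0.0, 1.0) for _ in x]
    inner = sum(gi * xi for gi, xi in zip(g, x))
    t = [gi - inner * xi for gi, xi in zip(g, x)]
    norm = math.sqrt(sum(ti * ti for ti in t))
    return [radius * ti / norm for ti in t]

rng = random.Random(0)

# Start exactly at a saddle point: the eigenvector of the middle eigenvalue.
x = [0.0, 1.0, 0.0]
grad_norm_at_saddle = math.sqrt(sum(g * g for g in riemannian_grad(x)))

# The gradient vanishes here, so unperturbed gradient descent would not move.
x = retract(x, tangent_perturbation(x, 1e-3, rng))

for _ in range(500):
    g = riemannian_grad(x)
    x = retract(x, [-0.1 * gi for gi in g])

final_cost = cost(x)
```

In this example the iterates leave the saddle and the cost approaches the smallest eigenvalue of $A$, using gradient evaluations only.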