-
Fast randomized least-squares solvers can be just as accurate and stable as classical direct solvers
Authors:
Ethan N. Epperly,
Maike Meier,
Yuji Nakatsukasa
Abstract:
One of the greatest success stories of randomized algorithms for linear algebra has been the development of fast, randomized algorithms for highly overdetermined linear least-squares problems. However, none of the existing algorithms is backward stable, preventing them from being deployed as drop-in replacements for existing QR-based solvers. This paper introduces sketch-and-precondition with iterative refinement (SPIR) and FOSSILS, two provably backward stable randomized least-squares solvers. SPIR and FOSSILS combine iterative refinement with a preconditioned iterative method applied to the normal equations and converge at the same rate as existing randomized least-squares solvers. This work offers the promise of incorporating randomized least-squares solvers into existing software libraries while maintaining the same level of accuracy and stability as classical solvers.
Submitted 23 July, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
Are sketch-and-precondition least squares solvers numerically stable?
Authors:
Maike Meier,
Yuji Nakatsukasa,
Alex Townsend,
Marcus Webb
Abstract:
Sketch-and-precondition techniques are efficient and popular for solving large least squares (LS) problems of the form $Ax=b$ with $A\in\mathbb{R}^{m\times n}$ and $m\gg n$. Here, $A$ is first ``sketched'' to a smaller matrix $SA$ with $S\in\mathbb{R}^{\lceil cn\rceil\times m}$ for some constant $c>1$ before an iterative LS solver computes the solution to $Ax=b$ with a right preconditioner $P$, where $P$ is constructed from $SA$. Prominent sketch-and-precondition LS solvers are Blendenpik and LSRN. We show that the sketch-and-precondition technique in its most commonly used form is not numerically stable for ill-conditioned LS problems. For provable and practical backward stability and optimal residuals, we suggest using an unpreconditioned iterative LS solver on $(AP)z=b$ with $x=Pz$. Provided the condition number of $A$ is smaller than the reciprocal of the unit round-off, we show that this modification ensures that the computed solution has a backward error comparable to the iterative LS solver applied to a well-conditioned matrix. Using smoothed analysis, we model floating-point rounding errors to argue that our modification is expected to compute a backward stable solution even for arbitrarily ill-conditioned LS problems. Additionally, we provide experimental evidence that using the sketch-and-solve solution as a starting vector in sketch-and-precondition algorithms (as suggested by Rokhlin and Tygert in 2008) should be highly preferred over the zero vector. The initialization often results in much more accurate solutions -- albeit not always backward stable ones.
Submitted 10 November, 2023; v1 submitted 14 February, 2023;
originally announced February 2023.
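The modification described in the abstract above (run an unpreconditioned iterative solver on $(AP)z=b$, then recover $x=Pz$) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian sketch, the choice $P=R^{-1}$ from a QR factorization of $SA$, the use of a plain CGLS iteration in place of LSQR, the zero starting vector, and all function names are assumptions made here.

```python
import numpy as np

def sketch_r_factor(A, c=4.0, seed=0):
    """Return R from a QR factorization of the sketch SA (assumed Gaussian S)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    s = int(np.ceil(c * n))                       # sketch has ceil(c*n) rows
    S = rng.standard_normal((s, m)) / np.sqrt(s)  # hypothetical choice of sketch
    _, R = np.linalg.qr(S @ A)                    # reduced QR: R is n x n
    return R

def cgls(B, b, iters=50, tol=1e-14):
    """Plain CGLS for min ||B z - b||_2 (stand-in for LSQR, an assumption)."""
    z = np.zeros(B.shape[1])
    r = b.copy()
    s = B.T @ r
    p = s.copy()
    gamma = s @ s
    gamma0 = gamma
    for _ in range(iters):
        if gamma <= tol**2 * gamma0:
            break
        q = B @ p
        alpha = gamma / (q @ q)
        z += alpha * p
        r -= alpha * q
        s = B.T @ r
        gamma_new = s @ s
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return z

def solve_ls(A, b, c=4.0, seed=0):
    """Solve min ||A x - b||_2 via an explicitly formed AP with P = R^{-1}."""
    R = sketch_r_factor(A, c, seed)
    AP = np.linalg.solve(R.T, A.T).T  # AP = A R^{-1}, formed explicitly
    z = cgls(AP, b)                   # no preconditioner inside the solver
    return np.linalg.solve(R, z)      # x = P z
```

Forming `AP` explicitly costs an extra pass over $A$, which is the price of treating the well-conditioned matrix $AP$ as the solver's input rather than applying $P$ inside each iteration.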
-
Randomized algorithms for Tikhonov regularization in linear least squares
Authors:
Maike Meier,
Yuji Nakatsukasa
Abstract:
We describe two algorithms to efficiently solve regularized linear least squares systems based on sketching. The algorithms compute preconditioners for $\min \|Ax-b\|^2_2 + λ\|x\|^2_2$, where $A\in\mathbb{R}^{m\times n}$ and $λ>0$ is a regularization parameter, such that LSQR converges in $\mathcal{O}(\log(1/ε))$ iterations for $ε$ accuracy. We focus on the context where the optimal regularization parameter is unknown, and the system must be solved for a number of parameters $λ$. Our algorithms are applicable in both the underdetermined $m\ll n$ and the overdetermined $m\gg n$ setting. Firstly, we propose a Cholesky-based sketch-to-precondition algorithm that uses a `partly exact' sketch, and only requires one sketch for a set of $N$ regularization parameters $λ$. The complexity of solving for $N$ parameters is $\mathcal{O}(mn\log(\max(m,n)) +N(\min(m,n)^3 + mn\log(1/ε)))$. Secondly, we introduce an algorithm that uses a sketch of size $\mathcal{O}(\text{sd}_λ(A))$ for the case where the statistical dimension $\text{sd}_λ(A)\ll\min(m,n)$. The scheme we propose does not require the computation of the Gram matrix, resulting in a more stable scheme than existing algorithms in this context. We can solve for $N$ values of $λ_i$ in $\mathcal{O}(mn\log(\max(m,n)) + \min(m,n)\,\text{sd}_{\minλ_i}(A)^2 + Nmn\log(1/ε))$ operations.
Submitted 14 March, 2022;
originally announced March 2022.
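For orientation, the regularized problem $\min \|Ax-b\|^2_2 + λ\|x\|^2_2$ from the abstract above is equivalent to an ordinary least-squares problem with the stacked matrix $[A;\sqrt{λ}I]$ and right-hand side $[b;0]$. The snippet below is a small dense sketch of that equivalence only; it is not either of the paper's sketching algorithms, and the function name `tikhonov_solve` is hypothetical.

```python
import numpy as np

def tikhonov_solve(A, b, lam):
    """Solve min ||A x - b||^2 + lam * ||x||^2 via the augmented LS problem."""
    m, n = A.shape
    A_aug = np.vstack([A, np.sqrt(lam) * np.eye(n)])  # [A; sqrt(lam) I]
    b_aug = np.concatenate([b, np.zeros(n)])          # [b; 0]
    x, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)
    return x

# Solving for several regularization parameters, as in the setting above.
rng = np.random.default_rng(0)
A = rng.standard_normal((60, 8))
b = rng.standard_normal(60)
solutions = {lam: tikhonov_solve(A, b, lam) for lam in (1e-3, 1e-1, 1.0)}
```

This dense approach refactors the stacked matrix for every $λ$, which is the cost the abstract's single-sketch preconditioner is meant to avoid.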
-
Fast randomized numerical rank estimation for numerically low-rank matrices
Authors:
Maike Meier,
Yuji Nakatsukasa
Abstract:
Matrices with low-rank structure are ubiquitous in scientific computing. Choosing an appropriate rank is a key step in many computational algorithms that exploit low-rank structure. However, estimating the rank has been done largely in an ad-hoc fashion in large-scale settings. In this work we develop a randomized algorithm for estimating the numerical rank of a (numerically low-rank) matrix. The algorithm is based on sketching the matrix with random matrices from both left and right; the key fact is that with high probability, the sketches preserve the orders of magnitude of the leading singular values. We prove a result on the accuracy of the sketched singular values and show that gaps in the spectrum are detected. For an $m\times n$ $(m\geq n)$ matrix of numerical rank $r$, the algorithm runs with complexity $O(mn\log n+r^3)$, or less for structured matrices. The steps in the algorithm are required as a part of many low-rank algorithms, so the additional work required to estimate the rank can be even smaller in practice. Numerical experiments illustrate the speed and robustness of our rank estimator.
Submitted 5 January, 2024; v1 submitted 16 May, 2021;
originally announced May 2021.
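The idea in the abstract above, sketching from both the right and the left and then counting the sketched singular values that stand out, might look roughly like this. It is a hedged sketch rather than the paper's algorithm: Gaussian sketches on both sides, the relative threshold `tol`, the oversampling factor, and the name `estimate_rank` are all assumptions made here, and `r_max` must exceed the true numerical rank.

```python
import numpy as np

def estimate_rank(A, r_max, tol=1e-8, oversample=2, seed=0):
    """Estimate the numerical rank of A from a two-sided random sketch."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    X = A @ rng.standard_normal((n, r_max))   # right sketch: m x r_max
    s = oversample * r_max
    Y = rng.standard_normal((s, m)) @ X       # left sketch: s x r_max
    sv = np.linalg.svd(Y, compute_uv=False)   # sketched singular values
    # Count values above a threshold relative to the largest one.
    return int(np.sum(sv > tol * sv[0]))
```

Only the small $s\times r_{\max}$ matrix is passed to the SVD, so the dense factorization cost does not depend on $m$ or $n$.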
-
Mathematical models for fake news
Authors:
Dorje C. Brody,
David M. Meier
Abstract:
Over the past decade it has become evident that intentional disinformation in the political context -- so-called fake news -- is a danger to democracy. However, until now there has been no clear understanding of how to define fake news, much less how to model it. This paper addresses both of these issues. A definition of fake news is given, and two approaches for the modelling of fake news and its impact in elections and referendums are introduced. The first approach, based on the idea of a representative voter, is shown to be suitable for obtaining a qualitative understanding of phenomena associated with fake news at a macroscopic level. The second approach, based on the idea of an election microstructure, describes the collective behaviour of the electorate by modelling the preferences of individual voters. It is shown through a simulation study that the mere knowledge that fake news may be in circulation goes a long way towards mitigating the impact of fake news.
Submitted 30 November, 2021; v1 submitted 4 September, 2018;
originally announced September 2018.
-
Lévy-Vasicek Models and the Long-Bond Return Process
Authors:
Dorje C. Brody,
Lane P. Hughston,
David M. Meier
Abstract:
The classical derivation of the well-known Vasicek model for interest rates is reformulated in terms of the associated pricing kernel. An advantage of the pricing kernel method is that it allows one to generalize the construction to the Lévy-Vasicek case, avoiding issues of market incompleteness. In the Lévy-Vasicek model the short rate is taken in the real-world measure to be a mean-reverting process with a general one-dimensional Lévy driver admitting exponential moments. Expressions are obtained for the Lévy-Vasicek bond prices and interest rates, along with a formula for the return on a unit investment in the long bond, defined by $L_t = \lim_{T \rightarrow \infty} P_{tT} / P_{0T}$, where $P_{tT}$ is the price at time $t$ of a $T$-maturity discount bond. We show that the pricing kernel of a Lévy-Vasicek model is uniformly integrable if and only if the long rate of interest is strictly positive.
Submitted 13 September, 2016; v1 submitted 23 August, 2016;
originally announced August 2016.
-
A new strategy for Robbins' problem of optimal stopping
Authors:
Martin Meier,
Leopold Sögner
Abstract:
In this article we study the expected rank problem under full information. Our approach uses the planar Poisson approach from Gnedin (2007) to derive the expected rank of a stopping rule that is one of the simplest non-trivial examples combining rank dependent rules with threshold rules. This rule attains an expected rank lower than the best upper bounds obtained in the literature so far; in particular, we obtain an expected rank of 2.32614.
Submitted 10 June, 2016;
originally announced June 2016.
-
A Riemannian approach to Randers geodesics
Authors:
Dorje C. Brody,
Gary W. Gibbons,
David M. Meier
Abstract:
In certain circumstances tools of Riemannian geometry are sufficient to address questions arising in the more general Finslerian context. We show that one such instance presents itself in the characterisation of geodesics in Randers spaces of constant flag curvature. To achieve a simple, Riemannian derivation of this special family of curves, we exploit the connection between Randers spaces and the Zermelo problem of time-optimal navigation in the presence of background fields. The characterisation of geodesics is then proven by generalising an intuitive argument developed recently for the solution of the quantum Zermelo problem.
Submitted 1 September, 2016; v1 submitted 29 July, 2015;
originally announced July 2015.
-
Weak dual pairs and jetlet methods for ideal incompressible fluid models in $n\geq 2$ dimensions
Authors:
C. J. Cotter,
J. Eldering,
D. D. Holm,
H. O. Jacobs,
D. M. Meier
Abstract:
We review the role of dual pairs in mechanics and use them to derive particle-like solutions to regularized incompressible fluid systems. In our case we have a dual pair resulting from the action of diffeomorphisms on point particles (essentially by moving the points). We then augment our dual pair by considering the action of diffeomorphisms on Taylor series, also known as jets. The augmented weak dual pairs induce a hierarchy of particle-like solutions and conservation laws with particles carrying a copy of a jet group. We call these augmented particles jetlets. The jet groups serve as finite-dimensional models of the diffeomorphism group itself, and so the jetlet particles serve as a finite-dimensional model of the self-similarity exhibited by ideal incompressible fluids. The conservation law associated to jetlet solutions is shown to be a shadow of Kelvin's circulation theorem. Finally, we study the dynamics of infinite time particle mergers. We prove that two merging particles at the zeroth level in the hierarchy yield dynamics which asymptotically approach that of a single particle in the first level in the hierarchy. This merging behavior is then verified numerically as well as the exchange of angular momentum which must occur during a near collision of two particles. The resulting particle-like solutions suggest a new class of meshless methods which work in dimensions $n \geq 2$ and which exhibit a shadow of Kelvin's circulation theorem. More broadly, this provides one of the first finite-dimensional models of self-similarity in ideal fluids.
Submitted 12 July, 2016; v1 submitted 26 March, 2015;
originally announced March 2015.
-
Time-optimal navigation through quantum wind
Authors:
Dorje C. Brody,
Gary W. Gibbons,
David M. Meier
Abstract:
The quantum navigation problem of finding the time-optimal control Hamiltonian that transports a given initial state to a target state through quantum wind, that is, under the influence of external fields or potentials, is analysed. By lifting the problem from the state space to the space of unitary gates realising the required task, we are able to deduce the form of the solution to the problem by deriving a universal quantum speed limit. The expression thus obtained indicates that further simplifications of this apparently difficult problem are possible if we switch to the interaction picture of quantum mechanics. A complete solution to the navigation problem for an arbitrary quantum system is then obtained, and the behaviour of the solution is illustrated in the case of a two-level system.
Submitted 19 February, 2015; v1 submitted 24 October, 2014;
originally announced October 2014.
-
Inexact trajectory planning and inverse problems in the Hamilton--Pontryagin framework
Authors:
Christopher L. Burnett,
Darryl D. Holm,
David M. Meier
Abstract:
We study a trajectory-planning problem whose solution path evolves by means of a Lie group action and passes near a designated set of target positions at particular times. This is a higher-order variational problem in optimal control, motivated by potential applications in computational anatomy and quantum control. Reduction by symmetry in such problems naturally summons methods from Lie group theory and Riemannian geometry. A geometrically illuminating form of the Euler-Lagrange equations is obtained from a higher-order Hamilton-Pontryagin variational formulation. In this context, the previously known node equations are recovered with a new interpretation as Legendre-Ostrogradsky momenta possessing certain conservation properties. Three example applications are discussed as well as a numerical integration scheme that follows naturally from the Hamilton-Pontryagin principle and preserves the geometric properties of the continuous-time solution.
Submitted 12 April, 2013;
originally announced April 2013.
-
Quantum splines
Authors:
Dorje C. Brody,
Darryl D. Holm,
David M. Meier
Abstract:
A quantum spline is a smooth curve parameterised by time in the space of unitary transformations, whose associated orbit on the space of pure states traverses a designated set of quantum states at designated times, such that the trace norm of the time rate of change of the associated Hamiltonian is minimised. The solution to the quantum spline problem is obtained, and is applied in an example that illustrates quantum control of coherent states. An efficient numerical scheme for computing quantum splines is discussed and implemented in the examples.
Submitted 4 September, 2012; v1 submitted 12 June, 2012;
originally announced June 2012.
-
Invariant higher-order variational problems II
Authors:
François Gay-Balmaz,
Darryl D. Holm,
David M. Meier,
Tudor S. Ratiu,
François-Xavier Vialard
Abstract:
Motivated by applications in computational anatomy, we consider a second-order problem in the calculus of variations on object manifolds that are acted upon by Lie groups of smooth invertible transformations. This problem leads to solution curves known as Riemannian cubics on object manifolds that are endowed with normal metrics. The prime examples of such object manifolds are the symmetric spaces. We characterize the class of cubics on object manifolds that can be lifted horizontally to cubics on the group of transformations. Conversely, we show that certain types of non-horizontal geodesics on the group of transformations project to cubics. Finally, we apply second-order Lagrange--Poincaré reduction to the problem of Riemannian cubics on the group of transformations. This leads to a reduced form of the equations that reveals the obstruction for the projection of a cubic on a transformation group to again be a cubic on its object manifold.
Submitted 29 December, 2011;
originally announced December 2011.
-
Geometric integrators for higher-order mechanics on Lie groups
Authors:
Christopher L. Burnett,
Darryl D. Holm,
David M. Meier
Abstract:
This paper develops a structure-preserving numerical integration scheme for a class of higher-order mechanical systems. The dynamics of these systems are governed by invariant variational principles defined on higher-order tangent bundles of Lie groups. The variational principles admit Lagrangians that depend on acceleration, for example. The symmetry reduction method used in the Hamilton--Pontryagin approach for developing variational integrators of first-order mechanics is extended here to higher order. The paper discusses the general approach and then focuses on the primary example of Riemannian cubics. Higher-order variational integrators are developed both for the discrete-time integration of the initial value problem and for a particular type of trajectory-planning problem. The solution of the discrete trajectory-planning problem for higher-order interpolation among points on the sphere illustrates the approach.
Submitted 27 December, 2011;
originally announced December 2011.
-
Invariant higher-order variational problems
Authors:
F. Gay-Balmaz,
D. D. Holm,
D. M. Meier,
T. S. Ratiu,
F. -X. Vialard
Abstract:
We investigate higher-order geometric $k$-splines for template matching on Lie groups. This is motivated by the need to apply diffeomorphic template matching to a series of images, e.g., in longitudinal studies of Computational Anatomy. Our approach formulates Euler-Poincaré theory in higher-order tangent spaces on Lie groups. In particular, we develop the Euler-Poincaré formalism for higher-order variational problems that are invariant under Lie group transformations. The theory is then applied to higher-order template matching and the corresponding curves on the Lie group of transformations are shown to satisfy higher-order Euler-Poincaré equations. The example of SO(3) for template matching on the sphere is presented explicitly. Various cotangent bundle momentum maps emerge naturally that help organize the formulas. We also present Hamiltonian and Hamilton-Ostrogradsky Lie-Poisson formulations of the higher-order Euler-Poincaré theory for applications on the Hamiltonian side.
Submitted 22 December, 2010;
originally announced December 2010.
-
Finitely additive beliefs and universal type spaces
Authors:
Martin Meier
Abstract:
The probabilistic type spaces in the sense of Harsanyi [Management Sci. 14 (1967/68) 159--182, 320--334, 486--502] are the prevalent models used to describe interactive uncertainty. In this paper we examine the existence of a universal type space when beliefs are described by finitely additive probability measures. We find that in the category of all type spaces that satisfy certain measurability conditions ($κ$-measurability, for some fixed regular cardinal $κ$), there is a universal type space (i.e., a terminal object) to which every type space can be mapped in a unique beliefs-preserving way. However, by a probabilistic adaption of the elegant sober-drunk example of Heifetz and Samet [Games Econom. Behav. 22 (1998) 260--273] we show that if all subsets of the spaces are required to be measurable, then there is no universal type space.
Submitted 28 February, 2006;
originally announced February 2006.