-
Mixed-precision algorithms for solving the Sylvester matrix equation
Authors:
Andrii Dmytryshyn,
Massimiliano Fasi,
Nicholas J. Higham,
Xiaobo Liu
Abstract:
We consider the solution of the general Sylvester equation $AX+XB=C$ in mixed precision. First, we investigate the use of GMRES-based iterative refinement (GMRES-IR) to solve the equation using implicitly its Kronecker product form: we propose an efficient scheme to use the Schur factors of the coefficient matrices as preconditioners, but we demonstrate that this approach is not suitable in the case of the Sylvester equation. By revisiting a stationary iteration for linear systems, we therefore derive a new iterative refinement scheme for the quasi-triangular Sylvester equation, and our rounding error analysis provides sufficient conditions for convergence and a bound on the attainable relative residual. We leverage this iterative scheme to solve the general Sylvester equation in mixed precision. The new algorithms compute the Schur decomposition of the matrix coefficients in low precision, use the low-precision Schur factors to obtain an approximate solution to the quasi-triangular equation, and iteratively refine it to obtain a working-precision solution to the quasi-triangular equation. However, being only orthonormal to low precision, the unitary Schur factors of $A$ and $B$ cannot be used to recover the solution to the original equation. We propose two effective approaches to address this issue: one is based on re-orthonormalization in the working precision, and the other on explicit inversion of the almost-unitary factors. We test these mixed-precision algorithms on various Sylvester and Lyapunov equations from the literature. Our numerical experiments show that, for both classes of equations, the new algorithms are at least as accurate as existing ones. Our cost analysis, on the other hand, suggests that they would typically be faster than mono-precision alternatives if implemented on hardware that natively supports low precision.
Submitted 5 March, 2025;
originally announced March 2025.
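For reference, the Kronecker product form and the Schur-based reduction mentioned in the abstract can be written out as below. This is the standard textbook formulation, not taken from the paper itself; the dimensions $A\in\mathbb{C}^{m\times m}$, $B\in\mathbb{C}^{n\times n}$, $X,C\in\mathbb{C}^{m\times n}$ and the symbols $U_A$, $T_A$, $U_B$, $T_B$, $Y$ are assumptions introduced here for illustration.

```latex
% Kronecker product form (vec stacks columns):
\[
AX + XB = C
\quad\Longleftrightarrow\quad
\bigl(I_n \otimes A + B^{T} \otimes I_m\bigr)\operatorname{vec}(X) = \operatorname{vec}(C).
\]
% Reduction via Schur decompositions A = U_A T_A U_A^*, B = U_B T_B U_B^*:
\[
T_A Y + Y T_B = U_A^{*} C\, U_B,
\qquad
X = U_A Y U_B^{*}.
\]
```

The linear system has order $mn$, which is why it is used only implicitly. The recovery step $X = U_A Y U_B^{*}$ relies on $U_A$ and $U_B$ being unitary, which is the property that is lost when the Schur factors are computed in low precision.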
-
Computing accurate eigenvalues using a mixed-precision Jacobi algorithm
Authors:
Nicholas J. Higham,
Françoise Tisseur,
Marcus Webb,
Zhengbo Zhou
Abstract:
We provide a rounding error analysis of a mixed-precision preconditioned Jacobi algorithm, which uses low precision to compute the preconditioner, applies it at high precision (amounting to two matrix-matrix multiplications) and solves the eigenproblem using the Jacobi algorithm at working precision. Our analysis yields meaningfully smaller relative forward error bounds for the computed eigenvalues compared with those of the Jacobi algorithm. We further prove that, after preconditioning, if the off-diagonal entries of the preconditioned matrix are sufficiently small relative to its smallest diagonal entry, the relative forward error bound is independent of the condition number of the original matrix. We present two constructions for the preconditioner that exploit low precision, along with their error analyses. Our numerical experiments confirm our theoretical results and compare the relative forward error of the proposed algorithm with the Jacobi algorithm, a preconditioned Jacobi algorithm, and MATLAB's $\texttt{eig}$ function. Timings using Julia suggest that the dominant cost of obtaining this level of accuracy comes from the high precision matrix-matrix multiplies; if support in software or hardware for this were improved then this would become a negligible cost.
Submitted 9 June, 2025; v1 submitted 7 January, 2025;
originally announced January 2025.
-
The Power of Bidiagonal Matrices
Authors:
Nicholas J. Higham
Abstract:
Bidiagonal matrices are widespread in numerical linear algebra, not least because of their use in the standard algorithm for computing the singular value decomposition and their appearance as LU factors of tridiagonal matrices. We show that bidiagonal matrices have a number of interesting properties that make them powerful tools in a variety of problems, especially when they are multiplied together. We show that the inverse of a product of bidiagonal matrices is insensitive to small componentwise relative perturbations in the factors if the factors or their inverses are nonnegative. We derive componentwise rounding error bounds for the solution of a linear system $Ax = b$, where $A$ or $A^{-1}$ is a product $B_1 B_2\dots B_k$ of bidiagonal matrices, showing that strong results are obtained when the $B_i$ are nonnegative or have a checkerboard sign pattern. We show that given the factorization of an $n\times n$ totally nonnegative matrix $A$ into the product of bidiagonal matrices, $\|A^{-1}\|_{\infty}$ can be computed in $O(n^2)$ flops and that in floating-point arithmetic the computed result has small relative error, no matter how large $\|A^{-1}\|_{\infty}$ is. We also show how factorizations involving bidiagonal matrices of some special matrices, such as the Frank matrix and the Kac--Murdock--Szegö matrix, yield simple proofs of the total nonnegativity and other properties of these matrices.
Submitted 11 November, 2023;
originally announced November 2023.
-
Integer matrix factorisations, superalgebras and the quadratic form obstruction
Authors:
Nicholas J. Higham,
Matthew C. Lettington,
Karl Michael Schmidt
Abstract:
We identify and analyse obstructions to factorisation of integer matrices into products $N^T N$ or $N^2$ of matrices with rational or integer entries. The obstructions arise as quadratic forms with integer coefficients and raise the question of the discrete range of such forms. They are obtained by considering matrix decompositions over a superalgebra. We further obtain a formula for the determinant of a square matrix in terms of adjugates of these matrix decompositions, as well as identifying a \textit{co-Latin} symmetry space.
Submitted 6 March, 2021;
originally announced March 2021.
-
A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic
Authors:
Ahmad Abdelfattah,
Hartwig Anzt,
Erik G. Boman,
Erin Carson,
Terry Cojean,
Jack Dongarra,
Mark Gates,
Thomas Grützmacher,
Nicholas J. Higham,
Sherry Li,
Neil Lindquist,
Yang Liu,
Jennifer Loe,
Piotr Luszczek,
Pratik Nayak,
Sri Pranesh,
Siva Rajamanickam,
Tobias Ribizel,
Barry Smith,
Kasia Swirydowicz,
Stephen Thomas,
Stanimire Tomov,
Yaohung M. Tsai,
Ichitaro Yamazaki,
Ulrike Meier Yang
Abstract:
In recent years, hardware vendors have started designing low-precision special function units in response to the machine learning community's demand for high compute power in low-precision formats. Server-line products also increasingly feature low-precision special function units, such as the NVIDIA tensor cores in ORNL's Summit supercomputer, which provide more than an order of magnitude higher performance than what is available in IEEE double precision. At the same time, the gap between compute power on the one hand and memory bandwidth on the other keeps increasing, making data access and communication prohibitively expensive compared to arithmetic operations. To start the multiprecision focus effort, we survey the numerical linear algebra community and summarize all existing multiprecision knowledge, expertise, and software capabilities in this landscape analysis report. We also include current efforts and preliminary results that may not yet be considered "mature technology," but have the potential to grow into production quality within the multiprecision focus effort. As we expect the reader to be familiar with the basics of numerical linear algebra, we refrain from providing a detailed background on the algorithms themselves and focus instead on how mixed- and multiprecision technology can help improve the performance of these methods, and we present highlights of applications that significantly outperform the traditional fixed-precision methods.
Submitted 13 July, 2020;
originally announced July 2020.
-
Accurate Computation of the Log-Sum-Exp and Softmax Functions
Authors:
Pierre Blanchard,
Desmond J. Higham,
Nicholas J. Higham
Abstract:
Evaluating the log-sum-exp function or the softmax function is a key step in many modern data science algorithms, notably in inference and classification. Because of the exponentials that these functions contain, the evaluation is prone to overflow and underflow, especially in low precision arithmetic. Software implementations commonly use alternative formulas that avoid overflow and reduce the chance of harmful underflow, employing a shift or another rewriting. Although mathematically equivalent, these variants behave differently in floating-point arithmetic. We give rounding error analyses of different evaluation algorithms and interpret the error bounds using condition numbers for the functions. We conclude, based on the analysis and numerical experiments, that the shifted formulas are of similar accuracy to the unshifted ones and that the shifted softmax formula is typically more accurate than a division-free variant.
Submitted 8 September, 2019;
originally announced September 2019.
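The shift described in the abstract can be illustrated with a minimal sketch. The function names `logsumexp_shifted` and `softmax_shifted` are hypothetical, and the code shows one common form of the shift (subtracting the largest entry before exponentiating), which is not necessarily the exact variant analysed in the paper.

```python
import math


def logsumexp_shifted(x):
    """Evaluate log(sum(exp(x_i))) as a + log(sum(exp(x_i - a))), with a = max(x)."""
    a = max(x)
    # Every exponent x_i - a is <= 0, so exp cannot overflow.
    return a + math.log(sum(math.exp(xi - a) for xi in x))


def softmax_shifted(x):
    """Evaluate exp(x_i) / sum(exp(x_j)) using the same shift by a = max(x)."""
    a = max(x)
    e = [math.exp(xi - a) for xi in x]
    s = sum(e)  # s >= 1 because the largest entry contributes exp(0) = 1
    return [ei / s for ei in e]
```

With inputs such as `[1000.0, 1000.0]`, the unshifted formula would call `math.exp(1000.0)`, which raises `OverflowError` in Python, whereas the shifted version evaluates only `exp(0.0)`.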
-
Computing the Action of Trigonometric and Hyperbolic Matrix Functions
Authors:
Nicholas J. Higham,
Peter Kandolf
Abstract:
We derive a new algorithm for computing the action $f(A)V$ of the cosine, sine, hyperbolic cosine, and hyperbolic sine of a matrix $A$ on a matrix $V$, without first computing $f(A)$. The algorithm can compute $\cos(A)V$ and $\sin(A)V$ simultaneously, and likewise for $\cosh(A)V$ and $\sinh(A)V$, and it uses only real arithmetic when $A$ is real. The algorithm exploits an existing algorithm \texttt{expmv} of Al-Mohy and Higham for $\mathrm{e}^AV$ and its underlying backward error analysis. Our experiments show that the new algorithm performs in a forward stable manner and is generally significantly faster than alternatives based on multiple invocations of \texttt{expmv} through formulas such as $\cos(A)V = (\mathrm{e}^{\mathrm{i}A}V + \mathrm{e}^{-\mathrm{i}A}V)/2$.
Submitted 29 April, 2017; v1 submitted 14 July, 2016;
originally announced July 2016.
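The formula quoted at the end of the abstract follows from Euler's formula applied to matrices. The identities below are standard and are written out here only as a reminder of what the alternatives based on multiple \texttt{expmv} calls would evaluate; they are not the paper's new algorithm.

```latex
\[
\mathrm{e}^{\pm\mathrm{i}A}V = \cos(A)V \pm \mathrm{i}\,\sin(A)V
\]
\[
\cos(A)V = \tfrac{1}{2}\bigl(\mathrm{e}^{\mathrm{i}A}V + \mathrm{e}^{-\mathrm{i}A}V\bigr),
\qquad
\sin(A)V = \tfrac{1}{2\mathrm{i}}\bigl(\mathrm{e}^{\mathrm{i}A}V - \mathrm{e}^{-\mathrm{i}A}V\bigr)
\]
\[
\cosh(A)V = \tfrac{1}{2}\bigl(\mathrm{e}^{A}V + \mathrm{e}^{-A}V\bigr),
\qquad
\sinh(A)V = \tfrac{1}{2}\bigl(\mathrm{e}^{A}V - \mathrm{e}^{-A}V\bigr)
\]
```

Each trigonometric identity needs two matrix exponential actions in complex arithmetic even when $A$ and $V$ are real, which is the overhead the abstract contrasts with an algorithm that uses only real arithmetic.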