-
Mixed Precision Orthogonalization-Free Projection Methods for Eigenvalue and Singular Value Problems
Authors:
Tianshi Xu,
Zechen Zhang,
Jie Chen,
Yousef Saad,
Yuanzhe Xi
Abstract:
Mixed-precision arithmetic offers significant computational advantages for large-scale matrix computation tasks, yet preserving accuracy and stability in eigenvalue problems and the singular value decomposition (SVD) remains challenging. This paper introduces an approach that eliminates orthogonalization requirements in traditional Rayleigh-Ritz projection methods. The proposed method employs non-orthogonal bases computed at reduced precision, resulting in bases computed without inner-products. A primary focus is on maintaining the linear independence of the basis vectors. Through extensive evaluation with both synthetic test cases and real-world applications, we demonstrate that the proposed approach achieves the desired accuracy while fully taking advantage of mixed-precision arithmetic.
Submitted 1 May, 2025; v1 submitted 1 May, 2025;
originally announced May 2025.
-
Cucheb: A GPU implementation of the filtered Lanczos procedure
Authors:
Jared L. Aurentz,
Vassilis Kalantzis,
Yousef Saad
Abstract:
This paper describes the software package Cucheb, a GPU implementation of the filtered Lanczos procedure for the solution of large sparse symmetric eigenvalue problems. The filtered Lanczos procedure uses a carefully chosen polynomial spectral transformation to accelerate convergence of the Lanczos method when computing eigenvalues within a desired interval. This method has proven particularly effective for eigenvalue problems that arise in electronic structure calculations and density functional theory. We compare our implementation against an equivalent CPU implementation and show that using the GPU can reduce the computation time by more than a factor of 10.
Submitted 23 September, 2024;
originally announced September 2024.
-
Joint Approximate Partial Diagonalization of Large Matrices
Authors:
Abd-Krim Seghouane,
Yousef Saad
Abstract:
Given a set of $p$ symmetric (real) matrices, the Orthogonal Joint Diagonalization (OJD) problem consists of finding an orthonormal basis in which the representation of each of these $p$ matrices is as close as possible to a diagonal matrix. We argue that when the matrices are of large dimension, then the natural generalization of this problem is to seek an orthonormal basis of a certain subspace that is a near eigenspace for all the matrices in the set. We refer to this as the problem of ``partial joint diagonalization of matrices.'' The approach proposed first finds this approximate common near eigenspace and then proceeds to a joint diagonalization of the restrictions of the input matrices in this subspace. A few solution methods for this problem are proposed and illustrations of its potential applications are provided.
Submitted 3 September, 2024;
originally announced September 2024.
-
Straggler-tolerant stationary methods for linear systems
Authors:
Vassilis Kalantzis,
Yuanzhe Xi,
Lior Horesh,
Yousef Saad
Abstract:
In this paper, we consider the iterative solution of linear algebraic equations under the condition that matrix-vector products with the coefficient matrix are computed only partially. At the same time, non-computed entries are set to zero. We assume that both the number of computed entries and their associated row index set are random variables, with the row index set sampled uniformly given the number of computed entries. This model of computations is realized in hybrid cloud computing architectures following the controller-worker distributed model under the influence of straggling workers. We propose a straggler-tolerant Richardson iteration scheme and Chebyshev semi-iterative schemes, and prove sufficient conditions for their convergence in expectation. Numerical experiments verify the presented theoretical results as well as the effectiveness of the proposed schemes on a few sparse matrix problems.
Submitted 12 October, 2024; v1 submitted 1 July, 2024;
originally announced July 2024.
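The computational model described in the abstract above can be illustrated with a small sketch. This is not the authors' scheme: it is a plain Richardson iteration in which, at each step, only a random subset of the rows of the matrix-vector product is available. The function name `straggler_richardson`, the test matrix, the step size `omega`, and the choice to leave non-computed rows unchanged are all assumptions made for illustration.

```python
import numpy as np

def straggler_richardson(A, b, omega, n_iter=300, seed=0):
    """Richardson iteration that only sees a random subset of the rows
    of the matrix-vector product A @ x at each step (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = np.zeros(n)
    for _ in range(n_iter):
        # The number of computed entries is random; given that number,
        # the row index set is sampled uniformly without replacement.
        m = rng.integers(1, n + 1)
        rows = rng.choice(n, size=m, replace=False)
        # Only the sampled rows of A @ x are computed.
        # Assumption of this sketch: rows that were not computed are not updated.
        x[rows] += omega * (b[rows] - A[rows] @ x)
    return x

# Small symmetric positive definite test problem (eigenvalues lie in (2, 6)).
n = 30
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

x = straggler_richardson(A, b, omega=1.0 / 6.0)
residual_norm = np.linalg.norm(b - A @ x)
```

With a step size below the reciprocal of the largest eigenvalue, each partial update cannot increase the quadratic energy of a symmetric positive definite system, which is why this toy version still converges despite the missing rows.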
-
Anderson Acceleration with Truncated Gram-Schmidt
Authors:
Ziyuan Tang,
Tianshi Xu,
Huan He,
Yousef Saad,
Yuanzhe Xi
Abstract:
Anderson Acceleration (AA) is a popular algorithm designed to enhance the convergence of fixed-point iterations. In this paper, we introduce a variant of AA based on a Truncated Gram-Schmidt process (AATGS) which has a few advantages over the classical AA. In particular, an attractive feature of AATGS is that its iterates obey a three-term recurrence in the situation when it is applied to solving symmetric linear problems and this can lead to a considerable reduction of memory and computational costs. We analyze the convergence of AATGS in both full-depth and limited-depth scenarios and establish its equivalence to the classical AA in the linear case. We also report on the effectiveness of AATGS through a set of numerical experiments, ranging from solving nonlinear partial differential equations to tackling nonlinear optimization problems. In particular, the performance of the method is compared with that of the classical AA algorithms.
Submitted 16 July, 2024; v1 submitted 22 March, 2024;
originally announced March 2024.
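For readers unfamiliar with the baseline that the abstract above compares against, the following is a minimal sketch of classical Anderson Acceleration with a limited depth. It does not implement the truncated Gram-Schmidt variant (AATGS); the function name `anderson_acceleration`, the mixing parameter fixed at 1, and the linear test map are assumptions chosen for illustration.

```python
import numpy as np

def anderson_acceleration(g, x0, depth=3, tol=1e-10, max_iter=200):
    """Classical Anderson Acceleration for the fixed-point problem x = g(x)."""
    x = np.asarray(x0, dtype=float)
    f = g(x) - x                      # fixed-point residual
    dX, dF = [], []                   # histories of iterate and residual differences
    for k in range(max_iter):
        if np.linalg.norm(f) < tol:
            return x, k
        if dF:
            X = np.column_stack(dX)
            F = np.column_stack(dF)
            # Least-squares combination of the stored residual differences.
            gamma, *_ = np.linalg.lstsq(F, f, rcond=None)
            x_new = x + f - (X + F) @ gamma
        else:
            x_new = x + f             # first step is a plain fixed-point update
        f_new = g(x_new) - x_new
        dX.append(x_new - x)
        dF.append(f_new - f)
        dX, dF = dX[-depth:], dF[-depth:]   # keep only the most recent differences
        x, f = x_new, f_new
    return x, max_iter

# Linear contraction g(x) = M x + c with a symmetric M of norm below 0.4.
n = 10
M = 0.2 * np.eye(n) + 0.1 * (np.eye(n, k=1) + np.eye(n, k=-1))
c = np.ones(n)

x_acc, iters = anderson_acceleration(lambda v: M @ v + c, np.zeros(n))
x_exact = np.linalg.solve(np.eye(n) - M, c)
```

The history stored here grows with `depth`, which is the memory cost that a three-term recurrence, as mentioned in the abstract for symmetric linear problems, would avoid.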
-
Gradient-type subspace iteration methods for the symmetric eigenvalue problem
Authors:
Foivos Alimisis,
Yousef Saad,
Bart Vandereycken
Abstract:
This paper explores variants of the subspace iteration algorithm for computing approximate invariant subspaces. The standard subspace iteration approach is revisited and new variants that exploit gradient-type techniques combined with a Grassmann manifold viewpoint are developed. A gradient method as well as a nonlinear conjugate gradient technique are described. Convergence of the gradient-based algorithm is analyzed and a few numerical experiments are reported, indicating that the proposed algorithms are sometimes superior to standard algorithms. This includes the Chebyshev-based subspace iteration and the locally optimal block conjugate gradient method, when compared in terms of number of matrix-vector products and computational time, respectively. The new methods, on the other hand, do not require estimating optimal parameters. An important contribution of this paper to achieve this good performance is the accurate and efficient implementation of an exact line search. In addition, new convergence proofs are presented for the non-accelerated gradient method that include a locally exponential convergence if started in a $\mathcal{O}(\sqrt{\delta})$ neighbourhood of the dominant subspace with spectral gap $\delta$.
Submitted 12 May, 2024; v1 submitted 17 June, 2023;
originally announced June 2023.
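As context for the entry above, here is a minimal sketch of the standard subspace iteration that the paper takes as its starting point, followed by a Rayleigh-Ritz step. It does not include the gradient-type or Grassmann-manifold variants; the function name `subspace_iteration` and the diagonal test matrix with a well-separated dominant spectrum are assumptions for illustration.

```python
import numpy as np

def subspace_iteration(A, k, n_iter=50, seed=0):
    """Standard subspace iteration for the k dominant eigenpairs of a symmetric A."""
    rng = np.random.default_rng(seed)
    # Random orthonormal starting basis.
    Q, _ = np.linalg.qr(rng.standard_normal((A.shape[0], k)))
    for _ in range(n_iter):
        # Multiply by A, then re-orthonormalize the basis.
        Q, _ = np.linalg.qr(A @ Q)
    # Rayleigh-Ritz: eigen-decomposition of the projected k-by-k matrix.
    theta, S = np.linalg.eigh(Q.T @ A @ Q)
    return theta, Q @ S

# Symmetric test matrix: 47 eigenvalues in [0, 1] and three dominant ones.
eigs = np.concatenate([np.linspace(0.0, 1.0, 47), [5.0, 6.0, 7.0]])
A = np.diag(eigs)

ritz_values, ritz_vectors = subspace_iteration(A, k=3)
```

The convergence rate of this basic scheme depends on the ratio between the largest unwanted eigenvalue and the smallest wanted one, which is the kind of gap dependence the abstract refers to.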
-
NLTGCR: A class of Nonlinear Acceleration Procedures based on Conjugate Residuals
Authors:
Huan He,
Ziyuan Tang,
Shifan Zhao,
Yousef Saad,
Yuanzhe Xi
Abstract:
This paper develops a new class of nonlinear acceleration algorithms based on extending conjugate residual-type procedures from linear to nonlinear equations. The main algorithm has strong similarities with Anderson acceleration as well as with inexact Newton methods - depending on which variant is implemented. We prove theoretically and verify experimentally, on a variety of problems from simulation experiments to deep learning applications, that our method is a powerful accelerated iterative algorithm.
Submitted 30 March, 2024; v1 submitted 31 May, 2023;
originally announced June 2023.
-
On the tubular eigenvalues of third-order tensors
Authors:
Fatemeh P. A. Beik,
Yousef Saad
Abstract:
This paper introduces the notion of tubular eigenvalues of third-order tensors with respect to T-products of tensors and analyzes their properties. A focus of the paper is to discuss relations between tubular eigenvalues and two alternative definitions of eigenvalue for third-order tensors that are known in the literature, namely eigentuple and T-eigenvalue. In addition, it establishes a few results on tubular spectra of tensors which can be exploited to analyze the convergence of tubular versions of iterative methods for solving tensor equations.
Submitted 10 May, 2023;
originally announced May 2023.
-
An Efficient Nonlinear Acceleration method that Exploits Symmetry of the Hessian
Authors:
Huan He,
Shifan Zhao,
Ziyuan Tang,
Joyce C Ho,
Yousef Saad,
Yuanzhe Xi
Abstract:
Nonlinear acceleration methods are powerful techniques to speed up fixed-point iterations. However, many acceleration methods require storing a large number of previous iterates and this can become impractical if computational resources are limited. In this paper, we propose a nonlinear Truncated Generalized Conjugate Residual method (nlTGCR) whose goal is to exploit the symmetry of the Hessian to reduce memory usage. The proposed method can be interpreted as either an inexact Newton or a quasi-Newton method. We show that, with the help of global strategies like residual check techniques, nlTGCR can converge globally for general nonlinear problems and that under mild conditions, nlTGCR is able to achieve superlinear convergence. We further analyze the convergence of nlTGCR in a stochastic setting. Numerical results demonstrate the superiority of nlTGCR when compared with several other competitive baseline approaches on a few problems. Our code will be available in the future.
Submitted 22 October, 2022;
originally announced October 2022.
-
parGeMSLR: A Parallel Multilevel Schur Complement Low-Rank Preconditioning and Solution Package for General Sparse Matrices
Authors:
Tianshi Xu,
Vassilis Kalantzis,
Ruipeng Li,
Yuanzhe Xi,
Geoffrey Dillon,
Yousef Saad
Abstract:
This paper discusses parGeMSLR, a C++/MPI software library for the solution of sparse systems of linear algebraic equations via preconditioned Krylov subspace methods in distributed-memory computing environments. The preconditioner implemented in parGeMSLR is based on algebraic domain decomposition and partitions the symmetrized adjacency graph recursively into several non-overlapping partitions via a p-way vertex separator, where p is an integer multiple of the total number of MPI processes. From a numerical perspective, parGeMSLR builds a Schur complement approximate inverse preconditioner as the sum between the matrix inverse of the interface coupling matrix and a low-rank correction term. To reduce the cost associated with the computation of the approximate inverse matrices, parGeMSLR exploits a multilevel partitioning of the algebraic domain. The parGeMSLR library is implemented on top of the Message Passing Interface and can solve both real and complex linear systems. Furthermore, parGeMSLR can take advantage of hybrid computing environments with in-node access to one or more Graphics Processing Units. Finally, the parallel efficiency (weak and strong scaling) of parGeMSLR is demonstrated on a few model problems arising from discretizations of 3D Partial Differential Equations.
Submitted 4 May, 2022;
originally announced May 2022.
-
GDA-AM: On the effectiveness of solving minimax optimization via Anderson Acceleration
Authors:
Huan He,
Shifan Zhao,
Yuanzhe Xi,
Joyce C Ho,
Yousef Saad
Abstract:
Many modern machine learning algorithms such as generative adversarial networks (GANs) and adversarial training can be formulated as minimax optimization. Gradient descent ascent (GDA) is the most commonly used algorithm due to its simplicity. However, GDA can converge to non-optimal minimax points. We propose a new minimax optimization framework, GDA-AM, that views the GDA dynamics as a fixed-point iteration and solves it using Anderson Mixing to converge to the local minimax. It addresses the diverging issue of simultaneous GDA and accelerates the convergence of alternating GDA. We show theoretically that the algorithm can achieve global convergence for bilinear problems under mild conditions. We also empirically show that GDA-AM solves a variety of minimax problems and improves GAN training on several datasets.
Submitted 29 June, 2022; v1 submitted 5 October, 2021;
originally announced October 2021.
-
Graph coarsening: From scientific computing to machine learning
Authors:
Jie Chen,
Yousef Saad,
Zechen Zhang
Abstract:
The general method of graph coarsening or graph reduction has been a remarkably useful and ubiquitous tool in scientific computing and it is now just starting to have a similar impact in machine learning. The goal of this paper is to take a broad look into coarsening techniques that have been successfully deployed in scientific computing and see how similar principles are finding their way in more recent applications related to machine learning. In scientific computing, coarsening plays a central role in algebraic multigrid methods as well as the related class of multilevel incomplete LU factorizations. In machine learning, graph coarsening goes under various names, e.g., graph downsampling or graph reduction. Its goal in most cases is to replace some original graph by one which has fewer nodes, but whose structure and characteristics are similar to those of the original graph. As will be seen, a common strategy in these methods is to rely on spectral properties to define the coarse graph.
Submitted 22 June, 2021;
originally announced June 2021.
-
Shanks and Anderson-type acceleration techniques for systems of nonlinear equations
Authors:
Claude Brezinski,
Stefano Cipolla,
Michela Redivo-Zaglia,
Yousef Saad
Abstract:
This paper examines a number of extrapolation and acceleration methods, and introduces a few modifications of the standard Shanks transformation that deal with general sequences. One of the goals of the paper is to lay out a general framework that encompasses most of the known acceleration strategies. The paper also considers the Anderson Acceleration method under a new light and exploits a connection with quasi-Newton methods, in order to establish local linear convergence results of a stabilized version of Anderson Acceleration method. The methods are tested on a number of problems, including a few that arise from nonlinear Partial Differential Equations.
Submitted 8 July, 2021; v1 submitted 11 July, 2020;
originally announced July 2020.
-
A power Schur complement Low-Rank correction preconditioner for general sparse linear systems
Authors:
Qingqing Zheng,
Yuanzhe Xi,
Yousef Saad
Abstract:
An effective power-based parallel preconditioner is proposed for general large sparse linear systems. The preconditioner combines a power series expansion method with some low-rank correction techniques, where the Sherman-Morrison-Woodbury formula is utilized. A matrix splitting of the Schur complement is proposed to expand the power series. The number of terms used in the power series expansion can control the approximation accuracy of the preconditioner to the inverse of the Schur complement. To construct the preconditioner, graph partitioning is invoked to reorder the original coefficient matrix, leading to a special block two-by-two matrix whose two off-diagonal submatrices are block diagonal. The interface variables are obtained by solving a linear system whose coefficient matrix is the Schur complement. For the interior variables, one only needs to solve a block diagonal linear system. This can be performed efficiently in parallel. Various numerical examples are provided to illustrate the efficiency of the proposed preconditioner.
Submitted 3 February, 2020;
originally announced February 2020.
-
Iterative methods for linear systems of equations: A brief historical journey
Authors:
Yousef Saad
Abstract:
This paper presents a brief historical survey of iterative methods for solving linear systems of equations. The journey begins with Gauss who developed the first known method that can be termed iterative. The early 20th century saw good progress of these methods which were initially used to solve least-squares systems, and then linear systems arising from the discretization of partial differential equations. Then iterative methods received a big impetus in the 1950s - partly because of the development of computers. The survey does not attempt to be exhaustive. Rather, the aim is to underline the way of thinking at a specific time and to highlight the major ideas that steered the field.
Submitted 2 August, 2019;
originally announced August 2019.
-
A rational approximation method for solving acoustic nonlinear eigenvalue problems
Authors:
Mohamed El-Guide,
Agnieszka Miedlar,
Yousef Saad
Abstract:
We present two approximation methods for computing eigenfrequencies and eigenmodes of large-scale nonlinear eigenvalue problems resulting from boundary element method (BEM) solutions of some types of acoustic eigenvalue problems in three-dimensional space. The main idea of the first method is to approximate the resulting boundary element matrix within a contour in the complex plane by a high accuracy rational approximation using the Cauchy integral formula. The second method is based on the Chebyshev interpolation within real intervals. A Rayleigh-Ritz procedure, which is suitable for parallelization, is developed for both the Cauchy and the Chebyshev approximation methods when dealing with large-scale practical applications. The performance of the proposed methods is illustrated with a variety of benchmark examples and large-scale industrial applications with degrees of freedom varying from several hundred up to around two million.
Submitted 7 June, 2019;
originally announced June 2019.
-
A rational approximation method for the nonlinear eigenvalue problem
Authors:
Yousef Saad,
Mohamed El-Guide,
Agnieszka Międlar
Abstract:
This paper presents a method for computing eigenvalues and eigenvectors for some types of nonlinear eigenvalue problems. The main idea is to approximate the functions involved in the eigenvalue problem by rational functions and then apply a form of linearization. Eigenpairs of the expanded form of this linearization are not extracted directly. Instead, its structure is exploited to develop a scheme that extracts all eigenvalues in a certain region of the complex plane by solving an eigenvalue problem of much smaller dimension. Because of its simple implementation and the ability to work efficiently in large dimensions, the presented method is appealing when solving challenging engineering problems. A few theoretical results are established to explain why the new approach works and numerical experiments are presented to validate the proposed algorithm.
Submitted 9 June, 2020; v1 submitted 4 January, 2019;
originally announced January 2019.
-
Solving the 3D High-Frequency Helmholtz Equation using Contour Integration and Polynomial Preconditioning
Authors:
Xiao Liu,
Yuanzhe Xi,
Yousef Saad,
Maarten V. de Hoop
Abstract:
We propose an iterative solution method for the 3D high-frequency Helmholtz equation that exploits a contour integral formulation of spectral projectors. In this framework, the solution in certain invariant subspaces is approximated by solving complex-shifted linear systems, resulting in faster GMRES iterations due to the restricted spectrum. The shifted systems are solved by exploiting a polynomial fixed-point iteration, which is a robust scheme even if the magnitude of the shift is small. Numerical tests in 3D indicate that $O(n^{1/3})$ matrix-vector products are needed to solve a high-frequency problem with a matrix size $n$ with high accuracy. The method has a small storage requirement, can be applied to both dense and sparse linear systems, and is highly parallelizable.
Submitted 29 November, 2018;
originally announced November 2018.
-
Find the dimension that counts: Fast dimension estimation and Krylov PCA
Authors:
Shashanka Ubaru,
Abd-Krim Seghouane,
Yousef Saad
Abstract:
High dimensional data and systems with many degrees of freedom are often characterized by covariance matrices. In this paper, we consider the problem of simultaneously estimating the dimension of the principal (dominant) subspace of these covariance matrices and obtaining an approximation to the subspace. This problem arises in the popular principal component analysis (PCA), and in many applications of machine learning, data analysis, signal and image processing, and others. We first present a novel method for estimating the dimension of the principal subspace. We then show how this method can be coupled with a Krylov subspace method to simultaneously estimate the dimension and obtain an approximation to the subspace. The dimension estimation is achieved at no additional cost. The proposed method operates on a model selection framework, where the novel selection criterion is derived based on random matrix perturbation theory ideas. We present theoretical analyses which (a) show that the proposed method achieves strong consistency (i.e., yields the optimal solution as the number of data-points $n\rightarrow \infty$), and (b) analyze conditions for exact dimension estimation in the finite $n$ case. Using recent results, we show that our algorithm also yields near optimal PCA. The proposed method avoids forming the sample covariance matrix (associated with the data) explicitly and computing the complete eigen-decomposition. Therefore, the method is inexpensive, which is particularly advantageous in modern data applications where the covariance matrices can be very large. Numerical experiments illustrate the performance of the proposed method in various applications.
Submitted 8 October, 2018;
originally announced October 2018.
-
Spectrum-Adapted Polynomial Approximation for Matrix Functions
Authors:
Li Fan,
David I Shuman,
Shashanka Ubaru,
Yousef Saad
Abstract:
We propose and investigate two new methods to approximate $f({\bf A}){\bf b}$ for large, sparse, Hermitian matrices ${\bf A}$. The main idea behind both methods is to first estimate the spectral density of ${\bf A}$, and then find polynomials of a fixed order that better approximate the function $f$ on areas of the spectrum with a higher density of eigenvalues. Compared to state-of-the-art methods such as the Lanczos method and truncated Chebyshev expansion, the proposed methods tend to provide more accurate approximations of $f({\bf A}){\bf b}$ at lower polynomial orders, and for matrices ${\bf A}$ with a large number of distinct interior eigenvalues and a small spectral width.
Submitted 28 August, 2018;
originally announced August 2018.
-
The Eigenvalues Slicing Library (EVSL): Algorithms, Implementation, and Software
Authors:
Ruipeng Li,
Yuanzhe Xi,
Lucas Erlandson,
Yousef Saad
Abstract:
This paper describes a software package called EVSL (for EigenValues Slicing Library) for solving large sparse real symmetric standard and generalized eigenvalue problems. As its name indicates, the package exploits spectrum slicing, a strategy that consists of dividing the spectrum into a number of subintervals and extracting eigenpairs from each subinterval independently. In order to enable such a strategy, the methods implemented in EVSL rely on a quick calculation of the spectral density of a given matrix, or a matrix pair. What distinguishes EVSL from other currently available packages is that EVSL relies entirely on filtering techniques. Polynomial and rational filtering are both implemented and are coupled with Krylov subspace methods and the subspace iteration algorithm. On the implementation side, the package offers interfaces for various scenarios including matrix-free modes, whereby the user can supply his/her own functions to perform matrix-vector operations or to solve sparse linear systems. The paper describes the algorithms in EVSL, provides details on their implementations, and discusses performance issues for the various methods.
Submitted 14 February, 2018;
originally announced February 2018.
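The matrix-free mode mentioned in the abstract above can be illustrated with a minimal Python sketch. It does not reproduce EVSL's interface: `laplacian_matvec` and `power_iteration` are hypothetical names, the 1D Laplacian is an assumed test operator, and plain power iteration stands in for the Krylov and subspace iteration solvers named in the abstract. The only point is that the solver sees the matrix through a user-supplied function.

```python
import math

def laplacian_matvec(x):
    """Apply the 1D Laplacian tridiag(-1, 2, -1) to x without storing the matrix."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        y[i] = 2.0 * x[i]
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y

def power_iteration(matvec, n, iters=2000):
    """Estimate the largest eigenvalue using only calls to matvec."""
    # Non-symmetric start vector (an assumption of this sketch) so that it
    # is not orthogonal to the dominant eigenvector of the Laplacian.
    x = [float(i + 1) for i in range(n)]
    lam = 0.0
    for _ in range(iters):
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]
        y = matvec(x)
        lam = sum(a * b for a, b in zip(x, y))  # Rayleigh quotient
        x = y
    return lam
```

Calling `power_iteration(laplacian_matvec, 10)` should approach the known largest eigenvalue of this operator, $2 + 2\cos(π/11)$; swapping in a different `matvec` requires no change to the solver.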
-
A Posteriori Error Estimate for Computing $\mathrm{tr}(f(A))$ by Using the Lanczos Method
Authors:
Jie Chen,
Yousef Saad
Abstract:
An outstanding problem when computing a function of a matrix, $f(A)$, by using a Krylov method is to accurately estimate errors when convergence is slow. Apart from the case of the exponential function which has been extensively studied in the past, there are no well-established solutions to the problem. Often the quantity of interest in applications is not the matrix $f(A)$ itself, but rather, matrix-vector products or bilinear forms. When the computation related to $f(A)$ is a building block of a larger problem (e.g., approximately computing its trace), a consequence of the lack of reliable error estimates is that the accuracy of the computed result is unknown. In this paper, we consider the problem of computing $\mathrm{tr}(f(A))$ for a symmetric positive-definite matrix $A$ by using the Lanczos method and make two contributions: (i) we propose an error estimate for the bilinear form associated with $f(A)$, and (ii) an error estimate for the trace of $f(A)$. We demonstrate the practical usefulness of these estimates for large matrices and in particular, show that the trace error estimate is indicative of the number of accurate digits. As an application, we compute the log-determinant of a covariance matrix in Gaussian process analysis and underline the importance of error tolerance as a stopping criterion, as a means of bounding the number of Lanczos steps to achieve a desired accuracy.
Submitted 13 February, 2018;
originally announced February 2018.
-
Beyond AMLS: Domain decomposition with rational filtering
Authors:
Vassilis Kalantzis,
Yuanzhe Xi,
Yousef Saad
Abstract:
This paper proposes a rational filtering domain decomposition technique for the solution of large and sparse symmetric generalized eigenvalue problems. The proposed technique is purely algebraic and decomposes the eigenvalue problem associated with each subdomain into two disjoint subproblems. The first subproblem is associated with the interface variables and accounts for the interaction among neighboring subdomains. To compute the solution of the original eigenvalue problem at the interface variables we leverage ideas from contour integral eigenvalue solvers. The second subproblem is associated with the interior variables in each subdomain and can be solved in parallel among the different subdomains using real arithmetic only. Compared to rational filtering projection methods applied to the original matrix pencil, the proposed technique integrates only a part of the matrix resolvent while it applies any orthogonalization necessary to vectors whose length is equal to the number of interface variables. In addition, no estimation of the number of eigenvalues lying inside the interval of interest is needed. Numerical experiments performed in distributed memory architectures illustrate the competitiveness of the proposed technique against rational filtering Krylov approaches.
Submitted 26 November, 2017;
originally announced November 2017.
-
Sampling and multilevel coarsening algorithms for fast matrix approximations
Authors:
Shashanka Ubaru,
Yousef Saad
Abstract:
This paper addresses matrix approximation problems for matrices that are large, sparse and/or that are representations of large graphs. To tackle these problems, we consider algorithms that are based primarily on coarsening techniques, possibly combined with random sampling. A multilevel coarsening technique is proposed which utilizes a hypergraph associated with the data matrix and a graph coarsening strategy based on column matching. Theoretical results are established that characterize the quality of the dimension reduction achieved by a coarsening step, when a proper column matching strategy is employed. We consider a number of standard applications of this technique as well as a few new ones. Among the standard applications we first consider the problem of computing the partial SVD for which a combination of sampling and coarsening yields significantly improved SVD results relative to sampling alone. We also consider the column subset selection problem, a popular low rank approximation method used in data related applications, and show how multilevel coarsening can be adapted for this problem. Similarly, we consider the problem of graph sparsification and show how coarsening techniques can be employed to solve it. Numerical experiments illustrate the performance of the methods in various applications.
Submitted 1 October, 2018; v1 submitted 1 November, 2017;
originally announced November 2017.
-
Fast computation of spectral densities for generalized eigenvalue problems
Authors:
Yuanzhe Xi,
Ruipeng Li,
Yousef Saad
Abstract:
The distribution of the eigenvalues of a Hermitian matrix (or of a Hermitian matrix pencil) reveals important features of the underlying problem, whether a Hamiltonian system in physics, or a social network in behavioral sciences. However, computing all the eigenvalues explicitly is prohibitively expensive for real-world applications. This paper presents two types of methods to efficiently estimate the spectral density of a matrix pencil $(A, B)$ when both $A$ and $B$ are Hermitian and, in addition, $B$ is positive definite. The first one is based on the Kernel Polynomial Method (KPM) and the second on Gaussian quadrature by the Lanczos procedure. By employing Chebyshev polynomial approximation techniques, we can avoid direct factorizations in both methods, making the resulting algorithms suitable for large matrices. Under some assumptions, we prove bounds that suggest that the Lanczos method converges twice as fast as the KPM method. Numerical examples further indicate that the Lanczos method can provide more accurate spectral densities when the eigenvalue distribution is highly non-uniform. As an application, we show how to use the computed spectral density to partition the spectrum into intervals that contain roughly the same number of eigenvalues. This procedure, which makes it possible to compute the spectrum by parts, is a key ingredient in the new breed of eigensolvers that exploit "spectrum slicing".
Submitted 20 June, 2017;
originally announced June 2017.
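The partitioning step described at the end of the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's procedure: the density is assumed to be already sampled on a grid (`ts`, `phis`), the integral uses the trapezoidal rule, and `slice_spectrum` is a hypothetical helper name.

```python
def slice_spectrum(ts, phis, n_slices):
    """Split [ts[0], ts[-1]] into n_slices intervals of equal density mass.

    ts   : increasing grid points covering the spectrum
    phis : non-negative density samples at those points (positive total mass)
    """
    # Cumulative trapezoidal integral of the density on the grid.
    cum = [0.0]
    for i in range(1, len(ts)):
        cum.append(cum[-1] + 0.5 * (phis[i] + phis[i - 1]) * (ts[i] - ts[i - 1]))
    total = cum[-1]

    cuts = [ts[0]]
    i = 1
    for k in range(1, n_slices):
        target = total * k / n_slices
        # Advance to the grid cell whose cumulative mass brackets the target.
        while cum[i] < target:
            i += 1
        # Linear interpolation of the cumulative mass inside that cell.
        frac = (target - cum[i - 1]) / (cum[i] - cum[i - 1])
        cuts.append(ts[i - 1] + frac * (ts[i] - ts[i - 1]))
    cuts.append(ts[-1])
    return cuts
```

Because the density integrates to the fraction of eigenvalues in an interval, equal-mass slices contain roughly equal numbers of eigenvalues, to the extent that the sampled density is accurate.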
-
Solving Almost all Systems of Random Quadratic Equations
Authors:
Gang Wang,
Georgios B. Giannakis,
Yousef Saad,
Jie Chen
Abstract:
This paper deals with finding an $n$-dimensional solution $x$ to a system of quadratic equations of the form $y_i=|\langle{a}_i,x\rangle|^2$ for $1\le i \le m$, which is also known as phase retrieval and is NP-hard in general. We put forth a novel procedure for minimizing the amplitude-based least-squares empirical loss, that starts with a weighted maximal correlation initialization obtainable with a few power or Lanczos iterations, followed by successive refinements based upon a sequence of iteratively reweighted (generalized) gradient iterations. Both stages (the initialization and the gradient flow) distinguish themselves from prior contributions by the inclusion of a fresh (re)weighting regularization technique. The overall algorithm is conceptually simple, numerically scalable, and easy to implement. For certain random measurement models, the novel procedure is shown capable of finding the true solution $x$ in time proportional to reading the data $\{(a_i;y_i)\}_{1\le i \le m}$. This holds with high probability and without extra assumptions on the signal $x$ to be recovered, provided that the number $m$ of equations is some constant $c>0$ times the number $n$ of unknowns in the signal vector, namely, $m>cn$. Empirically, the upshots of this contribution are: i) (almost) $100\%$ perfect signal recovery in the high-dimensional (e.g., $n\ge 2,000$) regime given only an information-theoretic limit number of noiseless equations, namely, $m=2n-1$ in the real-valued Gaussian case; and, ii) (nearly) optimal statistical accuracy in the presence of additive noise of bounded support. Finally, substantial numerical tests using both synthetic data and real images corroborate markedly improved signal recovery performance and computational efficiency of our novel procedure relative to state-of-the-art approaches.
Submitted 29 May, 2017;
originally announced May 2017.
-
SMASH: Structured matrix approximation by separation and hierarchy
Authors:
Difeng Cai,
Edmond Chow,
Yousef Saad,
Yuanzhe Xi
Abstract:
This paper presents an efficient method to perform Structured Matrix Approximation by Separation and Hierarchy (SMASH), when the original dense matrix is associated with a kernel function. Given points in a domain, a tree structure is first constructed based on an adaptive partitioning of the computational domain to facilitate subsequent approximation procedures. In contrast to existing schemes based on either analytic or purely algebraic approximations, SMASH takes advantage of both approaches and greatly improves the efficiency. The algorithm follows a bottom-up traversal of the tree and is able to perform the operations associated with each node on the same level in parallel. A strong rank-revealing factorization is applied to the initial analytic approximation in the separation regime so that a special structure is incorporated into the final nested bases. As a consequence, the storage is significantly reduced on one hand and a hierarchy of the original grid is constructed on the other hand. Due to this hierarchy, nested bases at upper levels can be computed in a similar way as the leaf level operations but on coarser grids. The main advantages of SMASH include its simplicity of implementation, its flexibility to construct various hierarchical rank structures and a low storage cost. Rigorous error analysis and complexity analysis are conducted to show that this scheme is fast and stable. The efficiency and robustness of SMASH are demonstrated through various test problems arising from integral equations, structured matrices, etc.
Submitted 15 May, 2017;
originally announced May 2017.
-
Fast estimation of approximate matrix ranks using spectral densities
Authors:
Shashanka Ubaru,
Yousef Saad,
Abd-Krim Seghouane
Abstract:
Many machine learning and data-related applications require knowledge of the approximate ranks of large data matrices. In this paper, we present two computationally inexpensive techniques to estimate the approximate ranks of such large matrices. These techniques exploit approximate spectral densities, popular in physics, which are probability density distributions that measure the likelihood of finding eigenvalues of the matrix at a given point on the real line. Integrating the spectral density over an interval gives the eigenvalue count of the matrix in that interval. Therefore the rank can be approximated by integrating the spectral density over a carefully selected interval. Two different approaches are discussed to estimate the approximate rank, one based on Chebyshev polynomials and the other based on the Lanczos algorithm. In order to obtain the appropriate interval, it is necessary to locate a gap between the eigenvalues that correspond to noise and the relevant eigenvalues that contribute to the matrix rank. A method for locating this gap and selecting the interval of integration is proposed based on the plot of the spectral density. Numerical experiments illustrate the performance of these techniques on matrices from typical applications.
Submitted 19 August, 2016;
originally announced August 2016.
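The relation between spectral density, eigenvalue count, and approximate rank stated in the abstract above can be written out as a short worked formula. The notation is an assumption made here for illustration: $A$ is taken to be an $n\times n$ symmetric positive semidefinite matrix with eigenvalues $\lambda_1,\dots,\lambda_n$, the threshold $\varepsilon$ is taken to lie in the gap between the noise eigenvalues and the relevant ones, and the interval endpoints are assumed not to coincide with eigenvalues.

```latex
\phi(t) = \frac{1}{n}\sum_{j=1}^{n}\delta(t-\lambda_j),
\qquad
\mu_{[a,b]} = n\int_a^b \phi(t)\,dt ,
\qquad
r_\varepsilon = \mu_{[\varepsilon,\lambda_{\max}]}
             = n\int_{\varepsilon}^{\lambda_{\max}} \phi(t)\,dt .
```

Here $\mu_{[a,b]}$ is the number of eigenvalues in $[a,b]$ and $r_\varepsilon$ is the approximate rank, so the quality of the rank estimate depends on how well $\phi$ is approximated and on where $\varepsilon$ is placed.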
-
Low rank approximation and decomposition of large matrices using error correcting codes
Authors:
Shashanka Ubaru,
Arya Mazumdar,
Yousef Saad
Abstract:
Low rank approximation is an important tool used in many applications of signal processing and machine learning. Recently, randomized sketching algorithms were proposed to effectively construct low rank approximations and obtain approximate singular value decompositions of large matrices. Similar ideas were used to solve least squares regression problems. In this paper, we show how matrices from error correcting codes can be used to find such low rank approximations and matrix decompositions, and extend the framework to linear least squares regression problems. The benefits of using these code matrices are the following: (i) They are easy to generate and they reduce randomness significantly. (ii) Code matrices with mild properties satisfy the subspace embedding property, and have a better chance of preserving the geometry of an entire subspace of vectors. (iii) For parallel and distributed applications, code matrices have significant advantages over structured random matrices and Gaussian random matrices. (iv) Unlike Fourier or Hadamard transform matrices, which require sampling $O(k\log k)$ columns for a rank-$k$ approximation, the log factor is not necessary for certain types of code matrices. That is, $(1+\epsilon)$ optimal Frobenius norm error can be achieved for a rank-$k$ approximation with $O(k/\epsilon)$ samples. (v) Fast multiplication is possible with structured code matrices, so fast approximations can be achieved for general dense input matrices. (vi) For the least squares regression problem $\min\|Ax-b\|_2$ where $A\in \mathbb{R}^{n\times d}$, the $(1+\epsilon)$ relative error approximation can be achieved with $O(d/\epsilon)$ samples, with high probability, when certain code matrices are used.
Submitted 15 June, 2017; v1 submitted 30 December, 2015;
originally announced December 2015.
-
A Thick-Restart Lanczos algorithm with polynomial filtering for Hermitian eigenvalue problems
Authors:
Ruipeng Li,
Yuanzhe Xi,
Eugene Vecharynski,
Chao Yang,
Yousef Saad
Abstract:
Polynomial filtering can provide a highly effective means of computing all eigenvalues of a real symmetric (or complex Hermitian) matrix that are located in a given interval, anywhere in the spectrum. This paper describes a technique for tackling this problem by combining a Thick-Restart version of the Lanczos algorithm with deflation ('locking') and a new type of polynomial filter obtained from a least-squares technique. The resulting algorithm can be utilized in a 'spectrum-slicing' approach whereby a very large number of eigenvalues and associated eigenvectors of the matrix are computed by extracting eigenpairs located in different sub-intervals independently from one another.
Submitted 26 December, 2015;
originally announced December 2015.
-
Low-rank correction methods for algebraic domain decomposition preconditioners
Authors:
Ruipeng Li,
Yousef Saad
Abstract:
This paper presents a parallel preconditioning method for distributed sparse linear systems, based on an approximate inverse of the original matrix, that adopts a general framework of distributed sparse matrices and exploits the domain decomposition method and low-rank corrections. The domain decomposition approach decouples the matrix and once inverted, a low-rank approximation is applied by exploiting the Sherman-Morrison-Woodbury formula, which yields two variants of the preconditioning methods. The low-rank expansion is computed by the Lanczos procedure with reorthogonalizations. Numerical experiments indicate that, when combined with Krylov subspace accelerators, this preconditioner can be efficient and robust for solving symmetric sparse linear systems. Comparisons with other distributed-memory preconditioning methods are presented.
Submitted 29 May, 2015; v1 submitted 16 May, 2015;
originally announced May 2015.
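The role of the Sherman-Morrison-Woodbury formula in the abstract above can be illustrated with its rank-1 special case (the Sherman-Morrison formula). The sketch below is not the paper's preconditioner: it assumes a diagonal matrix `d` as a stand-in for the easily inverted decoupled matrix, a single rank-1 correction `u v^T`, and hypothetical function names.

```python
def solve_diag_plus_rank1(d, u, v, b):
    """Solve (D + u v^T) x = b with D = diag(d), via Sherman-Morrison."""
    dinv_b = [bi / di for bi, di in zip(b, d)]   # D^{-1} b
    dinv_u = [ui / di for ui, di in zip(u, d)]   # D^{-1} u
    denom = 1.0 + sum(vi * wi for vi, wi in zip(v, dinv_u))
    if denom == 0.0:
        raise ZeroDivisionError("D + u v^T is singular")
    coeff = sum(vi * yi for vi, yi in zip(v, dinv_b)) / denom
    # x = D^{-1} b - D^{-1} u * (v^T D^{-1} b) / (1 + v^T D^{-1} u)
    return [yi - coeff * wi for yi, wi in zip(dinv_b, dinv_u)]

def apply_diag_plus_rank1(d, u, v, x):
    """Compute (D + u v^T) x; used here only to check the solve."""
    vx = sum(vi * xi for vi, xi in zip(v, x))
    return [di * xi + ui * vx for di, ui, xi in zip(d, u, x)]
```

The design point is that only solves with the simple matrix and a small correction are needed; with a rank-$k$ correction the scalar `denom` becomes a $k\times k$ system.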
-
Schur Complement based domain decomposition preconditioners with Low-rank corrections
Authors:
Ruipeng Li,
Yuanzhe Xi,
Yousef Saad
Abstract:
This paper introduces a robust preconditioner for general sparse symmetric matrices, that is based on low-rank approximations of the Schur complement in a Domain Decomposition (DD) framework. In this "Schur Low Rank" (SLR) preconditioning approach, the coefficient matrix is first decoupled by DD, and then a low-rank correction is exploited to compute an approximate inverse of the Schur complement associated with the interface points. The method avoids explicit formation of the Schur complement matrix. We show the feasibility of this strategy for a model problem, and conduct a detailed spectral analysis for the relationship between the low-rank correction and the quality of the preconditioning. Numerical experiments on general matrices illustrate the robustness and efficiency of the proposed approach.
Submitted 16 May, 2015;
originally announced May 2015.
-
Fast updating algorithms for latent semantic indexing
Authors:
Eugene Vecharynski,
Yousef Saad
Abstract:
This paper discusses a few algorithms for updating the approximate Singular Value Decomposition (SVD) in the context of information retrieval by Latent Semantic Indexing (LSI) methods. A unifying framework is considered which is based on Rayleigh-Ritz projection methods. First, a Rayleigh-Ritz approach for the SVD is discussed and it is then used to interpret the Zha-Simon algorithms [SIAM J. Scient. Comput. vol. 21 (1999), pp. 782-791]. This viewpoint leads to a few alternatives whose goal is to reduce computational cost and storage requirements by projection techniques that utilize subspaces of much smaller dimension. Numerical experiments show that the proposed algorithms yield accuracies comparable to those obtained from standard ones at a much lower computational cost.
Submitted 13 May, 2014; v1 submitted 8 October, 2013;
originally announced October 2013.
-
Approximating spectral densities of large matrices
Authors:
Lin Lin,
Yousef Saad,
Chao Yang
Abstract:
In physics, it is sometimes desirable to compute the so-called \emph{Density Of States} (DOS), also known as the \emph{spectral density}, of a real symmetric matrix $A$. The spectral density can be viewed as a probability density distribution that measures the likelihood of finding eigenvalues near some point on the real line. The most straightforward way to obtain this density is to compute all eigenvalues of $A$. But this approach is generally costly and wasteful, especially for matrices of large dimension. There exist alternative methods that allow us to estimate the spectral density function at much lower cost. The major computational cost of these methods is in multiplying $A$ with a number of vectors, which makes them appealing for large-scale problems where products of the matrix $A$ with arbitrary vectors are relatively inexpensive. This paper defines the problem of estimating the spectral density carefully, and discusses how to measure the accuracy of an approximate spectral density. It then surveys a few known methods for estimating the spectral density, and proposes some new variations of existing methods. All methods are discussed from a numerical linear algebra point of view.
Submitted 4 October, 2014; v1 submitted 25 August, 2013;
originally announced August 2013.
-
Efficient estimation of eigenvalue counts in an interval
Authors:
Edoardo Di Napoli,
Eric Polizzi,
Yousef Saad
Abstract:
Estimating the number of eigenvalues located in a given interval of a large sparse Hermitian matrix is an important problem in certain applications and it is a prerequisite of eigensolvers based on a divide-and-conquer paradigm. Often an exact count is not necessary and methods based on stochastic estimates can be utilized to yield rough approximations. This paper examines a number of techniques tailored to this specific task. It reviews standard approaches and explores new ones based on polynomial and rational approximation filtering combined with a stochastic procedure.
Submitted 5 August, 2014; v1 submitted 20 August, 2013;
originally announced August 2013.
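The combination of filtering with a stochastic procedure described in the abstract above can be summarized by a generic worked formula. The notation is an assumption made here for illustration: $A$ is an $n\times n$ Hermitian matrix with orthonormal eigenvectors $u_j$ and eigenvalues $\lambda_j$, the $v_k$ are $n_v$ random probe vectors with $\mathbb{E}[v_k v_k^H] = I$ (for example, independent $\pm 1$ entries), and $\psi$ denotes a polynomial or rational function that approximates the indicator of $[a,b]$.

```latex
P = \sum_{\lambda_j \in [a,b]} u_j u_j^{H},
\qquad
\mu_{[a,b]} = \operatorname{tr}(P)
\approx \frac{1}{n_v}\sum_{k=1}^{n_v} v_k^{H} P\, v_k ,
\qquad
P \approx \psi(A).
```

Since $\psi(A)v_k$ requires only matrix-vector products (polynomial case) or linear solves (rational case), the count can be estimated without computing any eigenvalues; the estimate is approximate both because of the probe vectors and because $\psi$ only approximates the indicator function.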
-
Graph partitioning using matrix values for preconditioning symmetric positive definite systems
Authors:
Eugene Vecharynski,
Yousef Saad,
Masha Sosonkina
Abstract:
Prior to the parallel solution of a large linear system, it is required to perform a partitioning of its equations/unknowns. Standard partitioning algorithms are designed using the considerations of the efficiency of the parallel matrix-vector multiplication, and typically disregard the information on the coefficients of the matrix. This information, however, may have a significant impact on the quality of the preconditioning procedure used within the chosen iterative scheme. In the present paper, we suggest a spectral partitioning algorithm, which takes into account the information on the matrix coefficients and constructs partitions with respect to the objective of enhancing the quality of the nonoverlapping additive Schwarz (block Jacobi) preconditioning for symmetric positive definite linear systems. For a set of test problems with large variations in magnitudes of matrix coefficients, our numerical experiments demonstrate a noticeable improvement in the convergence of the resulting solution scheme when using the new partitioning approach.
Submitted 17 November, 2013; v1 submitted 26 October, 2011;
originally announced October 2011.