-
Towards a Scalable Hierarchical High-order CFD Solver
Authors:
Zan Xu,
Léopold Cambier,
Juan J. Alonso,
Eric Darve
Abstract:
Development of highly scalable and robust algorithms for large-scale CFD simulations has been identified as one of the key ingredients to achieve NASA's CFD Vision 2030 goals. In order to improve simulation capability and to effectively leverage new high-performance computing hardware, the most computationally intensive parts of CFD solution algorithms -- namely, linear solvers and preconditioners -- need to achieve asymptotic behavior on massively parallel and heterogeneous architectures and preserve convergence rates as the meshes are refined further. In this work, we present a scalable high-order implicit Discontinuous Galerkin solver from the SU2 framework using a promising preconditioning technique based on an algebraic sparsified nested dissection algorithm with low-rank approximations, and communication-avoiding Krylov subspace methods to enable scalability at very large processor counts. The overall approach is tested on a canonical 2D NACA0012 test case of increasing size to demonstrate its scalability on multiple processing cores. Both the preconditioner and the linear solver are shown to exhibit near-linear weak scaling up to 2,048 cores with no significant degradation of the convergence rate.
Submitted 5 January, 2021;
originally announced January 2021.
-
The Index of Invariance and its Implications for a Parameterized Least Squares Problem
Authors:
Léopold Cambier,
Rahul Sarkar
Abstract:
We study the problem $x_{b,ω} := \text{arg min}_{x \in \mathcal{S}} \|(A + ωI)^{-1/2} (b - Ax)\|_2$, with $A = A^*$, for a subspace $\mathcal{S}$ of $\mathbb{F}^n$ ($\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$), and $ω> -λ_{min}(A)$. We show that there exists a subspace $\mathcal{Y}$ of $\mathbb{F}^n$, independent of $b$, such that $\{x_{b,ω} - x_{b,μ} \mid ω,μ> -λ_{min}(A)\} \subseteq \mathcal{Y}$, where $\dim(\mathcal{Y}) \leq \dim(\mathcal{S} + A\mathcal{S}) - \dim(\mathcal{S}) = \mathbf{Ind}_A(\mathcal{S})$, a quantity which we call the index of invariance of $\mathcal{S}$ with respect to $A$. In particular if $\mathcal{S}$ is a Krylov subspace, this implies the low dimensionality result of Hallman & Gu (2018). The problem is also such that when $A$ is positive and $\mathcal{S}$ is a Krylov subspace, it reduces to CG for $ω= 0$ and to MINRES for $ω\to \infty$. We study several properties of $\mathbf{Ind}_A(\mathcal{S})$ in relation to $A$ and $\mathcal{S}$. We show that the dimension of the affine subspace $\mathcal{X}_b$ containing the solutions $x_{b,ω}$ can be smaller than $\mathbf{Ind}_A(\mathcal{S})$ for all $b$. However, we also exhibit some sufficient conditions on $A$ and $\mathcal{S}$, under which $\mathcal{X} := \text{Span}{\{x_{b,ω} - x_{b,μ} \mid b \in \mathbb{F}^n, ω,μ> -λ_{min}(A)\}}$ has dimension equal to $\mathbf{Ind}_A(\mathcal{S})$. We then study the injectivity of the map $ω\mapsto x_{b,ω}$, leading us to a proof of the convexity result from Hallman & Gu (2018). We finish by showing that sets such as $M(\mathcal{S},\mathcal{S}') = \{A \in \mathbb{F}^{n \times n} \mid \mathcal{S} + A\mathcal{S} = \mathcal{S}'\}$, for nested subspaces $\mathcal{S} \subseteq \mathcal{S}' \subseteq \mathbb{F}^n$, form smooth real manifolds, and explore some topological relationships between them.
Submitted 1 September, 2020; v1 submitted 25 August, 2020;
originally announced August 2020.
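The index of invariance defined in the abstract above, $\mathbf{Ind}_A(\mathcal{S}) = \dim(\mathcal{S} + A\mathcal{S}) - \dim(\mathcal{S})$, can be checked on a small example. The following is a minimal sketch, not code from the paper: the helper names `rank`, `matvec`, and `index_of_invariance` are hypothetical, the matrix $A = \mathrm{diag}(1, \dots, 6)$ and the vector $b$ of ones are arbitrary choices, and ranks are computed exactly over the rationals to avoid any tolerance. It illustrates that a Krylov subspace that has not yet become invariant has index 1, while a subspace spanned by eigenvectors has index 0.

```python
from fractions import Fraction

def rank(vectors):
    """Exact rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        # Find a row at or below position rk with a nonzero entry in column c.
        pivot = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            f = rows[i][c] / rows[rk][c]
            if f != 0:
                rows[i] = [a - f * p for a, p in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def matvec(A, v):
    """Dense matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def index_of_invariance(A, S):
    """dim(S + A S) - dim(S), where S is given as a list of spanning vectors."""
    AS = [matvec(A, s) for s in S]
    return rank(S + AS) - rank(S)

n = 6
A = [[(i + 1) if i == j else 0 for j in range(n)] for i in range(n)]  # diag(1..6), symmetric
b = [1] * n

# Krylov subspace span{b, A b, A^2 b}.
krylov = [b]
for _ in range(2):
    krylov.append(matvec(A, krylov[-1]))

# Invariant subspace spanned by the first three eigenvectors e_1, e_2, e_3.
eigenspace = [[1 if i == j else 0 for i in range(n)] for j in range(3)]

ind_krylov = index_of_invariance(A, krylov)
ind_eigen = index_of_invariance(A, eigenspace)
print(ind_krylov)  # 1
print(ind_eigen)   # 0
```

Exact rational arithmetic is used here only because the example is tiny; for floating-point data a rank-revealing factorization with an explicit tolerance would be the usual substitute.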
-
Second Order Accurate Hierarchical Approximate Factorization of Sparse SPD Matrices
Authors:
Bazyli Klockiewicz,
Léopold Cambier,
Ryan Humble,
Hamdi Tchelepi,
Eric Darve
Abstract:
We describe a second-order accurate approach to sparsifying the off-diagonal blocks in the hierarchical approximate factorizations of sparse symmetric positive definite matrices. The norm of the error made by the new approach depends quadratically, not linearly, on the error in the low-rank approximation of the given block. The analysis of the resulting two-level preconditioner shows that the preconditioner is second-order accurate as well. We incorporate the new approach into the recent Sparsified Nested Dissection algorithm [SIAM J. Matrix Anal. Appl., 41 (2020), pp. 715-746], and test it on a wide range of problems. The new approach halves the number of Conjugate Gradient iterations needed for convergence, with almost the same factorization complexity, improving the total runtimes of the algorithm. Our approach can be incorporated into other rank-structured methods for solving sparse linear systems.
Submitted 3 August, 2020; v1 submitted 1 July, 2020;
originally announced July 2020.
-
An Algebraic Sparsified Nested Dissection Algorithm Using Low-Rank Approximations
Authors:
Léopold Cambier,
Chao Chen,
Erik G Boman,
Sivasankaran Rajamanickam,
Raymond S. Tuminaro,
Eric Darve
Abstract:
We propose a new algorithm for the fast solution of large, sparse, symmetric positive-definite linear systems, spaND -- sparsified Nested Dissection. It is based on nested dissection, sparsification and low-rank compression. After eliminating all interiors at a given level of the elimination tree, the algorithm sparsifies all separators corresponding to the interiors. This operation reduces the size of the separators by eliminating some degrees of freedom but without introducing any fill-in. This is done at the expense of a small and controllable approximation error. The result is an approximate factorization that can be used as an efficient preconditioner. We then perform several numerical experiments to evaluate this algorithm. We demonstrate that a version using orthogonal factorization and block-diagonal scaling takes fewer CG iterations to converge than previous similar algorithms on various kinds of problems. Furthermore, this algorithm is provably guaranteed to never break down and the matrix stays symmetric positive-definite throughout the process. We evaluate the algorithm on some large problems and show it exhibits near-linear scaling. The factorization time is roughly O(N) and the number of iterations grows slowly with N.
Submitted 27 January, 2020; v1 submitted 9 January, 2019;
originally announced January 2019.
-
A Robust Hierarchical Solver for Ill-conditioned Systems with Applications to Ice Sheet Modeling
Authors:
Chao Chen,
Leopold Cambier,
Erik G. Boman,
Sivasankaran Rajamanickam,
Raymond S. Tuminaro,
Eric Darve
Abstract:
A hierarchical solver is proposed for solving sparse ill-conditioned linear systems in parallel. The solver is based on a modification of the LoRaSp method, but employs a deferred-compression technique, which provably reduces the approximation error and significantly improves efficiency. Moreover, the deferred-compression technique introduces minimal overhead and does not affect parallelism. As a result, the new solver achieves linear computational complexity under mild assumptions and excellent parallel scalability. To demonstrate the performance of the new solver, we focus on applying it to solve sparse linear systems arising from ice sheet modeling. The strong anisotropic phenomena associated with the thin structure of ice sheets create serious challenges for existing solvers. To address the anisotropy, we additionally developed a customized partitioning scheme for the solver, which captures the strong-coupling direction accurately. In general, the partitioning can be computed algebraically with existing software packages, and thus the new solver is generalizable for solving other sparse linear systems. Our results show that ice sheet problems of about 300 million degrees of freedom have been solved in just a few minutes using a thousand processors.
Submitted 29 November, 2018; v1 submitted 27 November, 2018;
originally announced November 2018.
-
Low-Rank Kernel Matrix Approximation Using Skeletonized Interpolation With Endo- or Exo-Vertices
Authors:
Zixi Xu,
Léopold Cambier,
François-Henry Rouet,
Pierre L'Eplatennier,
Yun Huang,
Cleve Ashcraft,
Eric Darve
Abstract:
The efficient compression of kernel matrices, for instance the off-diagonal blocks of discretized integral equations, is a crucial step in many algorithms. In this paper, we study the application of Skeletonized Interpolation to construct such factorizations. In particular, we study four different strategies for selecting the initial candidate pivots of the algorithm: Chebyshev grids, points on a sphere, maximally-dispersed and random vertices. Among them, the first two introduce new interpolation points (exo-vertices) while the last two are subsets of the given clusters (endo-vertices). We perform experiments using three real-world problems coming from the multiphysics code LS-DYNA. The pivot selection strategies are compared in terms of quality (final rank) and efficiency (size of the initial grid). These benchmarks demonstrate that, overall, maximally-dispersed vertices provide an accurate and efficient set of pivots for most applications. They reach near-optimal ranks while starting with relatively small sets of vertices, compared to other strategies.
Submitted 12 July, 2018;
originally announced July 2018.
-
Fast Low-Rank Kernel Matrix Factorization through Skeletonized Interpolation
Authors:
Léopold Cambier,
Eric Darve
Abstract:
Integral equations are commonly encountered when solving complex physical problems. Their discretization leads to a dense kernel matrix that is block or hierarchically low-rank. This paper proposes a new way to build a low-rank factorization of those low-rank blocks at a nearly optimal cost of $\mathcal{O}(nr)$ for an $n \times n$ block submatrix of rank $r$. This is done by first sampling the kernel function at new interpolation points, then selecting a subset of those using a CUR decomposition and finally using this reduced set of points as pivots for a RRLU-type factorization. We also explain how this implicitly builds an optimal interpolation basis for the kernel under consideration. We show the asymptotic convergence of the algorithm, explain its stability, and demonstrate on numerical examples that it performs very well in practice, yielding a rank nearly equal to the optimal rank at a fraction of the cost of the naive algorithm.
Submitted 6 May, 2019; v1 submitted 8 June, 2017;
originally announced June 2017.
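The first ingredient named in the abstract above, sampling the kernel at new interpolation points, can be illustrated in isolation. The sketch below is not the Skeletonized Interpolation algorithm itself: it omits the CUR-based subset selection and the RRLU-type factorization, and only shows that plain Chebyshev interpolation of a kernel between two well-separated clusters already gives an accurate rank-$r$ approximation $K(x, y) \approx \sum_{i,j} L_i(x)\, K(\hat{x}_i, \hat{y}_j)\, L_j(y)$. The kernel $1/|x - y|$, the intervals $[0, 1]$ and $[2, 3]$, the choice $r = 8$, and all function names are assumptions made for illustration.

```python
import math

def cheb_nodes(a, b, r):
    """r Chebyshev nodes (roots of T_r) mapped to the interval [a, b]."""
    return [0.5 * (a + b) + 0.5 * (b - a) * math.cos((2 * i + 1) * math.pi / (2 * r))
            for i in range(r)]

def lagrange_row(nodes, x):
    """Values at x of all Lagrange basis polynomials built on `nodes`."""
    row = []
    for i, xi in enumerate(nodes):
        w = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                w *= (x - xj) / (xi - xj)
        row.append(w)
    return row

def kernel(x, y):
    """Example kernel (an assumption): smooth as long as x and y stay separated."""
    return 1.0 / abs(x - y)

def interpolated_kernel(x, y, xhat, yhat):
    """Rank-r approximation: sum_ij L_i(x) K(xhat_i, yhat_j) L_j(y)."""
    lx = lagrange_row(xhat, x)
    ly = lagrange_row(yhat, y)
    return sum(lx[i] * kernel(xhat[i], yhat[j]) * ly[j]
               for i in range(len(xhat)) for j in range(len(yhat)))

r = 8
xhat = cheb_nodes(0.0, 1.0, r)  # interpolation points for the cluster in [0, 1]
yhat = cheb_nodes(2.0, 3.0, r)  # interpolation points for the cluster in [2, 3]

# Two well-separated clusters of 50 points each.
X = [i / 49 for i in range(50)]
Y = [2.0 + j / 49 for j in range(50)]

max_err = max(abs(kernel(x, y) - interpolated_kernel(x, y, xhat, yhat))
              for x in X for y in Y)
print(max_err)
```

Here the 50 x 50 block is represented through an 8 x 8 matrix of kernel samples and two 50 x 8 matrices of Lagrange weights. The abstract describes reducing the interpolation points further and using them as pivots, which this sketch does not attempt.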