-
A time-frequency method for acoustic scattering with trapping
Authors:
Heather Wilber,
Wietse Vaes,
Abinand Gopal,
Gunnar Martinsson
Abstract:
A Fourier transform method is introduced for a class of hybrid time-frequency methods that solve the acoustic scattering problem in regimes where the solution exhibits both highly oscillatory behavior and slow decay in time. This extends the applicability of hybrid time-frequency schemes to domains with trapping regions. A fast sinc transform technique for managing highly oscillatory behavior and long time horizons is combined with a contour integration scheme that improves smoothness properties in the integrand.
Submitted 18 June, 2025;
originally announced June 2025.
-
Efficient algorithms for computing a rank-revealing UTV factorization on parallel computing architectures
Authors:
N. Heavner,
F. D. Igual,
G. Quintana-Ortí,
P. G. Martinsson
Abstract:
The randomized singular value decomposition (RSVD) is by now a well established technique for efficiently computing an approximate singular value decomposition of a matrix. Building on the ideas that underpin the RSVD, the recently proposed algorithm "randUTV" computes a FULL factorization of a given matrix that provides low-rank approximations with near-optimal error. Because the bulk of randUTV is cast in terms of communication-efficient operations like matrix-matrix multiplication and unpivoted QR factorizations, it is faster than competing rank-revealing factorization methods like column pivoted QR in most high performance computational settings. In this article, optimized randUTV implementations are presented for both shared memory and distributed memory computing environments. For shared memory, randUTV is redesigned in terms of an "algorithm-by-blocks" that, together with a runtime task scheduler, eliminates bottlenecks from data synchronization points to achieve acceleration over the standard "blocked algorithm", based on a purely fork-join approach. The distributed memory implementation is based on the ScaLAPACK library. The performances of our new codes compare favorably with competing factorizations available on both shared memory and distributed memory architectures.
Submitted 12 April, 2021;
originally announced April 2021.
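The basic RSVD that the abstract above builds on can be sketched in a few lines. This is a minimal illustration of the standard range-finder formulation, not the randUTV algorithm or the authors' optimized code; the function name `randomized_svd`, the `oversample` parameter, and the test matrix are assumptions made for this example.

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Approximate rank-k SVD of A via a randomized range finder (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    ell = min(k + oversample, n)           # number of random samples (hypothetical default)
    Omega = rng.standard_normal((n, ell))  # Gaussian test matrix
    Y = A @ Omega                          # sample the column space of A
    Q, _ = np.linalg.qr(Y)                 # orthonormal basis for the sampled range
    B = Q.T @ A                            # small ell x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub                             # lift the left factor back to m dimensions
    return U[:, :k], s[:k], Vt[:k, :]

# Usage on a synthetic matrix of exact rank 8.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 8)) @ rng.standard_normal((8, 120))
U, s, Vt = randomized_svd(A, k=8)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
```

Almost all of the work sits in the two matrix-matrix products and one unpivoted QR factorization, which is the property the abstract identifies as communication-efficient.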
-
The Hierarchical Poincare-Steklov (HPS) solver for elliptic PDEs: A tutorial
Authors:
P. G. Martinsson
Abstract:
A numerical method for variable coefficient elliptic problems on two dimensional domains is described. The method is based on high-order spectral approximations and is designed for problems with smooth solutions. The resulting system of linear equations is solved using a direct solver with $O(N^{1.5})$ complexity for the pre-computation and $O(N \log N)$ complexity for the solve. The fact that the solver is direct is a principal feature of the scheme, and makes it particularly well suited to solving problems for which iterative solvers struggle; in particular for problems with highly oscillatory solutions. This note is intended as a tutorial description of the scheme, and draws heavily on previously published material.
Submitted 3 June, 2015;
originally announced June 2015.
-
Blocked rank-revealing QR factorizations: How randomized sampling can be used to avoid single-vector pivoting
Authors:
P. G. Martinsson
Abstract:
Given a matrix $A$ of size $m\times n$, the manuscript describes an algorithm for computing a QR factorization $AP=QR$ where $P$ is a permutation matrix, $Q$ is orthonormal, and $R$ is upper triangular. The algorithm is blocked, to allow it to be implemented efficiently. The need for single vector pivoting in classical algorithms for computing QR factorizations is avoided by the use of randomized sampling to find blocks of pivot vectors at once. The advantage of blocking becomes particularly pronounced when $A$ is very large, and possibly stored out-of-core, or on a distributed memory machine. The manuscript also describes a generalization of the QR factorization that allows $P$ to be a general orthonormal matrix. In this setting, one can at moderate cost compute a \textit{rank-revealing} factorization where the mass of $R$ is concentrated to the diagonal entries. Moreover, the diagonal entries of $R$ closely approximate the singular values of $A$. The algorithms described have asymptotic flop count $O(m\,n\,\min(m,n))$, just like classical deterministic methods. The scaling constant is slightly higher than those of classical techniques, but this is more than made up for by reduced communication and the ability to block the computation.
Submitted 29 May, 2015;
originally announced May 2015.
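The idea of using randomized sampling to pick a whole block of pivot columns at once can be illustrated as follows. This is a simplified sketch under stated assumptions, not the manuscript's implementation: the helper name `select_pivot_block` is hypothetical, no oversampling is used, and the pivoting on the small sample matrix is done with a plain greedy Gram-Schmidt loop.

```python
import numpy as np

def select_pivot_block(A, b, seed=0):
    """Choose b pivot columns of A by pivoting on a small random sketch (illustrative)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    G = rng.standard_normal((b, m))  # Gaussian sampling matrix
    Y = G @ A                        # b x n sketch; one matrix-matrix product
    pivots = []
    for _ in range(b):
        col_norms = np.linalg.norm(Y, axis=0)
        j = int(np.argmax(col_norms))  # largest remaining column of the sketch
        pivots.append(j)
        q = Y[:, j] / col_norms[j]
        Y = Y - np.outer(q, q @ Y)     # project the chosen direction out of every column
    return pivots

# Usage: three columns carry almost all of the mass of A.
rng = np.random.default_rng(2)
A = 1e-6 * rng.standard_normal((50, 20))
A[:, [2, 7, 11]] += rng.standard_normal((50, 3))
pivots = select_pivot_block(A, b=3)
```

The column-by-column pivoting here touches only the small b x n sketch, while the full matrix is accessed through a single matrix-matrix product.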
-
A high-order scheme for solving wave propagation problems via the direct construction of an approximate time-evolution operator
Authors:
T. S. Haut,
T. Babb,
P. G. Martinsson,
B. A. Wingate
Abstract:
The manuscript presents a new technique for computing the exponential of skew-Hermitian operators. Principal advantages of the proposed method include: stability even for large time-steps, the possibility to parallelize in time over many characteristic wavelengths, and large speed-ups over existing methods in situations where simulation over long times is required. Numerical examples involving the 2D rotating shallow water equations and the 2D wave equation in an inhomogeneous medium are presented, and the method is compared to the 4th order Runge-Kutta (RK4) method and to the use of Chebyshev polynomials. It is demonstrated that the new method achieves high accuracy over long time intervals, and with speeds that are orders of magnitude faster than both RK4 and the use of Chebyshev polynomials.
Submitted 20 February, 2014;
originally announced February 2014.
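The time-step restriction that motivates the comparison with RK4 can be seen on a scalar model problem. The sketch below is not the manuscript's method; it assumes the test equation u' = i*omega*u (a 1 x 1 skew-Hermitian operator) and compares the per-step amplification factor of classical RK4 with that of the exact exponential. The function names are chosen for this example only.

```python
import cmath

def rk4_amplification(z):
    """Stability polynomial of classical RK4 for u' = lambda*u, with z = lambda*h."""
    return 1 + z + z**2 / 2 + z**3 / 6 + z**4 / 24

def exact_amplification(z):
    """Amplification factor of the exact propagator exp(z)."""
    return cmath.exp(z)

omega = 1.0
small_step = abs(rk4_amplification(1j * omega * 1.0))    # omega*h = 1: slightly damped
large_step = abs(rk4_amplification(1j * omega * 3.0))    # omega*h = 3: amplified each step
exact_step = abs(exact_amplification(1j * omega * 3.0))  # modulus one for any step size
```

On the imaginary axis, RK4 is stable only while omega*h stays below roughly 2.83, whereas the exact exponential of a skew-Hermitian operator preserves the norm for every step size.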
-
A direct solver with O(N) complexity for variable coefficient elliptic PDEs discretized via a high-order composite spectral collocation method
Authors:
A. Gillman,
P. G. Martinsson
Abstract:
A numerical method for solving elliptic PDEs with variable coefficients on two-dimensional domains is presented. The method is based on high-order composite spectral approximations and is designed for problems with smooth solutions. The resulting system of linear equations is solved using a direct (as opposed to iterative) solver that has optimal O(N) complexity for all stages of the computation when applied to problems with non-oscillatory solutions such as the Laplace and the Stokes equations. Numerical examples demonstrate that the scheme is capable of computing solutions with relative accuracy of $10^{-10}$ or better, even for challenging problems such as highly oscillatory Helmholtz problems and convection-dominated convection diffusion equations. In terms of speed, it is demonstrated that a problem with a non-oscillatory solution that was discretized using $10^{8}$ nodes was solved in 115 minutes on a personal work-station with two quad-core 3.3GHz CPUs. Since the solver is direct, and the "solution operator" fits in RAM, any solves beyond the first are very fast. In the example with $10^{8}$ unknowns, solves require only 30 seconds.
Submitted 10 July, 2013;
originally announced July 2013.
-
A composite spectral scheme for variable coefficient Helmholtz problems
Authors:
P. G. Martinsson
Abstract:
A discretization scheme for variable coefficient Helmholtz problems on two-dimensional domains is presented. The scheme is based on high-order spectral approximations and is designed for problems with smooth solutions. The resulting system of linear equations is solved using a direct solver with O(N^1.5) complexity for the pre-computation and O(N log N) complexity for the solve. The fact that the solver is direct is a principal feature of the scheme, since iterative methods tend to struggle with the Helmholtz equation. Numerical examples demonstrate that the scheme is fast and highly accurate. For instance, using a discretization with 12 points per wave-length, a Helmholtz problem on a domain of size 100 x 100 wavelengths was solved to ten correct digits. The computation was executed on an office desktop; it involved 1.6M degrees of freedom and required 100 seconds for the pre-computation, and 0.3 seconds for the actual solve.
Submitted 19 June, 2012;
originally announced June 2012.
-
A high-order Nystrom discretization scheme for boundary integral equations defined on rotationally symmetric surfaces
Authors:
P. Young,
S. Hao,
P. G. Martinsson
Abstract:
A scheme for rapidly and accurately computing solutions to boundary integral equations (BIEs) on rotationally symmetric surfaces in R^3 is presented. The scheme uses the Fourier transform to reduce the original BIE defined on a surface to a sequence of BIEs defined on a generating curve for the surface. It can handle loads that are not necessarily rotationally symmetric. Nystrom discretization is used to discretize the BIEs on the generating curve. The quadrature is a high-order Gaussian rule that is modified near the diagonal to retain high-order accuracy for singular kernels. The reduction in dimensionality, along with the use of high-order accurate quadratures, leads to small linear systems that can be inverted directly via, e.g., Gaussian elimination. This makes the scheme particularly fast in environments involving multiple right hand sides. It is demonstrated that for BIEs associated with the Laplace and Helmholtz equations, the kernel in the reduced equations can be evaluated very rapidly by exploiting recursion relations for Legendre functions. Numerical examples illustrate the performance of the scheme; in particular, it is demonstrated that for a BIE associated with Laplace's equation on a surface discretized using 320,800 points, the set-up phase of the algorithm takes 1 minute on a standard laptop, and then solves can be executed in 0.5 seconds.
Submitted 30 December, 2011;
originally announced January 2012.
-
High-order accurate Nystrom discretization of integral equations with weakly singular kernels on smooth curves in the plane
Authors:
S. Hao,
A. H. Barnett,
P. G. Martinsson,
P. Young
Abstract:
Boundary integral equations and Nystrom discretization provide a powerful tool for the solution of Laplace and Helmholtz boundary value problems. However, often a weakly-singular kernel arises, in which case specialized quadratures that modify the matrix entries near the diagonal are needed to reach a high accuracy. We describe the construction of four different quadratures which handle logarithmically-singular kernels. Only smooth boundaries are considered, but some of the techniques extend straightforwardly to the case of corners. Three are modifications of the global periodic trapezoid rule, due to Kapur-Rokhlin, to Alpert, and to Kress. The fourth is a modification to a quadrature based on Gauss-Legendre panels due to Kolm-Rokhlin; this formulation allows adaptivity. We compare in numerical experiments the convergence of the four schemes in various settings, including low- and high-frequency planar Helmholtz problems, and 3D axisymmetric Laplace problems. We also find striking differences in performance in an iterative setting. We summarize the relative advantages of the schemes.
Submitted 21 November, 2012; v1 submitted 29 December, 2011;
originally announced December 2011.
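The need for corrected quadratures near the diagonal can be illustrated with the plain periodic trapezoid rule. The sketch below does not implement the Kapur-Rokhlin, Alpert, Kress, or Kolm-Rokhlin rules; it only assumes two model integrands on [0, 2*pi], one smooth and one with a logarithmic singularity at t = 0, and shows how the uncorrected rule behaves on each.

```python
import math

def periodic_trapezoid(f, n, skip_first=False):
    """n-point periodic trapezoid rule on [0, 2*pi]; optionally skip the node at t = 0."""
    h = 2 * math.pi / n
    start = 1 if skip_first else 0
    return h * sum(f(j * h) for j in range(start, n))

def smooth(t):
    return math.exp(math.cos(t))

def log_kernel(t):
    # Logarithmically singular at t = 0; its exact integral over [0, 2*pi] is zero.
    return math.log(4 * math.sin(t / 2) ** 2)

# Smooth periodic integrand: 16 and 32 nodes already agree to near machine precision.
smooth_gap = abs(periodic_trapezoid(smooth, 16) - periodic_trapezoid(smooth, 32))

# Log-singular integrand with the singular node skipped: the error decays only slowly.
log_err = abs(periodic_trapezoid(log_kernel, 64, skip_first=True))
```

For this particular kernel the punctured rule has the closed-form error 4*pi*log(n)/n, which is why high-order schemes replace the weights near the singular point instead of relying on refinement alone.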
-
A fast solver for Poisson problems on infinite regular lattices
Authors:
A. Gillman,
P. G. Martinsson
Abstract:
The Fast Multipole Method (FMM) provides a highly efficient computational tool for solving constant coefficient partial differential equations (e.g. the Poisson equation) on infinite domains. The solution to such an equation is given as the convolution between a fundamental solution and the given data function, and the FMM is used to rapidly evaluate the sum resulting upon discretization of the integral. This paper describes an analogous procedure for rapidly solving elliptic \textit{difference} equations on infinite lattices. In particular, a fast summation technique for a discrete equivalent of the continuum fundamental solution is constructed. The asymptotic complexity of the proposed method is $O(N_{\rm source})$, where $N_{\rm source}$ is the number of points subject to body loads. This is in contrast to FFT based methods which solve a lattice Poisson problem at a cost $O(N_{\Omega}\log N_{\Omega})$ independent of $N_{\rm source}$, where $\Omega$ is an artificial rectangular box containing the loaded points and $N_{\Omega}$ is the number of points in $\Omega$.
Submitted 30 December, 2011; v1 submitted 17 May, 2011;
originally announced May 2011.